modified: Thursday 6 July 2023
Repair adventure: A single white LED causing stuck keys on a RN988 keyboard
Quest for the key
A few months back I started getting angry with my dumpster-dive membrane keyboard. It was very worn out and amongst other problems the left shift key was jamming. I’ve always lived off freebie discarded keyboards but my stash had finally run out, so I was forced to consider (shudder) spending money on a new one.
To tide me over a friend lent me one of his mechanical keyboards, a tenkeyless (no numpad) brown-style unit by Ducky. It was really nice to use, but I sorely missed the numpad (it’s particularly important for 3D modelling in Blender) and when I looked at the price to get my own I almost imploded. More than 200 Australian dollars… that’s a decade of $20 keyboards lasting a year each.
It’s possible to get some no-name brands for half of that or less, but much of that market seems to cater for blue-style switches (read: loud and high-pitch plastic clicky noises) and shipping from overseas adds another $20 or so due to the awkward shape and size, especially for full-sized 104 key units.
Eventually I spotted some cheap full-sized mechanical keyboards going on eBay second hand:
It turns out almost nothing in that description was correct.
It certainly looked the part when it arrived:
The Respawn Ninja RN988 is a whitelabel by mwave (an Australian online computer parts retailer). I’m not entirely sure of the OEM; I have not been able to find any matching products with the extra 4th status LED for locking (disabling) the Win key.
After plugging it in I immediately noticed two problems:
1. The switches were definitely not brown-style (tactile bump halfway through being pressed), they were instead red-style (smooth when pressed). Not quite what I wanted but hey for this price I’ll happily try it out.
2. The Pause, Home, End and numpad * keys were permanently lit up and did not work when pressed.
The faulty keys were a serious problem. Every few hours they might fix themselves for a while, but then occasionally spam lots of keyboard input before returning to being broken.
At first I thought this was a macro feature caused by some macros saved into hardware by the last owner. I couldn’t find any documentation about such a feature on the web and the key combos to reset the keyboard firmware didn’t fix the problem. I contacted both the seller (“oh no, please return it”) and mwave (who were very kind to reply, even though I didn’t buy it off them) but neither knew anything about such features.
A few days of headbanging later I realised that this was not a macro feature. The keyboard controller thought that these keys were “stuck down”. This caused their individual LEDs to turn on and their normal operation to cease. The firmware was smart enough to ignore keys that were pressed down since power-on, so I didn’t see keyspam (except occasionally when the keys started working again). The rest of the keyboard worked fine because it is N-key rollover — it couldn’t give a damn that one arm and two kidneys were falling out.
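The firmware is closed, so the following is only a guess at the behaviour described above: a minimal Python sketch of a filter that masks any key already down at power-on until it is seen released. `StuckKeyFilter` and its method names are hypothetical, not taken from the keyboard.

```python
class StuckKeyFilter:
    """Hypothetical model: ignore keys that have been down since power-on."""

    def __init__(self, keys_down_at_power_on):
        # Keys already closed at the first scan are treated as stuck.
        self.masked = set(keys_down_at_power_on)

    def update(self, keys_down_now):
        """Return the keys that should be reported for this scan."""
        down = set(keys_down_now)
        # A masked key that reads as released is trusted again.
        self.masked &= down
        return down - self.masked


f = StuckKeyFilter({"Pause", "Home"})
print(sorted(f.update({"Pause", "Home", "A"})))  # Pause and Home are masked
print(sorted(f.update({"Home", "A"})))           # Pause reads as released
print(sorted(f.update({"Pause", "Home", "A"})))  # Pause is live again
```

In this model, a flaky key that briefly reads as released and then closed again loses its mask, which would match the occasional bursts of keyspam.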
Disassembly and repair
To find all of the screws I had to remove every keycap:
The key mechs were surprisingly Cherry brand:
These cost more than the Kailh or Outemu switches I expected. Maybe this keyboard was a special testing unit or some other small batch? It didn’t look like an owner had replaced them: the solder joints looked too consistent and there would have been damage caused by desoldering that many pins.
The PCB was single-sided and used lots of 0-ohm resistors as jumpers to cross other wires. You can see my permanent marker annotations circling the faulty keys:
Now let’s take a closer look at that PCB. Dear reader: have a good gander and tell me if anything seems unusual before you scroll down any further.
The real mystery is shown in the photo above. It’s not so much what is there as what is NOT there. Look at the white silkscreen markings next to some of those resistors, what do they say?
D means diode. This keyboard has no diodes anywhere. It’s using resistors instead. That… shouldn’t allow N-key rollover, right?
This keyboard definitely has N-key rollover: I can mash a whole hand on a test site and see everything light up correctly. This shouldn’t be possible without either wiring a unique trace to each switch (cost prohibitive, your microcontroller would need hundreds of pins) or putting a diode on every switch (which this keyboard doesn’t have). The closest this keyboard comes is by using transistors to drive the columns and rows (these can have intrinsic diode-like effects depending on how they are wired) but even then that’s not enough for full N-key rollover.
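To see why a plain switch matrix needs a diode per key, here is a small Python sketch of an idealised row-by-row scan. It is a textbook model, not this keyboard's actual circuit: the function name `scan` and the 2x2 example are made up for illustration, and resistances and thresholds are ignored.

```python
def scan(pressed, n_rows, diodes):
    """Return the (row, col) keys a row-by-row scan would report.

    pressed: set of (row, col) switches that are physically closed.
    diodes:  True if each switch has a series diode (row -> col only).
    """
    reported = set()
    for driven in range(n_rows):
        rows, cols = {driven}, set()
        changed = True
        while changed:
            changed = False
            for r, c in pressed:
                # Current leaves an energised row through a closed switch.
                if r in rows and c not in cols:
                    cols.add(c)
                    changed = True
                # Without diodes it can also flow backwards into another row.
                if not diodes and c in cols and r not in rows:
                    rows.add(r)
                    changed = True
        reported |= {(driven, c) for c in cols}
    return reported


pressed = {(0, 0), (0, 1), (1, 0)}  # three corners of a square
print(sorted(scan(pressed, 2, diodes=True)))
print(sorted(scan(pressed, 2, diodes=False) - pressed))  # the ghost key
```

With three corners of a square held down, the diode-less scan also reports the fourth corner, because current sneaks backwards through the other closed switches. A diode in series with each switch blocks that return path.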
Except… this keyboard does have a diode on each key. A light emitting diode, through-hole mounted on the keyswitch side of the board. Oh no, they’re not… are they?
I think this keyboard uses the white LEDs behind each key as the diode for N-key rollover keypress sensing. I cannot definitively prove that (the way it’s done is too complex for me to properly fathom and measure, as you’ll see later) but it seems like the most sensible explanation.
Key-read and LED-control signals interleaved: debugging frustration maximiser
Probing signals on this PCB was a small nightmare. The circuit design saved money by combining the operation of the LEDs and the reading of keyswitches onto the same wires, so you would often see voltage waveforms like this:
The wide pulses are controlling the LEDs. The narrow pulses are sensing the keys. These pulses would move up and down depending on the state of the LEDs and keys. Other wires would get pulled to similar patterns of high and low voltages so that the LEDs would be reverse-biased (off) during the key sensing part of the cycle. It’s not quite as neat as I drew it above: pressing one key would make the whole waveform move and shift slightly, due to various imperfect resistances.
One of my important early discoveries was that all of the faulty keys were on the same “column” wire, even though they were not arranged in the same physical column from the user’s perspective. This gave a common link between all of the dead keys, allowing me to focus just on a few wires. When the problem triggered the whole waveform would get dragged upwards in voltage slightly (causing all of the keys to read as pressed).
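The "find the common wire" step can be sketched in a few lines of Python. The `matrix_map` below is placeholder data, NOT traced from the real RN988 PCB, and `shared_wires` is a made-up helper name. The only fact taken from the story is that the faulty keys shared a column wire.

```python
def shared_wires(matrix_map, faulty_keys):
    """Return the row and column wires common to every faulty key."""
    rows = {matrix_map[k][0] for k in faulty_keys}
    cols = {matrix_map[k][1] for k in faulty_keys}
    return {
        "row": rows.pop() if len(rows) == 1 else None,
        "col": cols.pop() if len(cols) == 1 else None,
    }


# Placeholder (row, col) wiring for illustration only.
matrix_map = {
    "Pause": (0, 7),
    "Home": (1, 7),
    "End": (2, 7),
    "Num*": (3, 7),
    "A": (2, 1),
}
print(shared_wires(matrix_map, ["Pause", "Home", "End", "Num*"]))
```

A result with a single shared column and no shared row points at one wire (and whatever hangs off it) rather than at the individual switches.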
Sadly my usual debugging methods (measure, change something, measure again) did not work because of this shared input and output work done on the same wires. Making any small change to aid measurement would break multiple things and cause multiple behaviour changes. Switching some of the column and row driving transistors did not solve anything. All of the traces tested good. In theory everything should be working correctly.
The fault senses my frustration and starts to toy with me
At this point testing became much harder because the problem became more intermittent. The keyboard would run for hours without showing the problem and then suddenly it would return. Several times I thought I found the fix only to be proven wrong the next day.
I did not want to give up and return the keyboard to the seller. I was ill that week and I really wanted to win. After all, how could I let a few bits of copper and semiconductor junction defeat me? I had to assert my dominance, otherwise my other devices might also start misbehaving. And I have a lot of printers.
I tried lots of things to “control” the intermittency of the problem, including:
- Inspecting and touching up suspect solder joints (they were all fine)
- Pointing a heat-gun at the keyboard
- Leaving the keyboard unplugged for a while
- Putting the keyboard in the fridge
- Shining bright lights onto various chips and parts
- Changing the voltage fed to the whole keyboard via a hydra of modified cables (I thought it might be suffering from marginal voltage thresholds)
- Dimming the LEDs. This… actually seemed to work.
What the? Changing how bright the LEDs were on the keyboard… seemed to affect how often the problem occurred? Oh dear.
It wasn’t an exact and predictable science, but dimming the LEDs (Fn+Down) and waiting a few minutes would often cause the problem to disappear. Setting them to max brightness (Fn+Up) would usually cause it to re-appear. Changing the LED modes of the keyboard (all lights on vs all lights off) also sometimes helped.
At this point I realised I was probably dealing with a bad LED. But how and why?
A genius design let down by an under-specified part
LEDs are not meant to be used as diodes.
Two important specs of a diode are:
- Reverse leakage current (how much current passes through it backwards, as long as you stay under the reverse breakdown voltage)
- Reverse breakdown voltage (at what reverse voltage the diode gives up and lets tonnes of current through)
LEDs typically don’t have either of these numbers specified by the manufacturer. LEDs are almost always only ever used in the forward direction and every circuit I’ve seen that potentially subjects them to reverse bias puts a standard diode in series with them.
I eventually found the miscreant by pulling the white LEDs from the affected keys one-by-one. The white LED behind the Pause key seemed to be the agitator. It took a few days of running the keyboard to confirm but indeed pulling this one LED permanently fixed the keyboard.
Interestingly I didn’t have to replace the LED — the Pause key and its column-comrades continued to work fine. Perhaps its N-key rollover performance has been hurt? Or perhaps this particular circuit allows for a certain number of LEDs to be removed without losing performance?
I decided to investigate the guilty LED:
I wired a resistor in series (for safety) and powered it with backwards 5V, then measured the leakage current. It was OK, somewhere in the micro-amps region, which should not have interacted badly with the design of this keyboard (its resistor values were small enough).
I then tried with a resistor in series and backwards 10V. For hours it was fine (the LED blocked the full 10V) but then suddenly its behaviour changed. The LED only dropped 2V or so and much more current was flowing. Aha, jackpot!
The diode was entering a different state. Interestingly this state was sticky: reducing the voltage to 5V and even unplugging it for a few minutes would not always reset it back to normal. It was effectively acting as a single bit of data storage in a two-pin part. Maybe this is similar to the effect that lets you build weird single-transistor oscillators using only 2 pins of a backward NPN transistor.
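For a feel of the numbers in the bench test, the current through the series resistor follows from Ohm's law. The resistor value is not given above, so the 10 kΩ used here is purely an assumption for illustration, as is the helper name `series_current`.

```python
def series_current(supply_v, led_drop_v, resistor_ohms):
    """Current through a series resistor and reverse-biased LED (Ohm's law)."""
    return (supply_v - led_drop_v) / resistor_ohms


R = 10_000  # ASSUMED series resistor in ohms; the real value is not stated

# Sticky state: the LED drops only about 2 V of the 10 V supply,
# so the remaining 8 V sits across the resistor.
flipped = series_current(10.0, 2.0, R)
print(f"{flipped * 1e3:.2f} mA")
```

With that assumed resistor the flipped state passes 0.80 mA, far more than the micro-amp leakage seen at 5V.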
Normally you would not encounter 10V inside a 5V USB device like a keyboard, but I noticed that the voltage waveforms being used to read the keys and control the LEDs were fast enough to cause overshoot during transitions (thin spiky bits in the graphs). This is caused by a combination of factors including trace length (particularly bad in this single-sided PCB design), inductance and (when dealing with multiple driving wires) capacitance. In a few tests I managed to get the LED to flip at only around –7V and I suspect only a short burst of such a voltage would be needed.
Of course I might be completely wrong. Perhaps removing this LED caused a load on some other part to lessen, avoiding a problem elsewhere. Perhaps several or all of the LEDs on this board are the same. I was not up for pulling more parts out of this (now working) keyboard so I instead tested with some unrelated white LEDs I had in a drawer, but I could not recreate the problem with those. As a result I can’t be 100% certain that my explanation is the correct one; I’ve definitely caught myself out before with bad control tests. If you have other theories then I’d love to hear them.
This is a really cool way of (potentially) saving money on diodes in your keyboard design whilst retaining N-key rollover, but I’m not totally convinced of its overall merit. The PCB still has footprints for diodes and they replaced them with resistors instead, which surely has a similar manufacturing cost. Perhaps it was a test to see if such a design might be feasible for future work? As it is, if they wanted to eliminate the diode/resistor footprints they would probably have to switch to a two-layer PCB, negating any possible cost savings.
I have not been able to find any info on the web about this key sensing design. Perhaps I just need the right terminology? Is it a unique solution or do lots of keyboard manufacturers use it?
Sadly this design is a PITA to debug. Mixing input and outputs on the same wires is a feedback nightmare, especially when done over a few different related wires connecting to the same LEDs, resistors and switches. Changing or probing even one thing can change several others. A system with a “broken” feedback path (separate inputs and outputs) is so much easier to tinker with and fix.
I wonder how many keyboards were discarded during QC testing or via RMAs because of this problem? Perhaps it only occurs after a few years of aging the LEDs? It will be interesting to see if people start reporting stuck-key issues across a variety of cheap N-key rollover keyboards over the next few years.
EDIT: Good discussions & comments on Hackaday