• The context: Hardening a local Wayland STT tool.
  • The symptom: Synthetic pasting failed 80% of the time.
  • The root cause: A Linux uinput device discovery race condition.
  • The fix: A persistent, in-process virtual keyboard with lazy initialization.
  • The philosophy: Debugging with AI when your cognitive debt outweighs your mental model.

It all started with a good intention: trying to harden a codebase by inspecting it for architectural weaknesses and anti-patterns.

I built parakeet-stt, a fully local push-to-talk transcription tool for Linux. You hold Right Ctrl, speak, release, and the transcribed text is injected via uinput wherever your cursor is. It has a Rust frontend handling the hotkey and overlay, and a Python daemon running NVIDIA’s NeMo Parakeet model.

Using a custom prompt, I had an AI agent review the codebase. One of the suggestions was to delegate the text injection to a child process. The idea was to spawn a process, let it emulate a virtual keyboard to paste the text, and then die, ensuring the injection was completely decoupled from the rest of the code.

I implemented it. Then I realized pasting had become incredibly unreliable, silently failing about 80% of the time.

I spent the next three days burning through tokens and credits to track down a bug I introduced to solve a problem I didn’t actually have. Here is what I learned about Wayland compositors, Linux uinput, and debugging when the best models out there cannot find the root cause.


The failure mode was specific: plain raw dictation successfully produced transcripts and updated the system clipboard, but the automatic paste into target applications (Ghostty, Brave, Zed) failed silently. Curiously, manual pasting from the clipboard immediately afterward worked perfectly.

When you don’t understand an issue, it’s tempting to guess. I had GPT 5.4 and Amp’s Oracle produce an extensive list of possible root causes, ranked by probability, with arguments for and against each entry grounded in source code evidence. We initially assumed the paste was firing too early after the hotkey release, so I added a 250ms settle delay. It didn’t fix the problem.

Then I suspected focus drift: maybe the child process was routing the synthetic chord to the wrong surface. I added observability to capture child-side focus snapshots, which proved the correct window was actually focused.

The Smoking Gun

After crossing those two off the list, I decided to build an extensive test suite and a paste-gap-matrix script to evaluate the pipeline end-to-end, testing every backend and configuration mode.

The data revealed a massive discrepancy:

  • Inject-only uinput (no push-to-talk lifecycle): >95% success rate.
  • PTT-flow uinput (spawning a fresh process per injection): 20% success rate.

The problem wasn’t the uinput backend itself. It was the lifecycle of the virtual keyboard.

When the PTT flow spawned a new subprocess, it created a brand-new /dev/uinput virtual keyboard, immediately emitted the Ctrl+Shift+V chord, and destroyed the device when the process exited.

The Linux uinput documentation explicitly warns about this pattern: after UI_DEV_CREATE, userspace needs time to detect the new device. Wayland compositors like COSMIC and Smithay process device hotplugs by classifying the device and associating it with a seat. If a device is not yet known to any seat, its keyboard events are silently dropped.

By creating and destroying the device almost instantly, I was racing the compositor’s discovery phase. Partial delivery of the chord during device warm-up, with some key events landing while others were dropped, also explained why the first and last characters were sometimes truncated.
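The race is easiest to see in a toy model. The sketch below is not real evdev code; it models a compositor that silently drops a device’s events until a discovery delay has elapsed, which is exactly what the spawn-emit-exit pattern was running into:

```python
import time

class FakeCompositor:
    """Toy model: events from a device are silently dropped until the
    compositor has had DISCOVERY_DELAY seconds to classify and seat it."""
    DISCOVERY_DELAY = 0.05

    def __init__(self):
        self.delivered = []
        self._created_at = {}

    def hotplug(self, device):
        # Analogous to the compositor noticing a UI_DEV_CREATE hotplug.
        self._created_at[device] = time.monotonic()

    def receive(self, device, key):
        age = time.monotonic() - self._created_at[device]
        if age >= self.DISCOVERY_DELAY:
            self.delivered.append(key)  # device is seated: event delivered
        # else: silently dropped, like an unseated uinput device

def racy_paste(comp):
    comp.hotplug("kbd")                 # fresh virtual keyboard
    for k in ("CTRL", "SHIFT", "V"):
        comp.receive("kbd", k)          # emitted immediately: all dropped
    # device destroyed when the subprocess exits

def warmed_paste(comp):
    comp.hotplug("kbd")
    time.sleep(FakeCompositor.DISCOVERY_DELAY)  # bounded warm-up
    for k in ("CTRL", "SHIFT", "V"):
        comp.receive("kbd", k)          # device is seated: chord delivered

racy = FakeCompositor(); racy_paste(racy)
warm = FakeCompositor(); warmed_paste(warm)
print(racy.delivered, warm.delivered)
```

In the real system the discovery delay is not a fixed constant, which is why a fresh-per-injection device sometimes squeaked through and produced the flaky 20% success rate rather than failing every time.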

The Fix

The solution was ripping out the subprocess entirely and moving to a persistent in-process uinput sender.

Instead of creating a device per job, the system now uses lazy initialization:

  • It creates the /dev/uinput virtual keyboard on the first paste attempt.
  • It applies a bounded warm-up delay only after fresh creation, to allow the Wayland compositor to discover and seat the device.
  • It keeps the device alive for subsequent jobs while it remains healthy.

This preserves reliability by keeping the device warm, while still allowing for late /dev/uinput recovery via bounded backoff if device creation fails.
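The shape of that fix can be sketched as follows. The class and method names here are hypothetical, not parakeet-stt’s actual API; the device factory is injectable so the pattern is visible without touching /dev/uinput:

```python
import time

class PersistentSender:
    """Sketch of the lazy-init pattern: one virtual keyboard, created on
    first use, warmed up once, then reused across paste jobs."""
    WARMUP_S = 0.2  # bounded delay so the compositor can discover and seat it

    def __init__(self, open_device):
        # open_device is an injectable factory, e.g. lambda: evdev.UInput(...)
        self._open = open_device
        self._dev = None

    def send_chord(self, chord):
        if self._dev is None:
            self._dev = self._open()   # first job: create the uinput keyboard
            time.sleep(self.WARMUP_S)  # warm-up only after fresh creation
        try:
            self._dev.emit(chord)      # device stays alive for the next job
        except OSError:
            self._dev = None           # unhealthy: next job re-creates it
            raise
```

Because the warm-up cost is paid once rather than per injection, subsequent pastes are both faster and reliable, and a failed creation naturally retries on the next job.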

Minimal Mental Models

The timeless lesson here is simple: don’t fix a problem you don’t have yet. I wanted to make the architecture more robust, but I completely missed the second-order consequences of spawning a new subprocess and a new virtual keyboard for every single injection.

But the more interesting takeaway is how this was fixed. My mental model of the system was unaligned with reality, a classic case of cognitive debt. I didn’t fully understand the underlying Wayland input stack, but I pushed through anyway.

I treated the system as a black box and debugged it blindfolded:

  1. List all potential root causes and have a SOTA model assign probability scores.
  2. Focus purely on adding observability and tests, no behavioral changes yet.
  3. Keep a single canonical Markdown file documenting all facts, test results, and evidence without opinion, appending to it as you dig deeper.
  4. Feed the raw data to the highest-capacity reasoning model available at the start of every session. Implement the single highest-ROI next step, rinse and repeat.

Was it an inefficient waste of tokens? 100%. If I had coded it entirely by hand and maintained a perfect mental model, this bug probably wouldn’t have existed in the first place.

But we are watching a paradigm shift. Hand-crafted code and maintaining perfect systemic comprehension are becoming like horse riding: a charming means of transportation that is no longer strictly necessary. The real skill is learning how to leverage agentic engineering to systematically debug and stabilize systems without incurring the cognitive tax traditionally associated with that.

View the code on GitHub