Talking to Code: The Most Underrated Shift in How We Build Software
The best engineers at Kombo aren’t typing anymore. They’re talking. And I think this changes everything.
Last week I watched our CTO Aike sit down at his desk, open Cursor and spend the next 40 minutes speaking to an AI agent. Not typing instructions. Talking. Describing what he wanted, getting back questions, refining his thinking out loud. By the end, a feature was built. See the picture below for his setup.
I’ve been writing about vibe coding and agentic engineering for a while now. But this moment hit different. Because what I was watching wasn’t just AI-assisted development. It was a fundamentally different interface between human intent and working software. And I don’t think enough people are taking it seriously.
The physical bottleneck nobody talks about
For decades, the real bottleneck in software development wasn’t intelligence. It wasn’t creativity or problem-solving ability. It was physical.
Your brain generates ideas. Your fingers type them. That gap, between thought and code, has shaped everything about how software gets built. The rituals around “deep work” and “flow state.” The frustration when you’re interrupted mid-thought.
We never questioned this constraint. It was just the cost of building things.
Talking to your code changes the equation entirely. Stanford and Baidu researchers measured this in 2016: speech is 3x faster than typing, with 20% fewer errors. And that’s just the raw numbers. The cognitive reality is more interesting. Speaking and thinking happen largely in parallel. Typing is serial: your hands can only keep up with your brain if you deliberately slow your thinking down.
The moment you switch to voice, you can stream ideas at the speed you think. Provide full context in seconds instead of minutes of typing. The feedback loop gets so tight it stops feeling like programming and starts feeling like thinking out loud.
I’ve seen this with Aike & Niklas at Kombo and I’ve felt it myself. There’s a point where the interface stops being visible. You stop thinking “I am telling the AI what to do” and start just... thinking. The gap between idea and implementation collapses to almost nothing. We’re not at Neuralink yet. But this is the logical step before that.
The tools are already here
Andrej Karpathy’s original “vibe coding” tweet from February 2025 got 4.5 million views. Most people focused on the “forget that the code even exists” part. But buried in there was something equally significant: “I just talk to Composer with SuperWhisper so I barely even touch the keyboard.”
That wasn’t incidental. That was the workflow.
SuperWhisper (or any other speech-to-text tool) runs in the background on your Mac, using Whisper to transcribe locally. Hold a hotkey, speak, release, and whatever you said appears in Cursor, Claude Code, or whichever terminal is focused. There’s also Wispr Flow, which has native IDE extensions for Cursor and Windsurf specifically. Developers using it report dictating at 175+ words per minute. For context, a fast typist does 65.
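Plugging in those numbers makes the gap concrete. A quick back-of-the-envelope in Python — the 300-word context size is illustrative; the 175 and 65 wpm figures are the ones above:

```python
# Back-of-the-envelope: time to deliver a 300-word context to an agent.
# 175 wpm (dictation) and 65 wpm (fast typist) are from the text above;
# the 300-word context size is an illustrative assumption.
DICTATION_WPM = 175
TYPING_WPM = 65
WORDS = 300

dictation_min = WORDS / DICTATION_WPM  # minutes spent speaking
typing_min = WORDS / TYPING_WPM        # minutes spent typing

print(f"dictating: {dictation_min:.1f} min")   # ~1.7 min
print(f"typing:    {typing_min:.1f} min")      # ~4.6 min
print(f"speedup:   {DICTATION_WPM / TYPING_WPM:.1f}x")  # ~2.7x
```

Roughly three minutes saved per context dump, dozens of times a day, is where the compounding starts.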
The pattern that’s emerging looks like this: voice tool in the background, AI coding agent in the foreground, the developer as the person directing traffic. You speak intent, the agent translates to implementation, you speak corrections, the agent adjusts. Back and forth, no keyboard required.
Most engineers I talk to haven’t tried this yet. They know about Cursor. They’ve maybe used Claude Code. But they’re still typing everything, treating AI coding agents like a smarter autocomplete rather than a conversational partner you can actually speak to.
One engineer, five things happening at once
Voice is one part of this. The other part is parallelism. A year ago, the mental model of AI-assisted development was: you and one AI agent, taking turns. You write a prompt, it responds, you review, you follow up. Essentially a chat interface bolted onto your codebase. That model is already obsolete.
Boris Cherny, who created Claude Code, runs 10 to 20 agents in parallel at any given time. Earlier this year, Anthropic’s engineering team built a production-quality C compiler using 16 Claude agents running simultaneously, each with a specialized role, coordinating with each other, producing 100,000 lines of Rust that can compile Linux across three chip architectures. Two weeks of work. Sixteen agents.
Cursor 2.0 was redesigned around this idea. You can delegate tasks to eight different agents at once. Claude Code has an experimental agent teams feature. Open-source tools like Claude Squad let you spin up multiple instances across git worktrees.
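Tools like Claude Squad automate this, but the underlying mechanic is plain `git worktree`: one isolated checkout per agent, so none of them stomp on each other. A minimal sketch in Python, with illustrative branch names, assuming you run it from inside an existing repository:

```python
# Sketch: spin up isolated git worktrees so parallel agents can each
# work on their own branch without touching the main checkout.
# Branch/task names here are illustrative, not from any specific tool.
import subprocess
from pathlib import Path

def spawn_worktree(repo: Path, name: str) -> Path:
    """Create a sibling worktree on a fresh branch for one agent."""
    path = repo.parent / f"{repo.name}-{name}"
    subprocess.run(
        ["git", "-C", str(repo), "worktree", "add", "-b", name, str(path)],
        check=True,
    )
    return path

# e.g. one worktree per agent task:
# for task in ["auth-refactor", "api-docs", "flaky-tests"]:
#     spawn_worktree(Path.cwd(), task)
```

Each agent then gets pointed at its own directory; merging their branches back is where the human review happens.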
What this means practically: when you combine voice input with parallel agents, you become something closer to a director than a developer. You describe what you want, in conversation, while several agents are already working on different parts of the problem. You’re not blocked waiting for one agent to finish before starting the next thing. You’re orchestrating. Checking in. Redirecting. Thinking about the next problem.
The feedback loop isn’t just tight, it’s concurrent.
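That director pattern can be sketched as a simple fan-out loop: submit tasks to agents, keep working, review results as they land. Here `run_agent` is a hypothetical stand-in for whatever CLI or API actually drives an agent:

```python
# Sketch of the director pattern: fan tasks out to parallel agents,
# collect results as each one finishes rather than waiting in sequence.
# `run_agent` is a placeholder, not a real agent API.
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_agent(task: str) -> str:
    # In practice this would shell out to Claude Code, a Cursor agent,
    # etc., and block until that agent reports back.
    return f"done: {task}"

tasks = ["fix flaky test", "draft API docs", "refactor auth module"]
results = []

with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    futures = {pool.submit(run_agent, t): t for t in tasks}
    for fut in as_completed(futures):
        # The director reviews each result as it lands and can queue
        # follow-up work without waiting on the other agents.
        results.append(fut.result())
```

The point of `as_completed` is exactly the concurrency described above: you never block on the slowest agent before redirecting the others.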
This is not vibe coding. It’s something more serious.
I want to be careful here, because there’s a version of this story that’s naive. The “anyone can build anything now” take. Just talk to the AI and ship. That’s not what I’m describing.
Karpathy himself made this distinction in early 2026, proposing “agentic engineering” as the professional evolution. The framing: you are not writing code directly 99% of the time, you are orchestrating agents who do and acting as oversight. That’s meaningfully different from just vibing your way to a codebase that technically runs.
Addy Osmani from Google’s Chrome team put it well: the spec is not a prompt anymore. The spec is the product thinking made explicit. When you’re directing 10 agents in parallel, vague thinking doesn’t slow you down, it multiplies. The quality of your output becomes almost entirely a function of the quality of your specification.
What’s changing is not whether engineering skill matters. It’s which engineering skills matter most.
Raw coding speed becomes irrelevant. Syntax memorization becomes irrelevant. What becomes everything is the ability to hold the architecture in your head, think clearly about system boundaries, anticipate failure modes, and evaluate whether what the agent built is actually what you meant. The METR research lab ran a controlled study earlier this year and found experienced developers were actually 19% slower when using text-based AI tools, because the constant stop-prompt-wait-review cycle broke their flow state. Voice interaction is a direct answer to that problem. You stay in conversation. The flow state survives.
Gergely Orosz, writing in The Pragmatic Engineer at the end of last year, described his experience: any time he now has to type precise syntax by hand, it feels like a tedious chore. “My biggest problem now is coming up with enough worthwhile ideas to fully leverage the productivity boost.”
That sentence is the tell. The constraint has moved. The bottleneck is no longer your hands. It’s your thinking.
Where this goes
For 50 years, programming has forced humans to enter the machine’s environment. Learn its syntax. Match its precision. Adapt to its constraints. Every abstraction layer, from punch cards to the command line to the GUI to the IDE, reduced that burden somewhat. But we’ve always been meeting the machine more than halfway. Voice + AI agents is the first interface where the machine genuinely comes to us.
J.C.R. Licklider predicted this in 1960, in his paper “Man-Computer Symbiosis.” He spent time studying his own work habits and concluded that most of his time went to “clerical” tasks, getting into position to think, rather than to actual thinking. He explicitly called for automatic speech recognition as a prerequisite for effective human-computer partnership. Sixty-five years later, we built the thing he was describing.
The gap between “idea in your head” and “implementation” is collapsing fast.
Every abstraction layer that made programming more accessible has created more programmers, not fewer. FORTRAN reduced programming effort by 10x in the 1950s, and the U.S. tech workforce grew from 200,000 to 1.6 million in the decades that followed. Now, for the first time, we are seeing signs that AI could actually reduce the number of engineers we need.
What I’m excited about is that it removes the translation tax. The physical friction that has always stood between thinking and building. The part where a good idea dies somewhere between your brain and the keyboard.
Watch what the best engineers do when that friction disappears. Watch what they build when the bottleneck is no longer their hands, but their thinking. That’s the shift worth paying attention to.
Sources & Further Reading
Andrej Karpathy, “vibe coding” tweet, February 6, 2025 — x.com
Andrej Karpathy, one-year retrospective on vibe coding, February 2026 — x.com
Ruan et al., “Speech Is 3x Faster than Typing for English and Mandarin Text Entry on Mobile Devices,” Stanford HCI Group / Baidu, 2016 — hci.stanford.edu
METR, “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity,” July 2025 — metr.org
Addy Osmani, “The Factory Model: How Coding Agents Changed Software Engineering” — addyosmani.com
Addy Osmani, “The future of agentic coding: conductors to orchestrators” — addyosmani.com
Anthropic Engineering, “Building a C compiler with a team of parallel Claudes,” February 2026 — anthropic.com
Gergely Orosz, “When AI writes almost all code, what happens to software engineering?” The Pragmatic Engineer — newsletter.pragmaticengineer.com
J.C.R. Licklider, “Man-Computer Symbiosis,” 1960
Mark Weiser, “The Computer for the 21st Century,” Scientific American, 1991 — calmtech.com
JetBrains State of Developer Ecosystem 2025 — blog.jetbrains.com
DX AI-Assisted Engineering Q4 Impact Report 2025 — getdx.com
Wispr Flow, “Vibe coding supercharged with voice” — wisprflow.ai
Mathias Klenk, “Thriving as an Engineer in the Era of Vibe Coding,” TechFounderStack — techfounderstack.com
Mathias Klenk, “Agentic Engineering: How AI Agents Are Reshaping Software Development,” TechFounderStack — techfounderstack.com