Why we built voice planning into a task manager
Sometimes you can't type. Sometimes talking is the only way the thoughts come out. That's why Guardian has a voice.
There's a specific kind of executive function failure where you know what you need to do, you can even articulate it out loud, but the act of typing it into a text field feels impossible. The gap between thought and keyboard is just wide enough that the task never makes it into the system.
We heard this from users early on. Not as a feature request — nobody said 'I want voice input.' They said things like: 'I know what I need to do, but I can't make myself type it.' Or: 'I talk through my plans in the car and then forget everything by the time I sit down.' Or, more bluntly: 'Some days my fingers just don't cooperate with my brain.'
This is a real cognitive phenomenon, not a preference. Typing requires a specific chain of executive operations: formulating the thought, translating it to text, deciding on phrasing, operating the keyboard, reading back what you typed. Speech bypasses several of these steps. The path from thought to spoken word is more direct and lower friction, and it relies on different neural circuitry than the path from thought to typed text.
So we built voice planning into Guardian, Steady's AI assistant. But we didn't just add dictation — converting speech to text and dumping it into a task field. That solves the wrong problem. The problem isn't that people can't type fast enough. It's that they can't formulate structured plans when executive function is low. Dictation gives you a transcript. Voice planning gives you a conversation.
Here's how it works. You tap the voice button and start talking. Not in task-formatted sentences — just talking. 'I need to deal with the insurance thing, and there's that email from Marcus I've been avoiding, and I should probably eat something, and the report is due Thursday but I don't even know where to start.' Guardian listens, asks clarifying questions, and then proposes a structured plan: three tasks, ordered by what seems most approachable, with the insurance thing broken into a specific first step.
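To make that concrete, here's a minimal sketch in TypeScript of how a ramble-to-plan step could be wired up. Everything in it is an assumption for illustration: the ProposedTask and ProposedPlan shapes, the proposePlan function, and the prompt are not Guardian's actual schema or code, and the llm parameter stands in for whatever model call the app really makes.

```typescript
// Shape of the structured plan proposed after a voice session.
// All names here are illustrative assumptions, not Guardian's real schema.
interface ProposedTask {
  title: string;
  firstStep?: string;      // a concrete first action, for tasks that need breaking down
  approachability: number; // 1 (daunting) to 5 (easy); used to order the plan
}

interface ProposedPlan {
  tasks: ProposedTask[];          // ordered most-approachable-first
  clarifyingQuestions: string[];  // asked before the plan is finalized
}

// The planner takes a raw transcript, not task-formatted sentences,
// and asks a language model to extract action items and propose an order.
async function proposePlan(
  transcript: string,
  llm: (prompt: string) => Promise<string>
): Promise<ProposedPlan> {
  const prompt = [
    "Extract the concrete action items from this spoken ramble.",
    "Order them from most to least approachable.",
    "For anything vague, propose one specific first step.",
    'Reply as JSON: { "tasks": [{ "title", "firstStep", "approachability" }], "clarifyingQuestions": [] }',
    "",
    transcript,
  ].join("\n");

  const raw = await llm(prompt);
  const plan = JSON.parse(raw) as ProposedPlan;

  // Defensive sort: most approachable first, so the plan starts easy.
  plan.tasks.sort((a, b) => b.approachability - a.approachability);
  return plan;
}
```

The point of the sketch is the shape of the output, not the model call: the plan carries an approachability ordering and a concrete first step, which is exactly what a raw transcript doesn't give you.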
The difference between dictation and conversation-based planning is the difference between a transcriptionist and a thinking partner. Dictation captures what you say. Conversation helps you figure out what you mean. For someone whose thoughts are tangled — and when executive function is low, thoughts are almost always tangled — having an AI that can listen to a five-minute ramble and extract the actual action items is genuinely useful.
We debated this feature internally. Voice in productivity apps often feels gimmicky — a checkbox feature that nobody uses after the first week. What convinced us was watching user sessions where people would stare at the task input field for minutes, type nothing, and close the app. The input mechanism itself was a barrier. Voice didn't just offer convenience. It offered access.
Guardian's voice mode is available at any regulation level, but it adapts its responses based on your current state. At high regulation, it's more like a chief of staff: crisp, efficient, action-oriented. At low regulation, it slows down. Asks simpler questions. Proposes less. Doesn't push. The voice stays the same; the approach shifts. Because the person talking at a 3 needs something fundamentally different from the person talking at an 8.
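As a rough illustration of that shift, assuming the 3 and the 8 refer to a 1-to-10 regulation scale: the adaptation can be pictured as a handful of response knobs keyed off the current level. The VoiceStyle type, the thresholds, and the specific values below are invented for the sketch, not Guardian's actual tuning.

```typescript
// Illustrative knobs for how voice responses might adapt to regulation level.
// Thresholds and values are assumptions for this sketch.
interface VoiceStyle {
  maxTasksProposed: number;   // how much to put in front of the user at once
  questionComplexity: "simple" | "detailed";
  tone: "gentle" | "crisp";
  pushForCommitment: boolean; // whether to nudge toward picking a task now
}

function styleForRegulation(level: number): VoiceStyle {
  // Low regulation (say, 1-4): slow down, ask less, propose less, don't push.
  if (level <= 4) {
    return {
      maxTasksProposed: 1,
      questionComplexity: "simple",
      tone: "gentle",
      pushForCommitment: false,
    };
  }
  // Middle of the scale: balanced.
  if (level <= 7) {
    return {
      maxTasksProposed: 3,
      questionComplexity: "simple",
      tone: "gentle",
      pushForCommitment: true,
    };
  }
  // High regulation (8+): chief-of-staff mode, crisp and action-oriented.
  return {
    maxTasksProposed: 5,
    questionComplexity: "detailed",
    tone: "crisp",
    pushForCommitment: true,
  };
}
```

Note what doesn't change in the sketch: there's no separate "low-regulation voice." The same planner runs at every level; only the parameters shift.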