How Spec‑Driven Development Powers AI Coding in 2026

How specs make AI coding reliable—and redefine the manager's role

Egor Burlakov
6 min read


From manager back to builder

I spent years as a tech manager, living in documents and meetings while other people wrote the code.

AI coding agents changed that. Suddenly I could open an editor, describe what I wanted, and get working code in minutes. It felt like cheating, but it also felt like coming home. At the same time, it was obvious that this was not “just” a productivity hack; it was a new skill entirely.

If AI continues its current acceleration, the conventional manager—trapped in endless meetings, shuffling tasks, and chasing status updates—won’t survive. Instead, managers will orchestrate hybrid teams of humans and AI agents, a role that pulls them deeper into architecture, specs, and system validation rather than pure coordination. This shift is already routine in startups and mid‑sized companies, and big corporates are piloting agentic workflows.

This post is about how I’m learning to develop with AI in that world, and why spec‑driven development has become the backbone of my workflow. I’ll cover the more technical side of what I’ve learned in a follow‑up post.

Ground rules for AI‑driven development

Before we get to specs, there are a few general rules that make AI development tolerable instead of chaotic.

  1. Keep the agent inside your workflow, not in another window.

    This sounds basic, but if you generate code in a chat tab, then copy‑paste it into your repo, you’re doing manual CI for an automated age. It’s far better to run the agent where the code lives—inside your IDE or directly against the repository—so it can edit files, run tests, and see the full context.

  2. Make the agent run tests, not just write code.

    The loop should be: agent writes a small change, runs unit or integration tests, and only then proposes a diff. That keeps changes small, makes rollbacks trivial, and turns “hallucinations” into failing tests instead of production incidents.

  3. Wire AI into CI/CD, not around it.

    Treat AI like any other contributor whose work must pass the pipeline. Modern teams are using CI to lint specs, validate OpenAPI contracts, run schema checks, and even evaluate LLM behavior before deployment. The important part is that nothing bypasses CI just because it was written by an impressive model.

  4. Work in small, reviewable increments.

    The worst pattern is “generate a whole module, paste it into main, hope it works.” The better one: create a feature branch, let the agent tackle one function or thin vertical slice at a time, run the tests there, review the diff, iterate, then merge via PR.

  5. Continuously evolve the instructions you give the agent.

    If the agent keeps making the same mistake—wrong logging format, missing edge cases, insecure defaults—that’s a signal to update your instructions, not to complain louder next time. Over time you build a kind of “house style” for the agent: a short spec about how your code should look and behave.
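Rules 2 and 4 can be sketched as one tiny control loop. Everything here is illustrative—`apply_patch`, `run_tests`, and `rollback` are placeholders for whatever your agent harness actually provides (writing files on a feature branch, running `pytest`, `git checkout -- .`), not a real agent API:

```python
from typing import Callable

def agent_increment(apply_patch: Callable[[], None],
                    run_tests: Callable[[], bool],
                    rollback: Callable[[], None]) -> bool:
    """One small, reviewable step: patch, test, then keep or roll back."""
    apply_patch()        # agent writes one small change
    if run_tests():      # the gate: tests, not vibes
        return True      # change survives; surface the diff for human review
    rollback()           # a hallucination becomes a no-op,
    return False         # not a production incident
```

The point of the shape is the gate: a change that fails its tests never even becomes a diff worth reviewing.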

These basics already help, but they still leave one big question open: where does the “truth” of the system live? In your head? In a random chat history? In a half‑updated Confluence page? That’s where spec‑driven development comes in.

The spec‑driven development loop

I started running the spec-driven loop after too many AI sessions left me with chat histories instead of a system I could reason about. The loop puts the spec at the center and divides labor cleanly: humans own the key decisions, agents do the mechanical lifting, and a handful of plain files keeps everything grounded. All file changes happen through prompts to an AI agent—you describe what needs to change, it proposes the diff, you review and commit. Here's how the loop flows in practice, step by step:

  1. Gather requirements. It always begins with messy inputs—product docs, Slack threads, user stories. You prompt an agent to summarize them into a single requirements.md file with clear sections for capabilities, constraints, and open questions. You answer those questions yourself, since only humans know the real tradeoffs.

  2. Update the spec. With requirements clarified, you prompt the agent to create spec.md or update it, pulling in the new requirements. This captures endpoints, events, data models, or acceptance criteria as needed. You review the file—specs are the system's truth, so they commit first, before any code moves.

  3. Plan tasks. Prompt the agent again, feeding it the updated spec; it generates tasks.md, breaking work into small slices like "add login endpoint" or "handle expired tokens," each tied to a spec section with success criteria. You approve the plan, tweaking priorities or scope.

  4. Implement and test. For each task, you prompt the agent with its spec slice and context—it writes code and tests directly into a feature branch, runs them locally, and opens a PR. The agent handles initial testing, but you review the PR (alongside CI) to catch anything tests miss—like logic gaps or security issues—before merging.

  5. Validate and merge. CI runs spec compliance checks, contract tests, and the full suite. You review the PR diff against the task and spec, and merge if everything aligns.

  6. Deploy and evolve. CI deploys on merge. Production telemetry or user feedback loops back to requirements.md, and you restart from the top—updating the spec first if behavior needs to change.
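Step 5's "spec compliance checks" can start as a script in CI. This sketch assumes a made-up convention—each task line in tasks.md ends with `(spec: Section Name)` matching a heading in spec.md—so treat it as a starting point, not a standard tool:

```python
import re

def unlinked_tasks(spec_md: str, tasks_md: str) -> list[str]:
    """Return task bullets whose '(spec: ...)' tag matches no spec heading.

    Assumed convention (illustrative): spec.md uses markdown headings,
    and each task bullet in tasks.md ends with '(spec: Section Name)'.
    """
    headings = {m.group(1).strip().lower()
                for m in re.finditer(r"^#+\s*(.+?)\s*$", spec_md, re.M)}
    bad = []
    for line in tasks_md.splitlines():
        m = re.match(r"-\s+(.*)\(spec:\s*([^)]+)\)\s*$", line.strip())
        if m and m.group(2).strip().lower() not in headings:
            bad.append(line.strip())
    return bad  # CI fails the build if this list is non-empty
```

Run it as a CI step that exits non-zero when the list is non-empty, and "every task ties to a spec section" becomes an enforced invariant instead of a habit.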

Spec-driven development workflow

The beauty is how the spec anchors everything. Agents can't drift because every prompt references it. Humans stay in control of intent without micromanaging lines of code. After a few cycles, your repository turns into a spec-first machine.
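Concretely, a spec-first repository might look like this (the file names follow the loop above; the layout is a suggestion, not a requirement):

```
repo/
├── requirements.md   # step 1: distilled inputs and open questions
├── spec.md           # step 2: the system's source of truth
├── tasks.md          # step 3: small slices, each tied to a spec section
├── src/              # steps 4–6: code that must match the spec
└── tests/            # the gate the agent runs before any PR
```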

The new manager: architect of humans and agents

If you run a tech team, your role sharpens to architecting humans and agents into one seamless system.

You integrate them by dividing labor clearly: agents scaffold code, tests, and docs from specs; developers handle architecture, complex logic, and oversight. Agents tackle narrow tasks with escalations routed to humans, all flowing through spec-driven CI/CD gates.

You drive operational excellence—designing processes, choosing agent-friendly stacks (Cursor, spec tools, evals frameworks), and growing talent into roles like "AI reliability engineer." You keep developing your people too, so they master new technologies while deepening their grasp of the business and the products they're building. Track agent metrics alongside human impact, and foster a culture where AI amplifies talent.

That's your leverage: processes that hum, agents that scale humans, predictable shipping.

Conclusion

Spec-driven development doesn't just make AI coding faster; it makes it reliable, turning agents into extensions of your intent rather than sources of chaos. Start with one spec file in your next project—you'll wonder how you built without it.


Written by Egor Burlakov

Engineering and Science Leader with experience building scalable data infrastructure, data pipelines and science applications. Sharing insights about data tools, architecture patterns, and best practices.
