
AI Development Methodologies Compared: From Vibe Coding to SDD

Deep comparison of six AI development methodologies — Vibe Coding, SDD, BMAD, Ralph Loop, Context-Driven Development, and pragmatic hybrid workflows. With Martin Fowler's critique and practical recommendations.

Bruce

AI Coding · Spec-Driven Development · Vibe Coding · BMAD Method · AI Workflow

AI Guides

2594  Words

2026-03-11 10:00 +0000


AI development methodologies comparison — Vibe Coding, SDD, BMAD, and pragmatic workflows

In early 2025, Andrej Karpathy coined the term “Vibe Coding,” and the AI-assisted development revolution began. In Y Combinator’s Winter 2025 batch, 25% of the startups had codebases that were 95% AI-generated. But the honeymoon didn’t last — quality issues, technical debt, and project chaos forced the industry to rethink how humans and AI should collaborate on code.

This article is a deep dive into the six major AI development methodologies that emerged from this reckoning. I’ll break down what each one gets right, what it gets wrong, and — most importantly — which one you should actually use. This isn’t a surface-level overview; it’s built from hands-on experience, Martin Fowler’s team analysis, Peter Steinberger’s evolving workflow, and real production data.

The Problem: Why We Need Methodologies at All

Before 2025, most developers used AI as a fancy autocomplete. Type a comment, get a suggestion, accept or reject. Simple.

Then came agentic coding — AI that could plan, execute multi-step tasks, write entire features. Suddenly, the bottleneck shifted. The hard part wasn’t writing code anymore; it was telling the AI what to write, and making sure it actually did what you wanted.

Think of it like hiring a team of incredibly fast but occasionally confused junior developers. They can write code at superhuman speed, but without clear direction, they’ll build the wrong thing — beautifully, confidently, and at scale.

That’s why methodologies matter. They’re frameworks for giving AI the right context, the right constraints, and the right feedback loops.

The Six Methodologies

1. Vibe Coding — The Wild West

Origin: Andrej Karpathy, February 2025

Vibe Coding is the simplest approach: just talk to your AI and see what happens. No planning, no specs, no formal process. You describe what you want, the AI generates code, you look at the result, and you iterate through conversation.

Idea → Chat with AI → Code → Check result → Keep chatting

When it works: Quick prototypes, one-off scripts, exploratory hacking, personal toy projects. If you’re building something you’ll throw away next week, Vibe Coding is perfect.

When it breaks: The moment your project grows beyond a few hundred lines. Without structure, AI tends to accumulate contradictions — fixing one bug while introducing two new ones. Vibe Coding feels magical for the first hour and painful by hour ten.

The core problem: Vibe Coding treats code as disposable. That’s fine when it is disposable. But most professional software isn’t.

Think of Vibe Coding as cooking without a recipe. Great for scrambled eggs. Terrible for a wedding cake.

2. Spec-Driven Development (SDD) — The Pendulum Swings

Champions: AWS Kiro, GitHub Spec-Kit, ThoughtWorks

SDD is the opposite extreme: write comprehensive specifications before any code exists, then let AI generate code from those specs. The spec is the source of truth; code is just a byproduct.

SDD operates at three levels, as defined by Martin Fowler’s team:

| Level | Meaning | What humans edit |
|---|---|---|
| Spec-first | Spec guides initial development, then discarded | Code |
| Spec-anchored | Spec lives on, updated as features evolve | Spec + Code |
| Spec-as-source | Spec is the only source of truth, code fully generated | Only spec |

A typical SDD workflow (Kiro-style) looks like this:

Requirements (user stories + Given/When/Then acceptance criteria)
Design (architecture diagrams + API contracts + data models)
Tasks (atomic implementation steps)
Implementation (AI writes code task-by-task)

Each module produces three files: requirements.md, design.md, and tasks.md.
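To make the Requirements stage concrete, here is a hypothetical fragment of a requirements.md in the Given/When/Then style. The password-reset feature and its criteria are invented for illustration, not taken from any real Kiro project:

```markdown
## User Story 1: Password reset

As a registered user, I want to reset my password by email,
so that I can regain access to my account.

### Acceptance Criteria

- GIVEN a registered email address
  WHEN the user requests a password reset
  THEN a single-use reset link is emailed within 5 minutes

- GIVEN an expired reset link
  WHEN the user opens it
  THEN an error is shown and a fresh link can be requested
```

Each criterion is a testable behavior, which is what lets the later Tasks stage break implementation into atomic, verifiable steps.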

When it works: Large projects with clear requirements, multi-person teams needing audit trails, complex business logic where ambiguity is dangerous.

When it breaks: Almost everywhere else. And the problems are serious enough that Martin Fowler’s team devoted significant analysis to them.

Martin Fowler’s Critique of SDD

After hands-on evaluation of SDD tools, Fowler’s team identified six systemic issues:

  1. Rigid workflows: One fixed process can’t fit all problem sizes. A typo fix and an architecture redesign shouldn’t go through the same ceremony. In their testing, a small bug fix got expanded into 4 user stories with 16 acceptance criteria — “like using a sledgehammer to crack a nut.”

  2. Review fatigue: Teams found that reviewing mountains of markdown specs was worse than reviewing code. One engineer said: “I’d rather review code than review these markdown files.”

  3. Fragile control: Even with carefully designed prompts and large context windows, AI agents frequently ignored or over-interpreted specifications. The spec said one thing; the code did another.

  4. Spec drift: Over time, specs and code inevitably diverge. Keeping them synchronized becomes a full-time job that nobody wants.

  5. Unclear audience: Who is SDD for? Full-stack developers? Product-developer pairs? Domain experts? The methodology doesn’t have a clear answer.

  6. History repeating: SDD bears a striking resemblance to Model-Driven Development (MDD) from the 2000s — the same promise of generating code from higher-level abstractions, the same brittleness when real-world complexity intrudes. MDD ultimately failed. SDD faces the same risks.

3. Peter Steinberger’s Workflow — The Pragmatic Middle Ground

Creator: Peter Steinberger (OpenClaw author, former PSPDFKit founder, later joined OpenAI)

Peter’s approach is perhaps the most interesting because it evolved through failure. He started as an SDD true believer and ended up as one of its sharpest critics.

Early 2025 — Full SDD:

Voice ideation → AI drafts spec → AI reviews spec → Iterate → Implement

Mid 2025 — Disillusionment:

Chat directly with AI → Build features together → Write specs only for complex tasks

Current 2026 — Multi-agent + Lean docs:

Lean AGENTS.md (core rules)
  + docs/ with frontmatter index (AI reads on demand)
  + 3-8 agents working in parallel
  + Atomic git commits

The Frontmatter Index Innovation

Peter’s key innovation is a document indexing system using YAML frontmatter. Instead of stuffing every document into the AI’s context window, each doc gets metadata headers:

---
summary: "Model authentication: OAuth, API keys, and setup-token"
read_when:
  - Debugging model auth or OAuth expiry
  - Documenting authentication or credential storage
title: "Authentication"
---

When an AI agent starts up, it runs pnpm run docs:list to scan all frontmatter and build an index. It only reads full documents when needed — not everything upfront.

This is elegant because context windows are scarce resources. Cramming in every possible document actually degrades AI performance. Less is more.
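The pnpm run docs:list command is part of Peter's own tooling; as a hedged sketch of the same idea, a minimal Python version might scan docs/ for frontmatter and print a compact index. The summary and read_when field names follow the example above; the parsing approach and function names here are assumptions, and real tooling would use a proper YAML library:

```python
import re
from pathlib import Path

def parse_frontmatter(text: str) -> dict:
    """Extract the YAML frontmatter block between leading '---' fences.

    A tiny hand-rolled parser: handles 'key: value' pairs and simple
    '- item' lists, which is all the index needs.
    """
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    if not match:
        return {}
    meta, current_key = {}, None
    for line in match.group(1).splitlines():
        if line.lstrip().startswith("- "):
            # List item belonging to the most recent key (e.g. read_when)
            if current_key:
                meta.setdefault(current_key, []).append(line.split("- ", 1)[1].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            value = value.strip().strip('"')
            if value:
                meta[current_key] = value
    return meta

def build_index(docs_dir: str) -> list[dict]:
    """Return one {path, summary, read_when} entry per markdown doc."""
    index = []
    for path in sorted(Path(docs_dir).rglob("*.md")):
        meta = parse_frontmatter(path.read_text(encoding="utf-8"))
        index.append({
            "path": str(path),
            "summary": meta.get("summary", "(no summary)"),
            "read_when": meta.get("read_when", []),
        })
    return index
```

The agent reads only this index at startup, then fetches a full document when one of its read_when conditions matches the task at hand.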

Peter’s Core Principles

His philosophy can be distilled into four rules:

  • “No more writing full specs — just talk to it and build features together” — For most tasks, interactive iteration beats spec writing.
  • “Keep AGENTS.md lean. logs: axiom or vercel cli — one line is enough” — Concise, focused context beats comprehensive documentation.
  • “No MCP — use CLI tools instead” — Simple, direct CLI commands (Vercel, psql, gh, axiom) are more reliable than complex integrations.
  • "~20% of time goes to fully agent-driven refactoring" — AI excels at large-scale mechanical refactoring. Let it run autonomously for that.

4. BMAD Method — The Enterprise Aircraft Carrier

Scale: The most comprehensive SDD framework. 21 specialized AI agents. 50+ guided workflows.

BMAD (Breakthrough Method for Agile AI-Driven Development) simulates an entire product team with AI:

/analyst → Product brief
/pm → PRD (Product Requirements Document)
/architect → Architecture design + User stories
/dev → Implementation

Each stage is handled by a different AI “persona” with specific expertise, responsibilities, and constraints.

Real-world case study: A 3-person team used BMAD to convert a 50,000 LOC COBOL system to Java Spring Boot, reducing integration time by 40%. Production teams report delivering 2.7x faster with 75% fewer bugs.

When it works: Enterprise-scale projects, legacy system migrations, teams that need compliance audits and governance trails.

When it doesn’t: Anything smaller than “enterprise.” If your project doesn’t justify 21 AI agent roles and 50+ workflows, BMAD is overkill. The learning curve is steep, and the coordination overhead of managing that many agent personas can exceed the time saved.

BMAD is a battleship. Perfect for crossing oceans. Absurd for fishing in a pond.

5. Ralph Wiggum Loop — Embracing the Fresh Start

Creator: Geoffrey Huntley

Named after the lovably oblivious Simpsons character, the Ralph Loop takes a radically different approach: every AI session starts completely fresh, with git serving as the persistence layer.

Start new agent process (clean context)
Read spec + task list from disk
Pick a task
Implement + git commit
Exit
(Loop)

Control tokens manage the flow:

  • CONTINUE: Keep working on the current task
  • SEND: <message>: Send a message to the human
  • RESTART: Abandon progress and start over

The key insight is that context window pollution is a real problem. As AI agents work longer, they accumulate stale context, contradictory instructions, and cognitive drift. By starting fresh each iteration but persisting progress through files and git commits, Ralph avoids this entirely.

When it works: Long-running automated tasks, projects where context windows overflow, scenarios where you want AI to run autonomously overnight.

When it doesn’t: Tasks requiring deep contextual understanding that can’t be externalized to files. Each restart loses nuanced session context.

6. Context-Driven Development (CDD) — The Philosophy

CDD is less a specific framework and more a guiding principle: focus on giving AI the right context, not on writing the right spec.

| Dimension | SDD | CDD |
|---|---|---|
| Focus | What spec to write | What context to provide |
| Investment | Upfront documentation | Organizing codebase and references |
| Flexibility | Fixed process | Adaptive |

CDD practitioners focus on: codebase understanding, error messages, screenshots, relevant code snippets — whatever helps the AI understand the current situation rather than following a predetermined plan.

This aligns with the emerging field of context engineering, which treats the information environment around AI as the primary lever for quality output.
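As a toy illustration of the CDD mindset, a context assembler gathers whatever artifacts describe the current situation and joins them into a prompt. The function name, labels, and character budget are invented for this sketch; the point is that the investment goes into choosing these pieces, not into writing a spec up front:

```python
def assemble_context(task: str, artifacts: dict[str, str],
                     budget_chars: int = 8000) -> str:
    """Concatenate labeled artifacts (error logs, code snippets, notes)
    into a single prompt, most relevant first, truncated to a budget.

    The budget matters: context windows are scarce, and cramming in
    everything degrades output quality.
    """
    parts = [f"# Task\n{task}"]
    for label, content in artifacts.items():
        parts.append(f"# {label}\n{content}")
    prompt = "\n\n".join(parts)
    return prompt[:budget_chars]
```

Ordering artifacts by relevance before the budget cut is the whole game: the least useful material is what gets truncated.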

SDD Tool Landscape

If you decide SDD is right for your project, here’s how the tools compare:

AWS Kiro

Requirements → Design → Tasks (three stages)
  • Engine: Claude Sonnet
  • Format: 3 markdown files per module
  • Special: EARS notation for requirements, “steering” memory banks
  • Problem: Even trivial bugs get the full three-file treatment
  • Verdict: Good for teams already in the AWS ecosystem. Read our Kiro review for a deeper analysis.

GitHub Spec-Kit

Constitution → Specify → Plan → Tasks (four-stage cycle)
  • Distribution: Open-source CLI tool
  • Special: Slash commands (/specify), immutable “Constitution” rules file
  • Problem: Generated markdown tends to be verbose and repetitive

Tessl Framework

Spec → Generate Code → Test (spec-as-source)
  • Status: Private beta
  • Special: Code marked as // GENERATED FROM SPEC - DO NOT EDIT
  • Problem: Non-deterministic code generation from identical specs — same spec, different code each time
  • Verdict: Called “MDD’s rigidity + non-determinism” by critics

cc-sdd

Kiro-style commands, supports 7+ AI tools
  • Origin: Japanese development team
  • Special: Interchangeable with Kiro specs, supports Claude Code, Codex, Cursor, Gemini CLI
  • Verdict: Best tool-agnostic option if you want SDD without vendor lock-in

Comprehensive Comparison

| Dimension | Vibe Coding | SDD (strict) | Peter’s Workflow | BMAD | Ralph Loop |
|---|---|---|---|---|---|
| Learning curve | None | Medium | Low | High | Medium |
| Upfront cost | None | High (write specs) | Low (lean docs) | Very high | Medium (task lists) |
| Flexibility | Highest | Lowest | High | Low | Medium |
| Quality guarantee | None | High | Medium-High | Very high | Medium |
| Best scale | Small | Large | Medium | Very large | Medium-Large |
| Team collaboration | Poor | Good | Medium | Excellent | Medium |
| Tool dependency | None | Kiro/Spec-Kit | AGENTS.md + frontmatter | BMAD framework | Ralph scripts |
| Current momentum | Cooling down | Hot but controversial | Respected in practitioner circles | Niche but deep adoption | Niche |

The Pragmatic Hybrid: What Actually Works

After studying all six methodologies, here’s what I recommend — a hybrid approach that takes the best from each:

Match Methodology to Task Complexity

Low complexity (bug fix, small feature)
  → Vibe Coding or interactive iteration. No spec needed.

Medium complexity (new module, new feature)
  → Peter's workflow: lean AGENTS.md + docs/ frontmatter index + write spec only when needed

High complexity (architecture design, system refactoring)
  → SDD: write requirements + design + tasks, but don't be dogmatic

Enterprise-scale projects
  → BMAD or customized SDD process

The Five Pillars of Hybrid AI Development

  1. Keep AGENTS.md lean (Peter-style) — Core rules + tech stack + constraints, under 200 lines. Your CLAUDE.md or AGENTS.md should be a cheat sheet, not an encyclopedia.

  2. Add frontmatter to docs/ (OpenClaw-style) — summary + read_when headers so AI reads documents on demand, not all at once.

  3. Write specs only for complex features (SDD-style) — Not every task needs the three-file ceremony. Reserve it for features where ambiguity could cause real damage.

  4. Atomic git commits (Ralph-style) — Each change is independent and rollable. This is crucial when multiple agents work in parallel.

  5. Interactive iteration as default (Peter’s current practice) — For most tasks, just talk to your AI. Build features through conversation. Drop into formal processes only when complexity demands it.

project/
├── AGENTS.md              # Lean project directives (universal for all AI tools)
└── docs/
    ├── steering/          # Rarely-changed "constitution" docs (with frontmatter)
    ├── specs/             # On-demand feature specs (with frontmatter)
    └── research/          # Research notes and decision records (with frontmatter)

Every document gets a frontmatter header:

---
summary: "One sentence explaining what this document is for"
read_when:
  - When should the AI read this document
title: "Document Title"
---

Key Takeaways

  1. There is no one-size-fits-all methodology. Anyone selling a single approach for all situations is selling snake oil. Match your process to your problem size.

  2. SDD’s promise exceeds current tool maturity. Martin Fowler’s team found that AI agents frequently ignore specs, generating an illusion of control rather than actual control. The tooling will improve, but we’re not there yet.

  3. Less context is often more. Peter Steinberger’s biggest insight: stuffing more information into the AI’s context window reduces quality. Focused, relevant context beats comprehensive documentation.

  4. Git is the real persistence layer. Whether you use Ralph Loops or multi-agent workflows, git commits are how you maintain sanity. High-frequency atomic commits are not optional — they’re essential.

  5. The best methodology is the one you’ll actually follow. A perfect SDD process that your team ignores is worse than informal Vibe Coding that your team actually uses. Start simple, add structure as pain points emerge.

FAQ

Q: Should I start with Vibe Coding or SDD?

Start with Vibe Coding for learning and prototyping, then graduate to a hybrid approach as your project grows. Jumping straight to full SDD is like buying enterprise software for a side project.

Q: Is SDD going to replace traditional development?

Not in its current form. The tools are too rigid and the AI compliance is too unreliable. But the principle — being explicit about what you want before asking AI to build it — is sound and will persist in some form.

Q: How many agents should I run in parallel?

Peter Steinberger’s experience suggests 3-8 agents is the sweet spot. Below 3, you’re underutilizing. Above 8, coordination overhead exceeds the gains. For focused refactoring work, 4 agents is ideal.

Q: Can I mix methodologies within the same project?

Absolutely — and you should. Use Vibe Coding for quick fixes, Peter’s workflow for features, and SDD for complex architectural work. The hybrid approach described above is designed exactly for this.
