Same Flow, Different Purpose

Many AI workflow frameworks look similar because almost all serious software delivery has the same backbone:

```mermaid
flowchart LR
    A[Clarify] --> B[Plan]
    B --> C[Implement]
    C --> D[Test]
    D --> E[Review]
    E --> F[Ship]
```

That similarity is real. The difference is what each framework treats as the main problem and what artifact it makes authoritative.

The core confusion

When users see:

```text
plan -> implement -> review
```

in every framework, they assume the frameworks are interchangeable.

They are not.

The common flow is the skeleton. The distinguishing factor is the operating model.

| Question | Why it matters |
| --- | --- |
| What is the source of truth? | Determines what the agent must obey |
| What failure is the framework optimized against? | Determines when it is useful |
| Who approves decisions? | Determines governance model |
| How is context preserved? | Determines long-running reliability |
| How is correctness proven? | Determines verification strength |
| How much ceremony is expected? | Determines team fit |
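
One way to make these questions concrete is to record each framework's answers as a small profile and compare profiles directly. The sketch below is illustrative only: the key names and the `differing_answers` helper are hypothetical, and the two partial profiles use only the source-of-truth answers given on this page.

```python
# The six operating-model questions and what each answer determines.
# Key names are hypothetical labels chosen for this sketch.
OPERATING_MODEL_QUESTIONS = {
    "source_of_truth": "what the agent must obey",
    "failure_optimized_against": "when it is useful",
    "decision_approver": "governance model",
    "context_preservation": "long-running reliability",
    "correctness_proof": "verification strength",
    "expected_ceremony": "team fit",
}


def differing_answers(a: dict, b: dict) -> list:
    """Return the questions on which two framework profiles disagree."""
    return [q for q in OPERATING_MODEL_QUESTIONS if a.get(q) != b.get(q)]


# Partial profiles: only the answers stated on this page are filled in.
spec_kit = {"source_of_truth": "Feature specs, plans, and tasks"}
openspec = {"source_of_truth": "openspec/specs/ and openspec/changes/"}

print(differing_answers(spec_kit, openspec))  # ['source_of_truth']
```

Two frameworks with the same plan -> implement -> review flow can still disagree on every one of these keys, which is the point of comparing operating models rather than flows.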

Same verbs, different ownership

Plan, implement, and review also appear outside workflow frameworks. A harness may plan file edits, an app framework may plan a tool path, an eval loop may review outputs, and a governance workflow may review evidence.

| Layer | What plan means | What review means |
| --- | --- | --- |
| Workflow/methodology | plan delivery artifacts and task order | review specs, plans, code evidence, approvals |
| Agent harness/runtime | plan terminal commands, file edits, tool calls | inspect diffs, command output, tests |
| Agent app framework | plan runtime path through chains, tools, or graph nodes | evaluate output, state transitions, tool trajectory |
| Evals/observability | plan test coverage and datasets | compare traces, scores, regressions |
| Security/governance | plan risk controls and approval boundaries | approve evidence and audit trail |

This is why the AI Engineering Stack Map matters. The same verb has different artifacts and different accountability at each layer.

Same verbs, different meaning

| Verb | Spec Kit | OpenSpec | AWS AI-DLC | GSD | Superpowers |
| --- | --- | --- | --- | --- | --- |
| Plan | Turn spec into implementation plan | Create/adjust change artifacts | Gate lifecycle decisions and construction plans | Prepare executable phase plan | Write detailed implementation plan |
| Implement | Build tasks from spec | Apply change tasks | Execute approved construction units | Dispatch tasks, often via subagents | Implement test-first |
| Review | Check spec/plan/task consistency | Review change artifacts before sync/archive | Human approval and audit | Verify phase output | Code review and TDD evidence |
| Ship | Complete feature against spec | Sync/archive change into current specs | Update state/audit and release readiness | Ship phase/PR/milestone | Finish branch |

The verbs overlap. The contract behind them differs.

What each framework is really optimizing

```mermaid
flowchart TB
    A[Framework choice] --> B{Primary optimization}
    B -->|Spec correctness| SK[Spec Kit]
    B -->|Lightweight change specs| OS[OpenSpec]
    B -->|Lifecycle governance| AD[AWS AI-DLC]
    B -->|Execution throughput and context| GSD[GSD]
    B -->|Engineering discipline| SP[Superpowers]
```

| Framework | Primary optimization | Best mental model |
| --- | --- | --- |
| Spec Kit | Feature specification correctness | Spec compiler |
| OpenSpec | Lightweight iterative change control | Change proposal and delta-spec workspace |
| AWS AI-DLC | Governed AI delivery lifecycle | Delivery governance cockpit |
| GSD | Multi-session, multi-agent execution | Agent delivery factory |
| Superpowers | Test-first engineering behavior | Engineering discipline layer |

Source of truth difference

| Framework | Source of truth |
| --- | --- |
| Spec Kit | Feature specs, plans, and tasks |
| OpenSpec | openspec/specs/ for current behavior; openspec/changes/ for proposed behavior |
| AWS AI-DLC | aidlc-docs/, state, audit, lifecycle artifacts |
| GSD | .planning/ project memory and phase state |
| Superpowers | Approved plan, tests, review findings, branch state |

This is the most important difference. If you know the source of truth, you know how the framework thinks.
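
Because several of these sources of truth are directories, a repository can be inspected to see which framework's authority a given file falls under. The following is a minimal sketch, assuming the directories named in the table sit at the repository root. The `authoritative_framework` helper and the example file names are hypothetical and not part of any framework's tooling.

```python
from pathlib import PurePosixPath
from typing import Optional

# Directory prefixes taken from the source-of-truth table above.
# Assumption: they are located at the repository root.
SOURCE_OF_TRUTH_DIRS = {
    "openspec/specs": "OpenSpec (current behavior)",
    "openspec/changes": "OpenSpec (proposed behavior)",
    "aidlc-docs": "AWS AI-DLC",
    ".planning": "GSD",
}


def authoritative_framework(path: str) -> Optional[str]:
    """Return the framework whose source of truth contains `path`, if any."""
    parts = PurePosixPath(path).parts
    for prefix, owner in SOURCE_OF_TRUTH_DIRS.items():
        prefix_parts = PurePosixPath(prefix).parts
        if parts[: len(prefix_parts)] == prefix_parts:
            return owner
    return None


# Example file names below are made up for illustration.
print(authoritative_framework("openspec/changes/add-login/proposal.md"))
# OpenSpec (proposed behavior)
print(authoritative_framework("src/app.py"))
# None
```

Spec Kit and Superpowers are left out of the mapping because this page does not name a directory for them.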

Failure mode difference

| Framework | It prevents... | But can fail by... |
| --- | --- | --- |
| Spec Kit | Building the wrong feature from vague requirements | Creating polished but incorrect specs |
| OpenSpec | Losing track of proposed changes in chat history | Being too light for high-risk governance |
| AWS AI-DLC | AI delivery without accountability | Becoming paperwork if approvals are rubber-stamped |
| GSD | Context collapse and slow multi-task execution | Automating too much before review catches up |
| Superpowers | Code-first agent behavior without tests/review | Becoming ritual if tests are weak or skipped |

Ceremony spectrum

```mermaid
flowchart LR
    A[Low ceremony] --> OS[OpenSpec]
    OS --> SP[Superpowers]
    SP --> SK[Spec Kit]
    SK --> GSD[GSD]
    GSD --> AD[AWS AI-DLC]
    AD --> B[High governance]
```

This is not a quality ranking. It is a ceremony/governance ranking. Low ceremony can be excellent for speed. High governance can be necessary for risk.
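
The spectrum can be treated as an ordered list when a team wants to cap how much ceremony it accepts. The sketch below only encodes the left-to-right order from the diagram. The helper names are hypothetical, and the ordering says nothing about quality.

```python
# Left-to-right order from the ceremony spectrum diagram (low to high).
CEREMONY_ORDER = ["OpenSpec", "Superpowers", "Spec Kit", "GSD", "AWS AI-DLC"]


def at_or_below(ceiling: str) -> list:
    """Frameworks whose ceremony is no higher than `ceiling`."""
    return CEREMONY_ORDER[: CEREMONY_ORDER.index(ceiling) + 1]


def lighter_of(a: str, b: str) -> str:
    """Return whichever framework sits further toward low ceremony."""
    return min((a, b), key=CEREMONY_ORDER.index)


print(at_or_below("Spec Kit"))        # ['OpenSpec', 'Superpowers', 'Spec Kit']
print(lighter_of("GSD", "OpenSpec"))  # OpenSpec
```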

When two frameworks look identical, ask these questions

  1. Does this framework own requirements, execution, governance, or behavior?
  2. Where does it store memory?
  3. What does it do when code and spec disagree?
  4. Does it optimize for clarity, speed, control, or quality?
  5. Does it assume one agent, many agents, or human review boards?
  6. Can I skip steps safely for low-risk work?
  7. What evidence proves "done"?

Quick distinction table

| User says... | Likely framework |
| --- | --- |
| "I need AI to understand the feature correctly before coding." | Spec Kit |
| "I want a lighter spec system that fits existing code and iterative changes." | OpenSpec |
| "I need approvals, audit, NFRs, and human accountability." | AWS AI-DLC |
| "I need AI to keep working across many sessions and parallel tasks." | GSD |
| "I need the agent to stop coding recklessly and use tests/review." | Superpowers |
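
The table reads naturally as a lookup from a stated need to a likely framework. Here is a minimal sketch of that lookup. The `Need` enum and its member names are invented for illustration, and a team with several needs may reasonably end up with more than one framework.

```python
from enum import Enum


class Need(Enum):
    """Hypothetical labels for the five user statements in the table above."""
    FEATURE_UNDERSTANDING = "understand the feature correctly before coding"
    LIGHT_ITERATIVE_SPECS = "lighter spec system for existing code"
    APPROVALS_AND_AUDIT = "approvals, audit, NFRs, human accountability"
    MULTI_SESSION_WORK = "many sessions and parallel tasks"
    TEST_AND_REVIEW_DISCIPLINE = "stop coding recklessly, use tests/review"


LIKELY_FRAMEWORK = {
    Need.FEATURE_UNDERSTANDING: "Spec Kit",
    Need.LIGHT_ITERATIVE_SPECS: "OpenSpec",
    Need.APPROVALS_AND_AUDIT: "AWS AI-DLC",
    Need.MULTI_SESSION_WORK: "GSD",
    Need.TEST_AND_REVIEW_DISCIPLINE: "Superpowers",
}


def suggest(needs: list) -> list:
    """Map each stated need to its likely framework, preserving input order."""
    return [LIKELY_FRAMEWORK[need] for need in needs]


print(suggest([Need.APPROVALS_AND_AUDIT, Need.MULTI_SESSION_WORK]))
# ['AWS AI-DLC', 'GSD']
```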

The one-sentence distinction

They all contain planning, implementation, and review because all good software delivery does. They differ in what they make authoritative: Spec Kit makes feature specs authoritative, OpenSpec makes current specs and proposed changes authoritative, AI-DLC makes lifecycle approvals authoritative, GSD makes project memory and phase execution authoritative, and Superpowers makes engineering discipline and test evidence authoritative.

Where Hermes fits

Hermes is different again: it is not mainly another plan/implement/review workflow. It is an agent harness/runtime that can execute those workflows.

```mermaid
flowchart TB
    H[Hermes Agent runtime] --> SK[Spec Kit workflow]
    H --> OS[OpenSpec workflow]
    H --> AD[AI-DLC governance]
    H --> SP[Superpowers discipline]
    H --> GX[GSD-like execution patterns]
```

In most cases, "Hermes vs Spec Kit" is the wrong comparison. The more useful question is:

```text
Should Hermes be the runtime that runs Spec Kit/OpenSpec/AI-DLC/Superpowers?
```

| Question | Answer |
| --- | --- |
| Does Hermes define a source-of-truth artifact model like OpenSpec? | Not primarily |
| Does Hermes define enterprise lifecycle governance like AI-DLC? | No |
| Does Hermes define TDD/review discipline like Superpowers? | Not by itself |
| Does Hermes provide tools, memory, skills, subagents, runtime control? | Yes |

If workflow frameworks are the operating process, Hermes is the programmable agent machine that can run the process.
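
The split between process and machine can be sketched in a few lines. This is not Hermes code and does not reflect its API. `Workflow`, `Runtime`, and both example workflows are hypothetical stand-ins that only show one runtime executing different processes unchanged.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Workflow:
    """The operating process: which steps happen, and in what order."""
    name: str
    steps: List[str]


class Runtime:
    """Stand-in for an agent harness: supplies execution, not the process."""

    def __init__(self, execute_step: Callable[[str], str]):
        self.execute_step = execute_step

    def run(self, workflow: Workflow) -> List[str]:
        # The runtime does not decide the order; the workflow does.
        return [self.execute_step(step) for step in workflow.steps]


# Two made-up workflows with different contracts.
spec_first = Workflow("spec-first (hypothetical)", ["clarify", "plan", "implement", "review"])
test_first = Workflow("test-first (hypothetical)", ["plan", "test", "implement", "review"])

runtime = Runtime(lambda step: f"ran {step}")
print(runtime.run(spec_first))  # ['ran clarify', 'ran plan', 'ran implement', 'ran review']
print(runtime.run(test_first))  # ['ran plan', 'ran test', 'ran implement', 'ran review']
```

Swapping the workflow changes what gets done and in what order. Swapping the runtime changes how each step is carried out.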

Where LangChain and LangGraph fit

LangChain and LangGraph are different from both workflow frameworks and coding agent CLIs. They are used to build AI applications or agent systems.

```mermaid
flowchart LR
    LC[LangChain] --> APP[AI app / RAG / tool agent]
    LG[LangGraph] --> ORCH[Stateful agent orchestration]
    H[Hermes] --> CLI[Agent CLI/runtime]
    WF[Spec Kit / OpenSpec / AI-DLC] --> PROCESS[Delivery process]
```

| Tool | It mainly defines |
| --- | --- |
| LangChain | App-level model/tool/retriever/agent composition |
| LangGraph | Stateful graph orchestration for agent apps |
| Hermes | Runtime/harness for running an agent |
| Spec Kit/OpenSpec/AI-DLC | Delivery workflow and artifacts |

Do not use LangGraph as a replacement for AI-DLC. LangGraph orchestrates runtime behavior; AI-DLC governs delivery decisions.

Built as a static bilingual AI engineering stack guide.