From finding to pull request.
14 stages. One command.
A bug report goes in. A pull request comes out. 14 stages in between, zero prompts. Does it always produce a merge-ready PR? No. But it runs end to end, and it shows where autonomous coding agents actually break down.
> /plugin install ai-architect-feedback-loop
Requires macOS 14+ with Apple Silicon (M1/M2/M3/M4)
What you get
```
$ /run-pipeline

[01/14] Parse Findings ............. PASS  1m 12s
[02/14] Impact Analysis ............ PASS  2m 48s
[03/14] Integration Design ......... PASS  3m 19s
[04/14] Plan Deliberation .......... Apple Intelligence only
[05/14] PRD Generation ............. PASS  18m 34s
[06/14] PRD Review ................. PASS  4m 11s
[07/14] Implementation ............. PASS  27m 03s
[08/14] Drift Reconciliation ....... Apple Intelligence only
[09/14] Agreement .................. Apple Intelligence only
[10/14] Quality Gates .............. PASS  2m 46s
[11/14] Semantic Verification ...... PASS  5m 22s
[12/14] Benchmark .................. PASS  1m 15s
[13/14] Deployment Simulation ...... PASS  1m 41s
[14/14] Pull Request ............... PASS  1m 08s

Pipeline complete. PR #142 created.
12 files changed, 1,847 insertions(+), 203 deletions(-)
```
What this pipeline does
It takes a finding and tries to ship a pull request without asking you anything.
Give it a bug report, a feature request, a research paper, or any actionable input. The pipeline parses it, scores the impact, designs the integration, generates a verified PRD, implements the code, runs your test suite, verifies the implementation against the spec, benchmarks the result, and opens a pull request. One command: /run-pipeline.
It is not magic. The pipeline is 14 sequential stages, each with defined inputs, outputs, and failure modes. When a stage fails, it retries with the failure context — up to 3 times — before moving on. 67 deterministic rules enforce output quality. A semantic verifier reads the diff against the PRD without having written a single line of the code it is checking.
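The staged design can be sketched as a list of functions, each consuming the artifacts of earlier stages. This is a minimal illustration under assumed names — the plugin's internal API is not published:

```python
def parse_findings(artifacts):
    # Stage 1 (toy version): turn raw input into a scored finding list.
    return [{"title": artifacts["input"], "score": 0.9}]

def impact_analysis(artifacts):
    # Stage 2 (toy version): keep only findings above a score threshold.
    return [f for f in artifacts["parse_findings"] if f["score"] >= 0.7]

STAGES = [("parse_findings", parse_findings),
          ("impact_analysis", impact_analysis)]

def run_pipeline(raw_input):
    """Run the stages in order; every stage's output is an inspectable artifact."""
    artifacts = {"input": raw_input}
    for name, stage in STAGES:
        artifacts[name] = stage(artifacts)
    return artifacts

result = run_pipeline("Bug: crash on empty config")
```

Because every stage writes into the shared artifact map, a failed run still leaves everything produced up to that point available for inspection.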
It works with Python, TypeScript, Go, Rust, Java, Kotlin, Swift, and more — through configuration, not code changes. It is a proof of concept: it demonstrates an architecture for autonomous development, and that architecture is meant to be refined.
Get Started in 2 Commands
Install, run. The pipeline handles the rest.
Install the plugin
> /plugin marketplace add cdeust/ai-architect-feedback-loop
> /plugin install ai-architect-feedback-loop
Start the pipeline
> /run-pipeline
The pipeline discovers findings, analyzes impact, generates a PRD, implements code, verifies the result, and delivers a pull request.
14 stages, each with defined inputs and outputs
Each stage runs autonomously and produces artifacts you can inspect. When a stage fails, the pipeline retries with the failure context.
01 Stage 1 — Parse Findings
Reads input — bug reports, feature requests, security advisories, research papers — filters by relevance category and score threshold, and produces a ranked list prioritized by multi-module impact.
02 Stage 2 — Impact Analysis
Computes a compound impact score across four dimensions: modules affected, propagation depth, contract impact, and test coverage delta. Findings must score above threshold and affect multiple modules to proceed.
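One way to read "compound score across four dimensions" is a weighted combination. The weights, normalizers, and threshold below are invented for illustration, not the pipeline's actual values:

```python
def impact_score(modules_affected, propagation_depth, contract_impact, coverage_delta):
    """Hypothetical compound score; real weights are the pipeline's, not these."""
    return (0.4 * min(modules_affected / 5, 1.0)
            + 0.3 * min(propagation_depth / 3, 1.0)
            + 0.2 * contract_impact     # assumed already in 0..1
            + 0.1 * coverage_delta)     # assumed already in 0..1

def proceeds(finding, threshold=0.5):
    # A finding must clear the threshold AND touch more than one module.
    return (finding["modules_affected"] > 1
            and impact_score(**finding) >= threshold)

f = {"modules_affected": 3, "propagation_depth": 2,
     "contract_impact": 0.5, "coverage_delta": 0.2}
print(proceeds(f))  # → True
```

The multi-module requirement acts as a hard gate on top of the score, so a high-scoring but single-module finding still stops here.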
03 Stage 3 — Integration Design
Designs architectural modifications respecting the target product's patterns. Enforces design principles: parameterization, centralization, composability, backward compatibility. Validates that all referenced files exist.
04 Stage 4 — Plan Deliberation
On macOS with Apple Intelligence, deliberates on the integration plan using on-device foundation models. Adds a second perspective before committing to the design. Skipped in CLI/Docker mode.
05 Stage 5 — PRD Generation
Invokes the AI PRD Generator to produce 9 verified files: overview, requirements, technical spec, user stories, acceptance criteria, roadmap, JIRA tickets, test cases, and verification report. Enforces 64 hard output rules.
06 Stage 6 — PRD Review
An independent AI review of the generated PRD for completeness, consistency, and actionability. Catches issues the generator missed — a second pair of eyes before implementation begins.
07 Stage 7 — Implementation
Creates a feature branch, implements code changes following the PRD and integration plan, builds the project, and runs tests. Enforces code quality, security, resilience, and testing rules.
08 Stage 8 — Drift Reconciliation
On macOS with Apple Intelligence, reconciles any drift between the PRD specification and the actual implementation. Catches cases where the code diverged from the plan. Skipped in CLI/Docker mode.
09 Stage 9 — Agreement
On macOS with Apple Intelligence, validates alignment between all pipeline artifacts — the PRD, implementation, and integration design. Skipped in CLI/Docker mode.
10 Stage 10 — Quality Gates
Deterministic checks — prohibited pattern detection, orphan file detection, build verification, full test suite, and deployment verification. No AI involved — pure structural validation.
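A prohibited-pattern gate like the one above can be as simple as a regex scan over the diff. The patterns here are invented for illustration; the pipeline's own deterministic rules are its own:

```python
import re

# Illustrative prohibited patterns only — not the pipeline's rule set.
PROHIBITED = [
    (re.compile(r"\bprint\("), "debug print left in code"),
    (re.compile(r"TODO|FIXME"), "unresolved TODO/FIXME"),
]

def quality_gate(diff_text):
    """Deterministic gate: no AI involved, just pattern matching on the diff."""
    return [(msg, match.group(0))
            for pattern, msg in PROHIBITED
            for match in pattern.finditer(diff_text)]

print(quality_gate("fix: raise timeout  # TODO tune later"))
# → [('unresolved TODO/FIXME', 'TODO')]
```

Because the check is pure string matching, it is cheap, reproducible, and its verdict never depends on a model's mood.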
11 Stage 11 — Semantic Verification
An independent verifier analyzes the git diff against the PRD. Checks alignment score (must be ≥ 0.7), cross-module consistency, anti-patterns, and solution genericity — flags hardcoded constants and non-extensible designs.
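The pass/fail logic around the alignment score can be sketched as below. The 0.7 threshold comes from the text; the report field names are invented:

```python
def verify(report):
    """Collect the reasons the semantic verifier would block the pipeline."""
    failures = []
    if report["alignment_score"] < 0.7:   # diff must match the PRD
        failures.append(f"alignment {report['alignment_score']:.2f} < 0.70")
    if report["hardcoded_constants"]:     # genericity check
        failures.append("hardcoded constants: "
                        + ", ".join(report["hardcoded_constants"]))
    return failures

report = {"alignment_score": 0.82, "hardcoded_constants": []}
print(verify(report))  # → []
```

Returning a list of reasons rather than a bare boolean is what lets the retry loop feed specific failure context back into the implementation stage.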
12 Stage 12 — Benchmark
Measures quality metrics and compares against baselines. Informational — does not block the pipeline. Tracks improvement over time.
13 Stage 13 — Deployment Simulation
Runs the configured deploy command in an isolated environment. Validates that migrations run, configuration applies, and the build artifact works. If no deploy command is configured, passes automatically.
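"Passes automatically when no deploy command is configured" translates to a guard like this sketch, assuming the deploy command is a shell string run via `subprocess`:

```python
import subprocess

def deployment_simulation(config, timeout=600):
    """Run the configured deploy command; auto-pass if none is set."""
    cmd = config.get("deploy_command")
    if not cmd:
        return {"passed": True, "reason": "no deploy command configured"}
    proc = subprocess.run(cmd, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    # Non-zero exit fails the stage; stderr tail becomes the failure context.
    return {"passed": proc.returncode == 0, "reason": proc.stderr[-2000:]}

print(deployment_simulation({}))
# → {'passed': True, 'reason': 'no deploy command configured'}
```

Isolation of the environment (scratch checkout, throwaway database, etc.) is the pipeline's concern; this sketch only shows the pass/auto-pass decision.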
14 Stage 14 — Pull Request
Creates a pull request per finding with a structured description: impact analysis summary, PRD excerpt, quality gate results, semantic verification score, and retry history. Everything the reviewer needs in one place.
Why this architecture matters
Stages, retries, and external validation.
Staged, not monolithic
Most AI coding agents run a single prompt-to-code pass. This pipeline breaks the work into 14 stages with defined inputs, outputs, and failure modes. Each stage can be independently inspected, retried, or replaced. You get artifacts at every step, not just at the end.
Retry, not abandon
When a quality gate fails or a benchmark regresses, the pipeline feeds the failure context back into the implementation stage and retries. Up to 3 times. Handles transient failures and flaky tests without giving up. Partial progress is never lost.
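The retry loop described here can be sketched as follows — illustrative names only, since the plugin's internals are not published:

```python
def retry_stage(stage, inputs, max_attempts=3):
    """Re-run a failing stage, handing it the previous failure as context."""
    failure_context = None
    for attempt in range(1, max_attempts + 1):
        result = stage(inputs, failure_context)
        if result["passed"]:
            return result                       # partial progress preserved
        failure_context = result["failure"]     # next attempt sees what broke
    return result                               # exhausted: surface last failure

# Toy stage that fails once, then succeeds once it sees the failure context.
def flaky(inputs, context):
    if context is None:
        return {"passed": False, "failure": "test_foo timed out"}
    return {"passed": True, "artifacts": {"fix": "raised timeout"}}

print(retry_stage(flaky, {})["passed"])  # → True
```

The key difference from a plain re-run is the `failure_context` argument: the second attempt is not the same attempt repeated, it is a new attempt informed by why the first one failed.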
DeepMind validation: 7 of 7
Google DeepMind's "Intelligent AI Delegation" paper (February 2026) identifies 7 requirements for AI agents that survive real use. This pipeline implements all 7 — task decomposition, defined roles, verification, monitoring, accountability, coordination, and resilience. Built months before the paper was published. Coincidence, but a useful signal that the architecture is pointed in the right direction.
| DeepMind Requirement | Most AI Agents | AI Architect Pipeline |
|---|---|---|
| Task Decomposition | Prompt → Output | 14 sequential stages with dependency awareness |
| Defined Roles | Monolithic single-model | Specialized stages: analyzer, designer, implementer, verifier |
| Verification | None | Quality gates, semantic verification, benchmarking |
| Monitoring | Fire and forget | Per-stage artifacts and quality scores |
| Accountability | Unattributable failures | Traced claims, audit reports, requirement traceability |
| Coordination | Static templates | Dynamic strategy selection, context-aware retries |
| Resilience | Single point of failure | Stage-level retry, graceful degradation, partial recovery |
This is not CI/CD
CI/CD runs after code is written. This pipeline writes the code.
CI/CD triggers when a developer pushes code: build, test, deploy. It assumes a human wrote the code. This pipeline operates upstream — it takes a finding and produces the code, tests, and pull request that your CI/CD system then validates. It does not replace your build system. It feeds it.
| Aspect | Traditional CI/CD | AI Architect Pipeline |
|---|---|---|
| Trigger | Code push by developer | Finding: bug, feature, advisory |
| Input | Source code | Unstructured requirements |
| Output | Build artifact, deploy | Pull request with code, tests, docs |
| Code authorship | Human developer | Autonomous AI agent |
| Verification | Test suite only | Tests + semantic verification + benchmarks |
| Retry strategy | Re-run same build | Re-implement with failure context |
| Scope | Build → Deploy | Finding → Pull Request |
It generates the commits. Your CI/CD validates and deploys them.
Works with your stack
Adapts to your project through configuration, not code changes.
The pipeline reads your project's configuration files, test framework, build system, and directory structure. It runs your test suite natively and respects your linter settings. Django backend, React frontend, Go microservice, Rust CLI, Swift iOS app — same pipeline, different configuration.