Open Source — Free Forever

From finding to pull request.
14 stages. One command.

A bug report goes in. A pull request comes out. 14 stages in between, zero prompts. Does it always produce a merge-ready PR? No. But it runs end to end, and it shows where autonomous coding agents actually break down.

> /plugin marketplace add cdeust/ai-architect-feedback-loop
> /plugin install ai-architect-feedback-loop

The Apple Intelligence stages require macOS 14+ on Apple Silicon (M1/M2/M3/M4); the remaining stages also run in CLI/Docker mode.

What you get

$ /run-pipeline

[01/14] Parse Findings ............. PASS   1m 12s
[02/14] Impact Analysis ............ PASS   2m 48s
[03/14] Integration Design ......... PASS   3m 19s
[04/14] Plan Deliberation .......... Apple Intelligence only
[05/14] PRD Generation ............. PASS  18m 34s
[06/14] PRD Review ................. PASS   4m 11s
[07/14] Implementation ............. PASS  27m 03s
[08/14] Drift Reconciliation ....... Apple Intelligence only
[09/14] Agreement .................. Apple Intelligence only
[10/14] Quality Gates .............. PASS   2m 46s
[11/14] Semantic Verification ...... PASS   5m 22s
[12/14] Benchmark .................. PASS   1m 15s
[13/14] Deployment Simulation ...... PASS   1m 41s
[14/14] Pull Request ............... PASS   1m 08s

Pipeline complete. PR #142 created.
12 files changed, 1,847 insertions(+), 203 deletions(-)

What this pipeline does

It takes a finding and tries to ship a pull request without asking you anything.

Give it a bug report, a feature request, a research paper, or any actionable input. The pipeline parses it, scores the impact, designs the integration, generates a verified PRD, implements the code, runs your test suite, verifies the implementation against the spec, benchmarks the result, and opens a pull request. One command: /run-pipeline.

It is not magic. The pipeline is 14 sequential stages, each with defined inputs, outputs, and failure modes. When a stage fails, it retries with the failure context — up to 3 times — before moving on. 67 deterministic rules enforce output quality. A semantic verifier reads the diff against the PRD without having written a single line of the code it is checking.
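
The retry mechanic can be sketched in a few lines. This is a rough illustration — `run_with_retries`, `StageResult`, and the stage signature are hypothetical names, not the plugin's actual internals:

```python
from dataclasses import dataclass

MAX_RETRIES = 3  # "up to 3 times": one initial attempt plus three retries

@dataclass
class StageResult:
    ok: bool
    error: str = ""

def run_with_retries(stage, context):
    """Run one pipeline stage, feeding each failure back into the next attempt."""
    failures = []
    for attempt in range(1 + MAX_RETRIES):
        result = stage(context, failures=failures)
        if result.ok:
            return result
        failures.append(result.error)  # retry WITH the failure context
    return StageResult(ok=False, error="; ".join(failures))

# Demo: a stage that only succeeds after it has seen two failure messages.
calls = {"n": 0}
def flaky_stage(context, failures):
    calls["n"] += 1
    return StageResult(ok=len(failures) >= 2, error=f"attempt {calls['n']} failed")

result = run_with_retries(flaky_stage, {})
```

Each failed attempt appends its error to the context passed to the next attempt, so the stage can adapt rather than blindly re-run.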

It works with Python, TypeScript, Go, Rust, Java, Kotlin, Swift, and more — through configuration, not code changes. It is a proof of concept: it demonstrates an architecture for autonomous development, and that architecture is there to be refined.

Get Started in 2 Commands

Install, run. The pipeline handles the rest.

1. Install the plugin

# In Claude Code:
> /plugin marketplace add cdeust/ai-architect-feedback-loop
> /plugin install ai-architect-feedback-loop

2. Start the pipeline

# In Claude Code:
> /run-pipeline

The pipeline discovers findings, analyzes impact, generates a PRD, implements code, verifies the result, and delivers a pull request.

14 stages, each with defined inputs and outputs

Each stage runs autonomously and produces artifacts you can inspect. When a stage fails, the pipeline retries with the failure context.

Discovery
01 Parse Findings
Analysis
02 Impact Analysis
03 Integration Design
04 Plan Deliberation
05 PRD Generation
06 PRD Review
Implementation
07 Implementation
08 Drift Reconciliation
09 Agreement
10 Quality Gates
11 Semantic Verification
Delivery
12 Benchmark
13 Deployment Simulation
14 Pull Request

Stage 1 — Parse Findings

Reads input — bug reports, feature requests, security advisories, research papers — filters by relevance category and score threshold, and produces a ranked list prioritized by multi-module impact.

Stage 2 — Impact Analysis

Computes a compound impact score across four dimensions: modules affected, propagation depth, contract impact, and test coverage delta. Findings must score above threshold and affect multiple modules to proceed.
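
For illustration only — the plugin's actual formula, weights, and thresholds are not documented here — a compound score over the four dimensions might be computed as a weighted sum of normalized values:

```python
def impact_score(modules_affected, propagation_depth, contract_impact, coverage_delta,
                 weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted compound score over the four dimensions, each normalized to [0, 1]."""
    dims = (
        min(modules_affected / 5, 1.0),   # saturates at five or more modules
        min(propagation_depth / 4, 1.0),  # how deep the change propagates
        contract_impact,                  # fraction of public contracts touched
        coverage_delta,                   # normalized test-coverage delta
    )
    return sum(w * d for w, d in zip(weights, dims))

def proceeds(score, modules_affected, threshold=0.5):
    """A finding proceeds only if it scores above threshold AND spans multiple modules."""
    return score > threshold and modules_affected >= 2

score = impact_score(modules_affected=4, propagation_depth=3,
                     contract_impact=0.5, coverage_delta=0.2)
```

The two-part gate in `proceeds` mirrors the rule above: a high score alone is not enough if the finding touches only one module.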

Stage 3 — Integration Design

Designs architectural modifications respecting the target product's patterns. Enforces design principles: parameterization, centralization, composability, backward compatibility. Validates that all referenced files exist.

Stage 4 — Plan Deliberation

On macOS with Apple Intelligence, deliberates on the integration plan using on-device foundation models. Adds a second perspective before committing to the design. Skipped in CLI/Docker mode.

Stage 5 — PRD Generation

Invokes the AI PRD Generator to produce 9 verified files: overview, requirements, technical spec, user stories, acceptance criteria, roadmap, JIRA tickets, test cases, and verification report. Enforces 64 hard output rules.

Stage 6 — PRD Review

Independently reviews the generated PRD for completeness, consistency, and actionability. Catches issues the generator missed — a second pair of eyes before implementation begins.

Stage 7 — Implementation

Creates a feature branch, implements code changes following the PRD and integration plan, builds the project, and runs tests. Enforces code quality, security, resilience, and testing rules.

Stage 8 — Drift Reconciliation

On macOS with Apple Intelligence, reconciles any drift between the PRD specification and the actual implementation. Catches cases where the code diverged from the plan. Skipped in CLI/Docker mode.

Stage 9 — Agreement

On macOS with Apple Intelligence, validates alignment between all pipeline artifacts — the PRD, implementation, and integration design. Skipped in CLI/Docker mode.

Stage 10 — Quality Gates

Runs deterministic checks — prohibited pattern detection, orphan file detection, build verification, the full test suite, and deployment verification. No AI involved — pure structural validation.
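
A minimal sketch of one such deterministic check, prohibited pattern detection. The patterns shown are invented examples, not the plugin's real rule set:

```python
import re

# Example prohibited patterns -- the real rule set is larger and configurable.
PROHIBITED = {
    "debug print": re.compile(r"\bprint\s*\(.*DEBUG"),
    "hardcoded secret": re.compile(r"(?i)(api_key|password)\s*=\s*['\"][^'\"]+['\"]"),
    "TODO left in diff": re.compile(r"\bTODO\b"),
}

def scan_diff(diff_text):
    """Return (rule name, line number) for every prohibited pattern in a diff's added lines."""
    violations = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+"):
            continue  # only check lines the change adds
        for name, pattern in PROHIBITED.items():
            if pattern.search(line):
                violations.append((name, lineno))
    return violations

diff = (
    "+api_key = 'abc123'\n"
    " unchanged line\n"
    "+# TODO: remove this\n"
)
violations = scan_diff(diff)
```

Because the check is pure string matching over the diff, it is fully reproducible — the same diff always yields the same verdict.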

Stage 11 — Semantic Verification

An independent verifier analyzes the git diff against the PRD. Checks alignment score (must be ≥ 0.7), cross-module consistency, anti-patterns, and solution genericity — flags hardcoded constants and non-extensible designs.
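
The gate logic reduces to a threshold plus blocking flags. A sketch, assuming a hypothetical report shape — the field names here are illustrative:

```python
MIN_ALIGNMENT = 0.7  # the diff-vs-PRD alignment score must clear this bar
BLOCKING_KINDS = {"anti-pattern", "hardcoded-constant", "non-extensible"}

def verification_gate(report):
    """Pass only if alignment clears the threshold and no blocking flag remains."""
    blocking = [f for f in report["flags"] if f["kind"] in BLOCKING_KINDS]
    return report["alignment"] >= MIN_ALIGNMENT and not blocking

ok = verification_gate({"alignment": 0.82, "flags": []})
blocked = verification_gate({"alignment": 0.91,
                             "flags": [{"kind": "hardcoded-constant"}]})
```

Note that a high alignment score does not rescue a flagged diff: a hardcoded constant blocks the stage even at 0.91 alignment.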

Stage 12 — Benchmark

Measures quality metrics and compares against baselines. Informational — does not block the pipeline. Tracks improvement over time.

Stage 13 — Deployment Simulation

Runs the configured deploy command in an isolated environment. Validates that migrations run, configuration applies, and the build artifact works. If no deploy command is configured, passes automatically.
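
The stage's contract can be sketched as: run the deploy command in a throwaway copy of the project, and auto-pass when none is configured. Function and parameter names are illustrative:

```python
import os
import shutil
import subprocess
import tempfile

def simulate_deploy(project_dir, deploy_cmd=None, timeout=600):
    """Run the configured deploy command against a throwaway copy of the project.

    Passes automatically when no deploy command is configured.
    """
    if not deploy_cmd:
        return True  # nothing configured: the stage auto-passes
    with tempfile.TemporaryDirectory() as sandbox:
        work = os.path.join(sandbox, "project")
        shutil.copytree(project_dir, work)  # isolation: never touch the real checkout
        proc = subprocess.run(deploy_cmd, shell=True, cwd=work,
                              capture_output=True, timeout=timeout)
        return proc.returncode == 0
```

Running in a temporary copy means a broken migration or destructive deploy script cannot damage the working tree.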

Stage 14 — Pull Request

Creates a pull request per finding with a structured description: impact analysis summary, PRD excerpt, quality gate results, semantic verification score, and retry history. Everything the reviewer needs in one place.

Why this architecture matters

Stages, retries, and external validation.

Staged, not monolithic

Most AI coding agents run a single prompt-to-code pass. This pipeline breaks the work into 14 stages with defined inputs, outputs, and failure modes. Each stage can be independently inspected, retried, or replaced. You get artifacts at every step, not just at the end.

Retry, not abandon

When a quality gate fails or a benchmark regresses, the pipeline feeds the failure context back into the implementation stage and retries. Up to 3 times. Handles transient failures and flaky tests without giving up. Partial progress is never lost.

DeepMind validation: 7 of 7

Google DeepMind's "Intelligent AI Delegation" paper (February 2026) identifies 7 requirements for AI agents that survive real use. This pipeline implements all 7 — task decomposition, defined roles, verification, monitoring, accountability, coordination, and resilience. Built months before the paper was published. Coincidence, but a useful signal that the architecture is pointed in the right direction.

DeepMind Requirement | Most AI Agents           | AI Architect Pipeline
---------------------+--------------------------+----------------------------------------------------------
Task Decomposition   | Prompt → Output          | 14 sequential stages with dependency awareness
Defined Roles        | Monolithic single-model  | Specialized stages: analyzer, designer, implementer, verifier
Verification         | None                     | Quality gates, semantic verification, benchmarking
Monitoring           | Fire and forget          | Per-stage artifacts and quality scores
Accountability       | Unattributable failures  | Traced claims, audit reports, requirement traceability
Coordination         | Static templates         | Dynamic strategy selection, context-aware retries
Resilience           | Single point of failure  | Stage-level retry, graceful degradation, partial recovery
Read the paper → arxiv.org/abs/2602.11865

This is not CI/CD

CI/CD runs after code is written. This pipeline writes the code.

CI/CD triggers when a developer pushes code: build, test, deploy. It assumes a human wrote the code. This pipeline operates upstream — it takes a finding and produces the code, tests, and pull request that your CI/CD system then validates. It does not replace your build system. It feeds it.

Aspect           | Traditional CI/CD        | AI Architect Pipeline
-----------------+--------------------------+--------------------------------------------
Trigger          | Code push by developer   | Finding: bug, feature, advisory
Input            | Source code              | Unstructured requirements
Output           | Build artifact, deploy   | Pull request with code, tests, docs
Code authorship  | Human developer          | Autonomous AI agent
Verification     | Test suite only          | Tests + semantic verification + benchmarks
Retry strategy   | Re-run same build        | Re-implement with failure context
Scope            | Build → Deploy           | Finding → Pull Request

It generates the commits. Your CI/CD validates and deploys them.

Works with your stack

Adapts to your project through configuration, not code changes.

Python
TypeScript
JavaScript
Go
Rust
Java
Kotlin
Swift
C / C++
Ruby
PHP
And more

The pipeline reads your project's configuration files, test framework, build system, and directory structure. It runs your test suite natively and respects your linter settings. Django backend, React frontend, Go microservice, Rust CLI, Swift iOS app — same pipeline, different configuration.
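
A project-level configuration might look something like this — the file name and every key shown are illustrative, not the plugin's actual schema:

```yaml
# .ai-architect.yml -- illustrative keys only
language: python
build: "pip install -e ."
test: "pytest -q"
lint: "ruff check ."
deploy: "docker compose up -d --build"   # optional; stage 13 auto-passes if omitted
impact:
  threshold: 0.5
  min_modules: 2
verification:
  min_alignment: 0.7
```

Swapping this file is the only change needed to point the same pipeline at a Go microservice or a Swift iOS app.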

Get Started on GitHub