From finding to pull request.
14 stages. One command.
A bug report goes in. A pull request comes out. 14 stages in between, zero prompts. Does it always produce a merge-ready PR? No. But it runs end to end, and it shows where autonomous coding agents actually break down.
> /plugin install ai-architect-feedback-loop
Requires macOS 14+ with Apple Silicon (M1/M2/M3/M4)
What you get
```
$ /run-pipeline

[01/14] Parse Findings ............. PASS  1m 12s
[02/14] Impact Analysis ............ PASS  2m 48s
[03/14] Integration Design ......... PASS  3m 19s
[04/14] Plan Deliberation .......... Apple Intelligence only
[05/14] PRD Generation ............. PASS  18m 34s
[06/14] PRD Review ................. PASS  4m 11s
[07/14] Implementation ............. PASS  27m 03s
[08/14] Drift Reconciliation ....... Apple Intelligence only
[09/14] Agreement .................. Apple Intelligence only
[10/14] Quality Gates .............. PASS  2m 46s
[11/14] Semantic Verification ...... PASS  5m 22s
[12/14] Benchmark .................. PASS  1m 15s
[13/14] Deployment Simulation ...... PASS  1m 41s
[14/14] Pull Request ............... PASS  1m 08s

Pipeline complete. PR #142 created.
12 files changed, 1,847 insertions(+), 203 deletions(-)
```
What this pipeline does
It takes a finding and tries to ship a pull request without asking you anything.
Give it a bug report, a feature request, a research paper, or any actionable input. The pipeline parses it, scores the impact, designs the integration, generates a verified PRD, implements the code, runs your test suite, verifies the implementation against the spec, benchmarks the result, and opens a pull request. One command: /run-pipeline.
It is not magic. The pipeline is 14 sequential stages, each with defined inputs, outputs, and failure modes. When a stage fails, it retries with the failure context — up to 3 times — before moving on. 67 deterministic rules enforce output quality. A semantic verifier reads the diff against the PRD without having written a single line of the code it is checking.
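The staged design can be sketched as a list of functions, each consuming the artifacts of earlier stages. This is a minimal illustration under assumed names — the plugin's internal API is not published:

```python
def parse_findings(artifacts):
    # Stage 1 (toy version): turn raw input into a scored finding list.
    return [{"title": artifacts["input"], "score": 0.9}]

def impact_analysis(artifacts):
    # Stage 2 (toy version): keep only findings above a score threshold.
    return [f for f in artifacts["parse_findings"] if f["score"] >= 0.7]

STAGES = [("parse_findings", parse_findings),
          ("impact_analysis", impact_analysis)]

def run_pipeline(raw_input):
    """Run the stages in order; every stage's output is an inspectable artifact."""
    artifacts = {"input": raw_input}
    for name, stage in STAGES:
        artifacts[name] = stage(artifacts)
    return artifacts

result = run_pipeline("Bug: crash on empty config")
```

Because every stage writes into the shared artifact map, a failed run still leaves everything produced up to that point available for inspection.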
It works with Python, TypeScript, Go, Rust, Java, Kotlin, Swift, and more — through configuration, not code changes. It is a proof of concept: it demonstrates an architecture for autonomous development, and that architecture is meant to be refined.
Get Started in 2 Commands
Install, run. The pipeline handles the rest.
Install the plugin
> /plugin marketplace add cdeust/ai-architect-feedback-loop
> /plugin install ai-architect-feedback-loop
Start the pipeline
> /run-pipeline
The pipeline discovers findings, analyzes impact, generates a PRD, implements code, verifies the result, and delivers a pull request.
14 stages, each with defined inputs and outputs
Each stage runs autonomously and produces artifacts you can inspect. When a stage fails, the pipeline retries with the failure context.
01 Stage 1 — Parse Findings
Reads input — bug reports, feature requests, security advisories, research papers — filters by relevance category and score threshold, and produces a ranked list prioritized by multi-module impact.
02 Stage 2 — Impact Analysis
Computes a compound impact score across four dimensions: modules affected, propagation depth, contract impact, and test coverage delta. Findings must score above threshold and affect multiple modules to proceed.
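One way to read "compound score across four dimensions" is a weighted combination. The weights, normalizers, and threshold below are invented for illustration, not the pipeline's actual values:

```python
def impact_score(modules_affected, propagation_depth, contract_impact, coverage_delta):
    """Hypothetical compound score; real weights are the pipeline's, not these."""
    return (0.4 * min(modules_affected / 5, 1.0)
            + 0.3 * min(propagation_depth / 3, 1.0)
            + 0.2 * contract_impact     # assumed already in 0..1
            + 0.1 * coverage_delta)     # assumed already in 0..1

def proceeds(finding, threshold=0.5):
    # A finding must clear the threshold AND touch more than one module.
    return (finding["modules_affected"] > 1
            and impact_score(**finding) >= threshold)

f = {"modules_affected": 3, "propagation_depth": 2,
     "contract_impact": 0.5, "coverage_delta": 0.2}
print(proceeds(f))  # → True
```

The multi-module requirement acts as a hard gate on top of the score, so a high-scoring but single-module finding still stops here.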
03 Stage 3 — Integration Design
Designs architectural modifications respecting the target product's patterns. Enforces design principles: parameterization, centralization, composability, backward compatibility. Validates that all referenced files exist.
04 Stage 4 — Plan Deliberation
On macOS with Apple Intelligence, deliberates on the integration plan using on-device foundation models. Adds a second perspective before committing to the design. Skipped in CLI/Docker mode.
05 Stage 5 — PRD Generation
Invokes the AI PRD Generator to produce 9 verified files: overview, requirements, technical spec, user stories, acceptance criteria, roadmap, JIRA tickets, test cases, and verification report. Enforces 64 hard output rules.
06 Stage 6 — PRD Review
An independent AI review of the generated PRD for completeness, consistency, and actionability. Catches issues the generator missed — a second pair of eyes before implementation begins.
07 Stage 7 — Implementation
Creates a feature branch, implements code changes following the PRD and integration plan, builds the project, and runs tests. Enforces code quality, security, resilience, and testing rules.
08 Stage 8 — Drift Reconciliation
On macOS with Apple Intelligence, reconciles any drift between the PRD specification and the actual implementation. Catches cases where the code diverged from the plan. Skipped in CLI/Docker mode.
09 Stage 9 — Agreement
On macOS with Apple Intelligence, validates alignment between all pipeline artifacts — the PRD, implementation, and integration design. Skipped in CLI/Docker mode.
10 Stage 10 — Quality Gates
Deterministic checks — prohibited pattern detection, orphan file detection, build verification, full test suite, and deployment verification. No AI involved — pure structural validation.
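A prohibited-pattern gate like the one above can be as simple as a regex scan over the diff. The patterns here are invented for illustration; the pipeline's own deterministic rules are its own:

```python
import re

# Illustrative prohibited patterns only — not the pipeline's rule set.
PROHIBITED = [
    (re.compile(r"\bprint\("), "debug print left in code"),
    (re.compile(r"TODO|FIXME"), "unresolved TODO/FIXME"),
]

def quality_gate(diff_text):
    """Deterministic gate: no AI involved, just pattern matching on the diff."""
    return [(msg, match.group(0))
            for pattern, msg in PROHIBITED
            for match in pattern.finditer(diff_text)]

print(quality_gate("fix: raise timeout  # TODO tune later"))
# → [('unresolved TODO/FIXME', 'TODO')]
```

Because the check is pure string matching, it is cheap, reproducible, and its verdict never depends on a model's mood.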
11 Stage 11 — Semantic Verification
An independent verifier analyzes the git diff against the PRD. Checks alignment score (must be ≥ 0.7), cross-module consistency, anti-patterns, and solution genericity — flags hardcoded constants and non-extensible designs.
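The pass/fail logic around the alignment score can be sketched as below. The 0.7 threshold comes from the text; the report field names are invented:

```python
def verify(report):
    """Collect the reasons the semantic verifier would block the pipeline."""
    failures = []
    if report["alignment_score"] < 0.7:   # diff must match the PRD
        failures.append(f"alignment {report['alignment_score']:.2f} < 0.70")
    if report["hardcoded_constants"]:     # genericity check
        failures.append("hardcoded constants: "
                        + ", ".join(report["hardcoded_constants"]))
    return failures

report = {"alignment_score": 0.82, "hardcoded_constants": []}
print(verify(report))  # → []
```

Returning a list of reasons rather than a bare boolean is what lets the retry loop feed specific failure context back into the implementation stage.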
12 Stage 12 — Benchmark
Measures quality metrics and compares against baselines. Informational — does not block the pipeline. Tracks improvement over time.
13 Stage 13 — Deployment Simulation
Runs the configured deploy command in an isolated environment. Validates that migrations run, configuration applies, and the build artifact works. If no deploy command is configured, passes automatically.
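"Passes automatically when no deploy command is configured" translates to a guard like this sketch, assuming the deploy command is a shell string run via `subprocess`:

```python
import subprocess

def deployment_simulation(config, timeout=600):
    """Run the configured deploy command; auto-pass if none is set."""
    cmd = config.get("deploy_command")
    if not cmd:
        return {"passed": True, "reason": "no deploy command configured"}
    proc = subprocess.run(cmd, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    # Non-zero exit fails the stage; stderr tail becomes the failure context.
    return {"passed": proc.returncode == 0, "reason": proc.stderr[-2000:]}

print(deployment_simulation({}))
# → {'passed': True, 'reason': 'no deploy command configured'}
```

Isolation of the environment (scratch checkout, throwaway database, etc.) is the pipeline's concern; this sketch only shows the pass/auto-pass decision.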
14 Stage 14 — Pull Request
Creates a pull request per finding with a structured description: impact analysis summary, PRD excerpt, quality gate results, semantic verification score, and retry history. Everything the reviewer needs in one place.
Why this architecture matters
Stages, retries, and external validation.
Staged, not monolithic
Most AI coding agents run a single prompt-to-code pass. This pipeline breaks the work into 14 stages with defined inputs, outputs, and failure modes. Each stage can be independently inspected, retried, or replaced. You get artifacts at every step, not just at the end.
Retry, not abandon
When a quality gate fails or a benchmark regresses, the pipeline feeds the failure context back into the implementation stage and retries. Up to 3 times. Handles transient failures and flaky tests without giving up. Partial progress is never lost.
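The retry loop described here can be sketched as follows — illustrative names only, since the plugin's internals are not published:

```python
def retry_stage(stage, inputs, max_attempts=3):
    """Re-run a failing stage, handing it the previous failure as context."""
    failure_context = None
    for attempt in range(1, max_attempts + 1):
        result = stage(inputs, failure_context)
        if result["passed"]:
            return result                       # partial progress preserved
        failure_context = result["failure"]     # next attempt sees what broke
    return result                               # exhausted: surface last failure

# Toy stage that fails once, then succeeds once it sees the failure context.
def flaky(inputs, context):
    if context is None:
        return {"passed": False, "failure": "test_foo timed out"}
    return {"passed": True, "artifacts": {"fix": "raised timeout"}}

print(retry_stage(flaky, {})["passed"])  # → True
```

The key difference from a plain re-run is the `failure_context` argument: the second attempt is not the same attempt repeated, it is a new attempt informed by why the first one failed.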
DeepMind validation: 7 of 7
Google DeepMind's "Intelligent AI Delegation" paper (February 2026) identifies 7 requirements for AI agents that survive real use. This pipeline implements all 7 — task decomposition, defined roles, verification, monitoring, accountability, coordination, and resilience. Built months before the paper was published. Coincidence, but a useful signal that the architecture is pointed in the right direction.
| DeepMind Requirement | Most AI Agents | AI Architect Pipeline |
|---|---|---|
| Task Decomposition | Prompt → Output | 14 sequential stages with dependency awareness |
| Defined Roles | Monolithic single-model | Specialized stages: analyzer, designer, implementer, verifier |
| Verification | None | Quality gates, semantic verification, benchmarking |
| Monitoring | Fire and forget | Per-stage artifacts and quality scores |
| Accountability | Unattributable failures | Traced claims, audit reports, requirement traceability |
| Coordination | Static templates | Dynamic strategy selection, context-aware retries |
| Resilience | Single point of failure | Stage-level retry, graceful degradation, partial recovery |
This is not CI/CD
CI/CD runs after code is written. This pipeline writes the code.
CI/CD triggers when a developer pushes code: build, test, deploy. It assumes a human wrote the code. This pipeline operates upstream — it takes a finding and produces the code, tests, and pull request that your CI/CD system then validates. It does not replace your build system. It feeds it.
| Aspect | Traditional CI/CD | AI Architect Pipeline |
|---|---|---|
| Trigger | Code push by developer | Finding: bug, feature, advisory |
| Input | Source code | Unstructured requirements |
| Output | Build artifact, deploy | Pull request with code, tests, docs |
| Code authorship | Human developer | Autonomous AI agent |
| Verification | Test suite only | Tests + semantic verification + benchmarks |
| Retry strategy | Re-run same build | Re-implement with failure context |
| Scope | Build → Deploy | Finding → Pull Request |
It generates the commits. Your CI/CD validates and deploys them.
Works with your stack
Adapts to your project through configuration, not code changes.
The pipeline reads your project's configuration files, test framework, build system, and directory structure. It runs your test suite natively and respects your linter settings. Django backend, React frontend, Go microservice, Rust CLI, Swift iOS app — same pipeline, different configuration.