Agents that say
“I don't know.”
97 reasoning patterns drawn from primary sources — Dijkstra, Curie, Pearl, Peirce, Feynman, Cochrane — plus 19 team-role specialists. Each agent cites its primary paper and documents the conditions under which it must refuse. A pre-commit hook blocks any constant lacking a source. Not a prompt library. A methodology with commit-time enforcement.
Requires Claude Code. The plugin's installer copies agents, skills, hooks, and tools into ~/.claude/.
The system enforcing its own standard
When you commit code, the hook runs tools/zetetic-checker.sh against the staged changes. Unsourced absolute claims always fail the gate; magic floats fail it under the strict profile, as in the run below.
$ git commit -m "tune retry backoff"
UNSOURCED (error) retry.py:1: # It always works
MAGIC_NUMBER (error) retry.py:2: DELAY = 2.741592
Profile: strict (staged mode)
Errors: 2 (blocking)
Warnings: 0 (informational — promoted to errors when profile=strict)
FAILED: 2 blocking violation(s).
BLOCKED: Zetetic violations in staged files.
Once each flagged line carries a # source: comment, a benchmark reference, or a measured-on note, re-run the commit and the gate passes.
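To make the gate concrete, here is a minimal Python sketch of the kind of check described above. It is not the code in tools/zetetic-checker.sh: the regular expressions, the exact marker spellings other than `# source:`, and the function name `find_violations` are all assumptions made for illustration.

```python
import re

# Assumed keyword list, taken from the capability table (always/never/obviously).
UNSOURCED_WORDS = re.compile(r"\b(always|never|obviously)\b", re.IGNORECASE)
# Assumed definition of a "magic float": any literal with a decimal point.
MAGIC_FLOAT = re.compile(r"(?<![\w.])\d+\.\d+(?![\w.])")
# Assumed marker spellings, modelled on the three remedies named above.
SOURCE_MARKER = re.compile(r"#\s*(source:|benchmark:|measured on)", re.IGNORECASE)


def find_violations(filename, lines):
    """Return (rule, location, text) tuples for lines that carry no sourcing marker."""
    violations = []
    for lineno, line in enumerate(lines, start=1):
        if SOURCE_MARKER.search(line):
            continue  # a sourced line is accepted as-is
        text = line.strip()
        if UNSOURCED_WORDS.search(line):
            violations.append(("UNSOURCED", f"{filename}:{lineno}", text))
        if MAGIC_FLOAT.search(line):
            violations.append(("MAGIC_NUMBER", f"{filename}:{lineno}", text))
    return violations


# The two lines from the blocked commit shown above.
staged = ["# It always works", "DELAY = 2.741592"]
for rule, location, text in find_violations("retry.py", staged):
    print(f"{rule} {location}: {text}")
```

In this sketch a line that carries any sourcing marker is skipped entirely, which is the simplest reading of "each flagged line carries a # source: comment"; the real checker may be stricter.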
What you actually get
A collection of reasoning procedures with citations — not personas pretending to be smart.
| Capability | What it gives you |
|---|---|
| 97 documented refusals | Each genius agent's body documents conditions under which it refuses (when to stop, what to cite, when to hand off). |
| 63 multi-step workflows | Type one slash command, get a sourced research brief / debugging trace / ADR. |
| Commit-time gates | UNSOURCED keywords (always/never/obviously) block at any profile. MAGIC_NUMBER floats and TODO_NO_REF warn at default, block under ZETETIC_PROFILE=strict. |
| 650+ problem-shape triggers | Natural-language problem descriptions map to reasoning methods automatically. |
Routed by problem shape, not field
Most AI agent libraries ship “pretend to be Einstein.” This ships Einstein's method — gedankenexperiment, operational definitions, equivalence-principle reasoning — with the citations, the canonical moves, the documented blind spots, and the conditions under which the agent must refuse.
Measurement & Signal
“The measurement exceeds what known parts predict.”
- Curie — residual carrier
- Ekman — anatomical anchoring
- Wu — testing the obvious
- Mandelbrot — fat-tail detection
Causal & Abductive
“Does X cause Y, or is it confounded?”
- Pearl — do-calculus, ladder of causation
- Peirce — abductive inference cycle
- Snow / Hill — epidemiology, 9 criteria
- Mill — methods of agreement / difference
Formal & Correctness
“Can we prove this correct?”
- Dijkstra — precondition / postcondition
- Lamport — happens-before, no global now
- Pāṇini — generative specifications
- Gödel — incompleteness limits
- Turing — computability before optimization
Failure & Resilience
“What happens when everything goes wrong?”
- Hamilton — priority-displaced scheduling
- Taleb — fragile / robust / antifragile
- Carnot — theoretical efficiency limits
- Boyd — OODA, fast transients
Decision & Bias
“Is this decision driven by bias?”
- Kahneman — System 1/2 debiasing
- Schön — reflection-in-action
- Roger Fisher — principled negotiation, BATNA
- Simon — bounded rationality, satisficing
Ethics & Justice
“Who benefits and who bears the cost?”
- Rawls — veil of ignorance, difference principle
- Arendt — thoughtlessness audit
- Le Guin — what will be lost, narrative frame
- Ostrom — commons governance, 8 design principles
Full routing table — 400+ triggers, pairings, composition chains — in agents/genius/INDEX.md.
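As a rough illustration of routing by problem shape, the sketch below scores a description against a few trigger keywords and returns the best-matching procedure. The keyword lists and the `route` function are hypothetical; the real triggers and pairings are the ones in agents/genius/INDEX.md, and the actual matching is likely richer than substring counting.

```python
# Hypothetical trigger keywords; the real table lives in agents/genius/INDEX.md.
TRIGGERS = {
    "Curie (residual carrier)": ("exceeds", "unexplained", "residual"),
    "Pearl (do-calculus)": ("cause", "confounded", "correlat"),
    "Dijkstra (precondition / postcondition)": ("prove", "correct", "invariant"),
}


def route(problem):
    """Return the procedure whose triggers match most often, or None if nothing matches."""
    text = problem.lower()
    scores = {
        name: sum(keyword in text for keyword in keywords)
        for name, keywords in TRIGGERS.items()
    }
    best = max(scores, key=scores.get)
    # Returning None mirrors the "I don't know" stance: no match, no guess.
    return best if scores[best] > 0 else None


print(route("The measurement exceeds what known parts predict."))
print(route("Does X cause Y, or is it confounded?"))
```

Returning `None` instead of a default procedure is a deliberate choice in this sketch, so that an unrecognized problem shape is surfaced instead of being forced into the nearest category.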
What you type → what happens
Each command is a multi-step pipeline that names the procedure used, surfaces blind spots in its output, and refuses to ship if a step fails.
/paper-vs-code-audit
Extracts every claim from a paper → finds corresponding code → flags mismatches → produces a traceability matrix.
/autoresearch-loop
Hypothesis → implement → commit → benchmark → keep / revert → iterate until diminishing returns.
/deep-research
Plans search → parallel researchers → synthesizes → verifies citations → writes cited brief + provenance sidecar.
/incident-investigation
Forensic timeline → three-timescale decomposition → common vs special cause → structural root cause → remediation.
/genius route
Routes a problem description (“p99 latency exceeds the sum of profiled components”) to the reasoning procedure that fits its shape.
/genius compose
Chains multiple reasoning procedures in sequence — a Curie residual → Pearl confounding check → Dijkstra correctness proof, all in one pipeline.
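The chaining idea, with a refusal stopping the pipeline, can be sketched in a few lines of Python. Everything here is illustrative: `Refusal`, `compose`, and the two placeholder steps are hypothetical names, and the steps only tag a list where real procedures would do the reasoning.

```python
class Refusal(Exception):
    """Raised by a step that cannot proceed under its documented conditions."""


def compose(steps, problem):
    """Run steps in order, passing each result on; stop at the first refusal."""
    trace = []
    result = problem
    for name, step in steps:
        try:
            result = step(result)
        except Refusal as reason:
            trace.append(f"{name}: refused ({reason})")
            return None, trace  # nothing ships once a step refuses
        trace.append(f"{name}: ok")
    return result, trace


# Placeholder steps (hypothetical): each appends a tag to the findings list.
def residual(findings):
    return findings + ["residual isolated"]


def confounding(findings):
    if "residual isolated" not in findings:
        raise Refusal("no residual to explain")
    return findings + ["confounders checked"]


result, trace = compose([("curie", residual), ("pearl", confounding)], [])
print(result)
print(trace)
```

Running the Pearl step without the Curie step first yields `None` and a trace that records the refusal, which is the behavior the pipeline description asks for.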
Pairs with Cortex memory
Each zetetic agent writes to its own topic in Cortex. Decisions auto-propagate. The next session inherits not just what was decided but who decided it.
Specialization
The engineer's debugging notes don't clutter the tester's recall. Each agent has its own memory namespace.
Coordination
Decisions auto-protect (Wegner 1987). When engineer decides “use Redis over Memcached,” every agent sees it next session.
Briefing
The SubagentStart hook extracts task keywords, queries prior work, fetches team decisions, and injects the results as a context prefix.
Sourcing chain
When an agent cites a paper, that citation gets indexed in Cortex. Future sessions retrieve the citation along with the claim.
Install
One command. The plugin's installer copies agents, skills, hooks, and tools into ~/.claude/.
Option A — Claude Code Marketplace (recommended)
> claude plugin install zetetic-team-subagents
After install, restart your Claude Code session. Agents become available via the Agent tool with subagent_type; skills become available via /<skill-name> slash commands.
Option B — Manual clone
$ cd zetetic-team-subagents
$ bash scripts/install.sh
For advanced configuration — selective agent install, custom hooks, ZETETIC_PROFILE tuning — see docs/INSTALL.md.
Configure the strict profile (optional)
Strict mode promotes MAGIC_NUMBER and TODO_NO_REF from warnings to blocking errors. Default mode keeps them informational. UNSOURCED always blocks.
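The promotion rule can be written down as a small severity table. This Python sketch assumes the profile is read from the ZETETIC_PROFILE environment variable and that an unset variable means a profile named "default"; that name and the helpers `severity` and `should_block` are assumptions, not the checker's actual interface.

```python
import os

# Rule groups as described above.
ALWAYS_BLOCKING = {"UNSOURCED"}
PROMOTED_UNDER_STRICT = {"MAGIC_NUMBER", "TODO_NO_REF"}


def severity(rule, profile):
    """Map a rule to "error" (blocking) or "warning" (informational)."""
    if rule in ALWAYS_BLOCKING:
        return "error"
    if rule in PROMOTED_UNDER_STRICT and profile == "strict":
        return "error"
    return "warning"


def should_block(rules, profile=None):
    """True if any rule hit is blocking under the given profile."""
    if profile is None:
        # Assumption: an unset variable falls back to a profile called "default".
        profile = os.environ.get("ZETETIC_PROFILE", "default")
    return any(severity(rule, profile) == "error" for rule in rules)


print(severity("MAGIC_NUMBER", "default"))
print(severity("MAGIC_NUMBER", "strict"))
print(should_block(["TODO_NO_REF"], "strict"))
```

Under these assumptions, a commit that only trips MAGIC_NUMBER or TODO_NO_REF goes through at the default profile and is blocked once ZETETIC_PROFILE=strict is set.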
Free & Open Source
MIT licensed. 241 tests. 116 agents. 63 skills. 16 hooks. Every commit is checked.