@usex/audit · driven by your Claude subscription

audit

Many narrow agents · deliberate disagreement · reachability as the gate

An 8-stage AI vulnerability-discovery agent. It finds real bugs by running many narrow agents in parallel — then a different model tries to disprove every finding, and only confirmed & reachable issues ship.

Why it's different

Single-pass scanners ship noise.
audit ships proof.

Six design decisions that separate real, reachable findings from the flood of plausible-but-wrong guesses.

🎯

Narrow agents, not one mega-prompt

One attack class per task, with the trust boundary spelled out. Focused hunters surface bugs a "find bugs here" prompt never will.

🥊

Deliberate disagreement

Validate runs on a different model than Hunt and is rewarded for rejections, so it filters out the noise that single-pass tools ship.

🔓

Reachability is the gate

A "buggy" sink no attacker input can reach is dropped. Only confirmed and reachable findings make the report.

🔁

It learns as it runs

A proven-reachable bug seeds new hunts for the same pattern elsewhere in the repo — siblings get found too.

🧾

Schema-validated & resumable

Every agent output is shape-checked against JSON Schema, every run is checkpointed in SQLite, and a cost ceiling aborts cleanly.
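As a rough illustration of shape-checking plus a cost ceiling, here is a minimal sketch. The `Finding` fields, the severity names, and the `BudgetGuard` class are all hypothetical, and the hand-written type guard only stands in for a real JSON Schema validator; none of this is taken from the tool's source.

```typescript
// Hypothetical finding shape; the tool's real schema is not shown here.
interface Finding {
  id: string;
  attackClass: string;
  file: string;
  severity: string;
}

// Assumed severity vocabulary, for illustration only.
const SEVERITIES = ["low", "medium", "high", "critical"];

// Hand-written type guard standing in for a JSON Schema validator.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.attackClass === "string" &&
    typeof v.file === "string" &&
    typeof v.severity === "string" &&
    SEVERITIES.indexOf(v.severity) !== -1
  );
}

// Tracks spend and refuses any charge that would cross the ceiling,
// so the caller can stop before starting more work.
class BudgetGuard {
  private spentUsd = 0;
  private readonly ceilingUsd: number;

  constructor(ceilingUsd: number) {
    this.ceilingUsd = ceilingUsd;
  }

  charge(costUsd: number): void {
    if (this.spentUsd + costUsd > this.ceilingUsd) {
      throw new Error("cost ceiling of " + this.ceilingUsd + " USD would be exceeded");
    }
    this.spentUsd += costUsd;
  }

  get spent(): number {
    return this.spentUsd;
  }
}
```

Rejecting the charge before it is recorded keeps the running total at or below the ceiling, which is one way an abort can stay clean.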

🔒

Subscription billing by default

Runs on the official Claude Agent SDK, so no API key is needed. The metered API key is scrubbed so a run can't be routed to metered billing.

The pipeline

8 stages, two loops,
one question that matters.

Recon → Hunt → Validate → Gapfill ↺ → Dedupe → Trace → Feedback ↺ → Report

1

Recon

Opus

Maps the repo + git history and emits narrowly-scoped Hunt tasks — one attack class, concrete files, explicit trust boundary.

2

Hunt

Sonnet

One attack class per agent, run in parallel. Compiles and runs real PoCs to prove the bug rather than guessing.

3

Validate

Opus

An adversarial re-read on a different model that tries to disprove each finding. The skeptic that kills false positives.

4

Gapfill

Sonnet loop

Re-queues under-covered subsystem × attack class cells so coverage expands where hunters drifted away.
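One way to picture the subsystem × attack class grid is as a count of completed hunts per cell. The sketch below assumes a hypothetical `HuntRecord` shape and a `minHunts` threshold; how the tool actually measures coverage is not described here.

```typescript
// Hypothetical record of one completed hunt.
interface HuntRecord {
  subsystem: string;
  attackClass: string;
}

interface Cell {
  subsystem: string;
  attackClass: string;
  hunts: number;
}

// Returns every grid cell that has seen fewer than minHunts hunts.
function underCoveredCells(
  subsystems: string[],
  attackClasses: string[],
  completed: HuntRecord[],
  minHunts: number
): Cell[] {
  const counts = new Map<string, number>();
  for (const h of completed) {
    const key = h.subsystem + "|" + h.attackClass;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  const gaps: Cell[] = [];
  for (const subsystem of subsystems) {
    for (const attackClass of attackClasses) {
      const hunts = counts.get(subsystem + "|" + attackClass) ?? 0;
      if (hunts < minHunts) {
        gaps.push({ subsystem, attackClass, hunts });
      }
    }
  }
  return gaps;
}
```

Each returned cell would become a candidate for a new, narrowly scoped hunt.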

5

Dedupe

Sonnet

Clusters findings strictly by root cause — many call sites of one buggy helper collapse into a single fixable issue.
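Root-cause clustering can be sketched as grouping by a single key. The example assumes each finding already carries a hypothetical `rootCause` string naming the buggy helper; in the tool itself that judgment is made by a model, so treat this only as an illustration of the collapse step.

```typescript
// Hypothetical shapes; rootCause is assumed to be filled in upstream.
interface RawFinding {
  id: string;
  rootCause: string; // e.g. the helper that contains the bug
  callSite: string;  // where that helper is reached from
}

interface Cluster {
  rootCause: string;
  findingIds: string[];
  callSites: string[];
}

// Collapses findings that share a root cause into one cluster each,
// preserving the order in which root causes were first seen.
function clusterByRootCause(findings: RawFinding[]): Cluster[] {
  const clusters = new Map<string, Cluster>();
  for (const f of findings) {
    let cluster = clusters.get(f.rootCause);
    if (!cluster) {
      cluster = { rootCause: f.rootCause, findingIds: [], callSites: [] };
      clusters.set(f.rootCause, cluster);
    }
    cluster.findingIds.push(f.id);
    cluster.callSites.push(f.callSite);
  }
  return Array.from(clusters.values());
}
```

Two call sites of one unsafe query helper end up as a single issue with two recorded call sites, rather than two separate reports.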

6

Trace

Opus · the gate

Proves attacker-controlled input actually reaches the sink from an external entry point. Unreachable = out of scope.
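The reachability question can be approximated as a path search from external entry points to the sink. The sketch below runs a breadth-first search over a hypothetical, hand-built call graph; it is a simplification, since a call path alone does not prove that attacker-controlled data survives along it, and it is not the tool's actual Trace logic.

```typescript
// Hypothetical call graph: each function maps to the functions it calls.
type CallGraph = Record<string, string[]>;

// Breadth-first search from the entry points. Returns the shortest
// entry-point-to-sink path, or null when the sink is unreachable.
function findTrace(
  graph: CallGraph,
  entryPoints: string[],
  sink: string
): string[] | null {
  const previous = new Map<string, string | null>();
  const queue: string[] = [];
  for (const entry of entryPoints) {
    if (!previous.has(entry)) {
      previous.set(entry, null);
      queue.push(entry);
    }
  }

  for (let head = 0; head < queue.length; head++) {
    const node = queue[head];
    if (node === sink) {
      // Walk the predecessor links back to the entry point.
      const path: string[] = [];
      let current: string | null = node;
      while (current !== null) {
        path.push(current);
        current = previous.get(current) ?? null;
      }
      return path.reverse();
    }
    for (const next of graph[node] ?? []) {
      if (!previous.has(next)) {
        previous.set(next, node);
        queue.push(next);
      }
    }
  }
  return null;
}
```

A `null` result corresponds to the "unreachable = out of scope" case: the sink may be buggy, but nothing external leads to it in this graph.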

7

Feedback

Sonnet loop

Turns each reachable trace into new hunts for structurally similar siblings elsewhere — the learning loop.

8

Report

Sonnet

A schema-validated, structured report — reachable-only, severity-consistent, with the entry-point→sink trace attached.

Real output

Pointed at a Flask app,
it found a chain nobody planted.

Two planted bugs in — seven confirmed & reachable findings out, including a zero-credential SQLi→RCE pivot the planted bugs never spelled out.

audit report --run-id demo --format md

Built for real workflows

From a one-off scan
to every pull request.

Diff / PR mode. --base/--since scope the scan to changed files + blast radius. A PR scan costs cents.
Baseline & delta. Fingerprint findings, suppress known ones, surface only what's NEW or FIXED.
SARIF + exit-code gating. --fail-on high for CI; trace ships as codeFlows to the GitHub Security tab.
Auto-fix (opt-in). audit fix writes a minimal patch + regression test in an isolated worktree; --open-pr opens a draft.
Code-grounded advice. audit advise reads the real sink and explains the fix for your code, inline in the report.
Triage viewer. --serve starts a local web UI to confirm or dismiss findings and export suppressions.
Bug-bounty / VDP triage. Reproduce an inbound submission, run it through the reviewer + gate, emit accept/reject/duplicate.
Live-target mode. Reproduce findings against a running deployment with real HTTP.
Cost observability. audit stats breaks spend down by stage/model and reports cost-per-finding.
Background runs. audit run -d detaches the pipeline; audit sessions lists what's alive.
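Baseline and delta, together with exit-code gating, can be sketched in a few lines. The fingerprint strings, the four severity levels, the "at or above the threshold" rule, and the exit code of 1 are all assumptions made for illustration; the tool's real fingerprinting and gating rules may differ.

```typescript
// Assumed severity scale, lowest to highest.
type Severity = "low" | "medium" | "high" | "critical";

interface ReportedFinding {
  fingerprint: string; // hypothetical stable identifier for a finding
  severity: Severity;
}

// Compares the current run against stored baseline fingerprints.
function diffAgainstBaseline(
  baseline: string[],
  current: ReportedFinding[]
): { added: ReportedFinding[]; fixed: string[] } {
  const known = new Set(baseline);
  const seen = new Set(current.map((f) => f.fingerprint));
  return {
    added: current.filter((f) => !known.has(f.fingerprint)), // NEW
    fixed: baseline.filter((fp) => !seen.has(fp)),            // FIXED
  };
}

const RANK: Record<Severity, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// Returns 1 when any new finding is at or above the threshold, else 0.
function exitCode(newFindings: ReportedFinding[], failOn: Severity): number {
  return newFindings.some((f) => RANK[f.severity] >= RANK[failOn]) ? 1 : 0;
}
```

Gating only on the new findings means a known, already-suppressed issue does not fail the build, while a fresh high-severity one does.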
Quickstart

Auditing in under a minute.

# 1 · install globally — requires Bun ≥ 1.3
bun add -g @usex/audit

# 2 · already logged in via `claude login`? done.
audit auth-check

# 3 · cd into the repo you want to audit
cd /path/to/target
audit run --run-id my-run

# 4 · read the report
audit report --run-id my-run --format md > report.md

1

Install

One global binary on your PATH, running on the Bun runtime.

2

Authenticate

Uses your Claude Pro / Max subscription — no API key. audit auth-check confirms it.

3

Run

Point it at any repo. State and artifacts land in the working directory; runs are resumable and budgeted.

4

Report

Export Markdown, JSON, or SARIF — every finding carries its reachability proof.

Find the bugs
that can actually be reached.

Open source · MIT · driven by your Claude subscription.