# Bug Audit Plan — rclone-jav (Python CLI + Brave Extension)

Customized from `D:\DEV\Project\Goal\bug-audit-template.md`. Tightened for this project: scope is chunked, "bug" is narrowed, reproduction recipe is required, independent verification is enforced via fresh-context agents with bounded contract context, intentional patterns are listed only when verified against current code or current doc.

All output artifacts (per-scope `bugs-*.md` files, `bugs-candidates-*.md` scratch, `audit-snapshot-.md`, and the final `verification.md`) live under `D:\DEV\Project\rclone-jav\`. Do NOT write audit output under `D:\DEV\Extensions\Production\rclone-jav\` (extension folder) or `D:\DEV\Project\Goal\` (template home).

---

## What counts as a bug (for THIS audit)

Include:
- **Wrong result** — code produces output that contradicts documented behavior, comment, or stated intent
- **Data loss / corruption** — cache.json, config.json, chrome.storage, or remote file content can become incorrect or lost
- **Crash / unhandled exception** — Python tracebacks, uncaught JS promise rejections that kill an operation
- **Silent failure** — operation appears to succeed but didn't (e.g. write claimed but file not changed)
- **Contract violation** — host RPC schema mismatch, manifest declaration mismatch, cache-version mismatch, fixture-driven expectation broken
- **Race condition with observable user-visible effect** — concurrent operations leading to one of the above

Exclude (out of scope for this audit — separate effort):
- Code style / formatting / linting
- Performance unless it causes timeout or hang
- Dead code / unused imports / unused variables
- Outdated comments (unless misleading enough to cause wrong-result)
- Security review (use `/security-review` instead)
- Documentation gaps (separate doc-debt pass)
- Refactor opportunities ("could be cleaner")
- Missing features → file in `TODO.md`, not `bugs.md`

Phrase findings as "every function reviewed for externally observable bugs."
Internal helpers with no flow to RPC / UI / file system / network get reviewed only as part of their caller's flow, not as their own audit unit.

---

## Scope chunks (run each as separate audit pass)

Five chunks. Each gets its own `bugs-.md` file. Do NOT batch into one giant audit — context grows, hallucinations multiply.

| # | Chunk | Files in scope | Output |
|---|---|---|---|
| 1 | **Python CLI** | `rc-jav.py` + `rcjav/*.py` + `tests/*.py` + `fixtures/run.py` (all under `D:\DEV\Project\rclone-jav\`) | `bugs-python.md` |
| 2 | **Native host** | `host\rcjav-host.py` + `host\install-host.ps1` + `host\rcjav-host.bat` + `host\register-host.bat` (under `D:\DEV\Extensions\Production\rclone-jav\`) | `bugs-host.md` |
| 3 | **Extension SW + content** | `background.js` + `content.js` + `manifest.json` (under `D:\DEV\Extensions\Production\rclone-jav\`) | `bugs-extension-bg.md` |
| 4 | **Extension Options pages** | `src\options\*` (under `D:\DEV\Extensions\Production\rclone-jav\`) | `bugs-extension-options.md` |
| 5 | **Extension Popup + Bulk Check** | `src\popup\*` + `src\bulk-check\*` (under `D:\DEV\Extensions\Production\rclone-jav\`) | `bugs-extension-popup.md` |

Tabvault extension (`D:\DEV\Extensions\Production\tabvault\`) is **out of scope** for this audit — separate project.

### Explicit per-chunk excludes

Do NOT audit (read-only-if-needed-for-context, never report findings against):
- `**/__pycache__/` — bytecode
- `**/*.bak` — historical snapshots (e.g. `CLAUDE.md.bak`, `cache.json.bak`)
- `cache.json`, `config.json` — runtime data, not code (their schema is auditable in `docs/CACHE_CONTRACT.md`)
- `benchmarks/*.py` — performance probes, not product
- `mockups/*.html` — design memory, not code
- `wincatalog/` — user data dir
- `README.md`, `TODO.md`, `AGENTS.md`, `CLAUDE.md`, `docs/*.md` — docs (separate doc-debt pass)
- `host/logs/*` — runtime logs
- `host/state/*` — runtime state
- `host/com.rcjav.host.json`, `host/allowed-extension-ids.json` — generated/runtime config
- Per-project memory under `C:\Users\admin\.claude\projects\D--DEV-Project-rclone-jav\memory\` — READ for rules, do NOT audit

---

## Required reading before audit

Auditor MUST read (and reference findings against) the following intentional-pattern docs:
- `D:\DEV\Project\rclone-jav\AGENTS.md` — Python CLI session memory, ID normalization rules, defaults
- `D:\DEV\Project\rclone-jav\CLAUDE.md` (if present)
- `D:\DEV\Project\rclone-jav\TODO.md` — deferred work that's NOT a bug
- `D:\DEV\Extensions\Production\rclone-jav\docs\CACHE_CONTRACT.md` — cache schema + ID rules versioning
- `D:\DEV\Extensions\Production\rclone-jav\AGENTS.md` — extension session memory
- `D:\DEV\Extensions\Production\rclone-jav\CLAUDE.md` (if present)
- `D:\DEV\Extensions\Production\rclone-jav\mockups\console-consolidation-claude.html` — design rationale
- `C:\Users\admin\.claude\projects\D--DEV-Project-rclone-jav\memory\*.md` — per-project memory (version bump rule, install workflow, no hollow suggestions)

If a finding contradicts an explicit decision in these docs, it's NOT a bug — it's expected behavior. Mark as `discarded — intentional per ` in the False Positives section.

---

## Known intentional patterns (verified against current code or current doc)

Only patterns confirmed against the current snapshot belong here.
If a pattern is suspected but unverified, leave it OFF this list — the auditor will surface it, the verifier will check the cited doc, and discard-as-intentional happens there. **Stale assumptions on this list are dangerous** — they actively shield real bugs in code that's been touched.

### Python CLI (verified)

- `extract_id()` chops trailing single letters from filenames intentionally (e.g. `IBW-902z` → `IBW-902`) — see `D:\DEV\Project\rclone-jav\AGENTS.md` "ID normalization"
- JAV IDs canonicalized to at least 3 digits but keep wider widths (`ABC-027`, `ABCDE-1167`) — not a "leading zero" bug
- `.ts` ranks lowest among video containers in dupe keep ranking — `AGENTS.md` "Defaults from earlier sessions"
- VIP folders (`ClearJAV` default) win first in dupe keep ranking — same
- Cache loading falls back to empty cache when malformed top-level — intentional resilience, `AGENTS.md` "Recent decisions"
- Scan is always recursive — old `--recursive/-R` flag was removed intentionally
- `extract_json_blob` tolerates leading status lines + trailing noise — intentional for `--basic` output parsing

### Native host (verified)

- stderr capture lives INSIDE `rcjav-host.py` via `os.dup2` (not in `rcjav-host.bat` via `2>>`) — the bat NOT redirecting stderr is the fix, not a missing-redirect bug. See comments at top of `rcjav-host.bat`.
- `__port_disconnect__` is a synthetic action name for the rolling RPC log marker — not an actual RPC handler
- `_shrink_response` called twice (once in main loop, once inside `write_message`) — defense-in-depth, intentional
- `client_req_id` is `None` for RPCs originating from rclone-jav extension (only tabvault stamps it)
- Discord webhook rate-limit uses `last-alert-ts.json` shared across host process spawns — intentional anti-storm
- Host spawns fresh per `connectNative` call from each extension — intentional Chromium behavior, not a "leak"

### Extension (verified against current files)

- `chrome.runtime.lastError` voided after several Chrome API calls — silences MV3 warning, intentional
- Native messaging 90s timeout in `nativeCall` — long enough for `--quick` on a slow remote
- `web_accessible_resources` for `src/options/options.html` and `src/bulk-check/bulk-check.html` ONLY (NOT `popup.html`) — explicit per `mockups/console-consolidation-claude.html`; popup is browser-action UI, doesn't need WAR
- Library Issues report-only kinds (`resolution_*`, `quality_marker_not_resolution`, `missing_resolution`, etc.) — user-chosen per session; not a "missing fix path" bug. Auto-rename only valid for `bracket_id` and `nohyphen_id`.
- `No ID` chip removed from sidebar; `no_id` outcomes not logged to recent activity — intentional
- Default landing pane = `dupe-review` — per mockup
- Setup pane lives in SUPPORT sidebar group — current intentional placement after earlier orphaning/restoration
- `pcLabel` empty string default — intentional, user opt-in
- 10-minute Discord webhook rate-limit — intentional anti-spam
- `mkv` / `mp4` / `wmv` / `avi` format-preference defaults — intentional KEEP-ranking order
- Default `cacheStaleHours` = 24 — display only, doesn't change search results
- `_rcjavSwInstanceId` is a fresh UUID per SW startup — used to detect SW eviction mid-call, intentional design

### Not on this list — let auditor surface (do NOT shield)

- `DEFAULT_TARGET` / `DEFAULT_SOURCE` hardcoded fallback values in `rcjav/cli.py` — these have been a regression source. Auditor checks current values vs `config.json` defaults vs `AGENTS.md` documented current state.
- `CONFIG_PATH` / `CACHE_PATH` / `CANCEL_FLAG` / `DEFAULT_CATALOG` path resolutions in `rcjav/` package — `.parent` vs `.parents[1]` has been a bug. Verify each against current package layout.
- Any other path-resolution code that uses `__file__` — same class of risk

---

## Snapshot preflight (MANDATORY — Phase 1 cannot start without it)

Before any audit chunk runs, capture `D:\DEV\Project\rclone-jav\audit-snapshot-.md` with:

```markdown
# Audit Snapshot —

## CLI repo (D:\DEV\Project\rclone-jav)
- git rev-parse HEAD:
- git status --short:

## Extension repo (D:\DEV\Extensions\Production\rclone-jav)
- git rev-parse HEAD:
- git status --short:

## Versions
- Extension manifest.json version:
- Python:
- Node:
- Brave:

## Dirty-state policy
This audit accepts dirty working trees (option b). All file:line citations reference the snapshot AS-IS at this timestamp. No file edits during Phase 1 except audit docs (allowed-write list below).
```

Every `bugs-*.md` file MUST cite this snapshot ID in its header.
If files change during audit, restart from a new snapshot.

---

## Phase 1 allowed-write list (explicit)

During Phase 1 (audit), the ONLY files that may be created or modified are:
- `D:\DEV\Project\rclone-jav\audit-snapshot-.md`
- `D:\DEV\Project\rclone-jav\bugs-candidates-.md`
- `D:\DEV\Project\rclone-jav\bugs-.md`

Any other write = audit violation. Restart the chunk from snapshot.

---

## bugs-candidates-.md format (Phase 1 scratch)

This is the auditor's scratch space. Hedge language permitted here (and ONLY here). Theories, speculation, "this looks wrong" go in candidates first.

```markdown
# Candidate Findings —

## Candidate C-1
- File:
- Hunch:
- Trace:
- Question for verifier:
- Contract refs needed:

## Candidate C-2
...
```

Only CONFIRMED or PARTIAL candidates from verifier get promoted into `bugs-.md`. REFUTED or NEEDS-INFO stay in candidates with verifier's response appended.

After Phase 1 chunk completes: `bugs-candidates-.md` stays beside `bugs-.md`. Optional archive under `D:\DEV\Project\rclone-jav\audits\\` — operator choice, not enforced.

---

## bugs-.md format (confirmed only)

```markdown
# Bug Report —

Snapshot: audit-snapshot-.md
Required-reading docs read: [Y for each in list above]
Auditor agent:

---

## Severe (S)
Definition: data loss, crash, silent wrong result, contract violation that breaks user workflow.

### S-1
- **File:** `:` (single line OR `:-` range)
- **Symptom (one sentence):** what the user / caller observes
- **Why it's a bug:** concrete reason citing the contract / doc / comment it violates. NO hedge language: "could", "might", "potentially", "in theory", "may cause", "possibly" — if you can't trace it concretely, demote to N or discard.
- **Reproduction:**
  1. Input or state: ``
  2. Expected: ``
  3. Actual: ``
- **Suggested fix sketch (optional, one-liner):** NOT to be implemented in audit phase
- **Verifier agent:** ``
- **Verifier verdict:** CONFIRMED / PARTIAL (with revised repro)
- **Verifier confidence:** high / medium / low — low requires re-verification with different agent
- **Contract refs verifier read:** ``
- **Mirror check needed in:** ``
- **Status:** open

---

## Moderate (M)
Definition: degraded but observable behavior, recoverable error path missing, edge case mishandled.

---

## Light (L)
Definition: misleading log / error message, dev-only annoyance, minor input-validation gap.

---

## Needs Input (N)
Definition: looks suspicious but requires user / spec clarification before classifying.

### N-1
- **File:** ...
- **Question:** what specifically needs clarification
- **Why blocked:** what doc would resolve it but doesn't exist or is ambiguous
- **Status:** needs-input

---

## False Positives (discarded)
- `:` — initially flagged as ``; discarded because ``
```

---

## Cross-chunk mirror check (narrowly scoped)

Mirror check fires ONLY when a confirmed bug crosses a contract boundary. Contract boundaries:

- **Cache schema** (`docs/CACHE_CONTRACT.md`)
- **Host RPC payload/response shape**
- **Settings schema** (chrome.storage.sync.settings ↔ host alerts-config.json)
- **ID normalization rules** shared between extension's `id-extract.js` and host's `host_normalize_id` and Python's `rcjav/ids.py`
- **Fixture corpus expectations** (Python + Node consumers in `fixtures/`)

When a bug entry hits one of those, add:

```
Mirror check needed in:
```

Default (no contract boundary touched) = no mirror check. Avoids spawning vague secondary audits.

Final verification (Phase 3) scans every confirmed bug for `Mirror check needed in:` and runs the requested check.
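
The Phase 3 scan for `Mirror check needed in:` fields can be automated. The following is a minimal sketch, not part of the audit tooling: the function names `find_mirror_checks` and `scan_audit_dir` are hypothetical, and it assumes bug entries start with a `### S-1` style heading as in the format above and that an empty or backtick-only field means "no mirror check".

```python
import re
from pathlib import Path

# Matches the field in both forms used in this plan:
#   - **Mirror check needed in:** `bugs-host.md`
#   Mirror check needed in: bugs-host.md
MIRROR_RE = re.compile(r"Mirror check needed in:\**\s*(.*)")
# Assumption: each bug entry begins with a heading like "### S-1".
BUG_ID_RE = re.compile(r"^###\s+([SMLN]-\d+)\b")


def find_mirror_checks(text: str) -> list[tuple[str, str]]:
    """Return (bug_id, target) pairs for every non-empty mirror-check field."""
    results = []
    current_id = None
    for line in text.splitlines():
        heading = BUG_ID_RE.match(line.strip())
        if heading:
            current_id = heading.group(1)
            continue
        field = MIRROR_RE.search(line)
        if field and current_id is not None:
            target = field.group(1).strip().strip("`").strip()
            if target:  # empty field = no contract boundary touched
                results.append((current_id, target))
    return results


def scan_audit_dir(root: Path) -> dict[str, list[tuple[str, str]]]:
    """Scan confirmed-bug files (not candidate scratch) under root."""
    pending = {}
    for path in sorted(root.glob("bugs-*.md")):
        if path.name.startswith("bugs-candidates-"):
            continue
        found = find_mirror_checks(path.read_text(encoding="utf-8"))
        if found:
            pending[path.name] = found
    return pending
```

Calling `scan_audit_dir(Path(r"D:\DEV\Project\rclone-jav"))` would give the coordinator a per-file list of mirror checks still to run; resolving each one remains a manual or agent-driven step.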

---

## PHASE 1 — AUDIT

### Per-chunk goal

```
/goal bugs-.md exists in D:\DEV\Project\rclone-jav\, cites audit-snapshot-.md, contains every file in scope chunk reviewed for externally observable bugs, each bug has exact file:line citation, each bug has reproduction recipe (input/expected/actual), each bug verified by a fresh-context independent agent reading only cited contract docs, intentional patterns from "Known intentional patterns" list NOT flagged, no hedge language in confirmed bugs, bugs ranked S/M/L/N, mirror check noted where contract boundary touched, zero code changes made
```

Run the goal **once per chunk** (5 runs total). Do not batch.

### Verifier protocol

For each candidate promoted from `bugs-candidates-.md`, spawn a NEW agent (fresh context, no audit-history visibility) with this exact framing:

```
Read : and the surrounding function ONLY.
The claim is: .
The supposed reproduction is: input , expected , actual .
Contract refs to read before judging: .
Reply with one of:
  CONFIRMED — bug is real, repro matches
  PARTIAL — symptom real, repro doesn't match exactly, suggest revised repro
  REFUTED — code does not ; here's the trace
  NEEDS-INFO — can't verify without
```

Verifier MUST NOT see:
- Auditor's reasoning beyond the symptom/repro claim
- Other candidates in this chunk
- Other confirmed bugs in this or any other chunk
- Audit-internal memory or chat history

Otherwise it's a context-correlated rubber stamp, not independent verification.

### Stop conditions per chunk

Restart the chunk with tighter framing if:
- Verifier rejects > **30%** of confirmed-candidate attempts → "what counts as a bug" threshold is too loose
- Candidate count exceeds **50 in one chunk** → scope too broad, split it
- Auditor produces a finding flagged by an Intentional Pattern → re-read this doc

---

## PHASE 2 — FIX LOOP

One bug at a time, starting at S-1 of the highest-priority chunk, then M-1, then L-1. Skip N (needs-input) until user resolves.
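
One way to read the ordering above is as a sort key. The sketch below assumes severity is compared first, then chunk number (taking the chunk table order as the priority order, which this plan does not state explicitly), then the numeric index within the severity. `fix_queue` and `SEVERITY_RANK` are hypothetical names for illustration.

```python
import re

# N (needs-input) is deliberately absent: those entries are skipped.
SEVERITY_RANK = {"S": 0, "M": 1, "L": 2}


def fix_queue(bugs):
    """Order (chunk_number, bug_id) pairs for the serial fix loop.

    Assumed ordering: severity, then chunk number, then index.
    """
    keyed = []
    for chunk, bug_id in bugs:
        match = re.fullmatch(r"([SMLN])-(\d+)", bug_id)
        if match is None:
            raise ValueError(f"unrecognized bug id: {bug_id!r}")
        severity, index = match.group(1), int(match.group(2))
        if severity == "N":
            continue  # skipped until the user resolves the question
        keyed.append((SEVERITY_RANK[severity], chunk, index, bug_id))
    keyed.sort()
    return [(chunk, bug_id) for _, chunk, _, bug_id in keyed]
```

If the intended order is instead "finish one chunk before starting the next", swapping the first two tuple elements in the key gives that behavior.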

### Per-bug goal

```
/goal in is marked "fixed", the fix is applied at the cited file:line, the bug's reproduction recipe now returns Expected not Actual, no other bugs.md entries were changed, no unrelated code was modified, any tests covering the affected code still pass (or new test added if none existed), version bump applied if extension files touched
```

Replace `` with the actual ID (e.g. `S-1`).

### Fix verification gate

Before marking `status: fixed`:

1. **Re-run the bug's reproduction recipe** — must now produce Expected, not Actual
2. **Per-file test re-run:** if `tests/` or `fixtures/` cover the affected file, re-run them, all must pass
3. **If no test existed for the now-fixed behavior:** write one, place under `tests/` or `fixtures/`
4. **If extension code changed:** bump `manifest.json` version (per `feedback_extension_version_bump.md` — one bump per user-requested update, visible reload-verification signal)
5. **Do NOT touch:** any other bug entry, any file marked DO NOT FIX in code comments, any intentional pattern listed above
6. Update the bug entry with `Status: fixed` and a `Fix:` line citing the new file:line of the change

### After completing all fixes in a chunk

Run the chunk's **full test suite**, not just per-file tests. Catches cross-bug interactions (e.g. fix for S-1 in `rcjav/cache.py` interacts with fix for M-2 in `rcjav/dupes.py`).

---

## PHASE 3 — FINAL VERIFICATION

```
/goal all bugs in bugs-*.md files under D:\DEV\Project\rclone-jav\ are marked "fixed", "skipped" (with reason), or "needs-input" (awaiting user); D:\DEV\Project\rclone-jav\verification.md exists confirming a final audit of every modified file finds no new bugs introduced by the fixes; verification.md lists each fixed BUG-ID + its commit/edit and the repro-now-passes proof; every "Mirror check needed in:" entry resolved (either no mirror bug found, or new bug filed in target chunk); manifest.json version is incremented appropriately
```

### verification.md format

```markdown
# Verification —

Original snapshot: audit-snapshot-.md
Final snapshot: audit-snapshot-.md

## Fix summary
- S-1 (bugs-python.md): fixed at . Repro now returns Expected (was Actual). Test added: .
- M-1 (bugs-extension-bg.md): fixed at . Existing test still passes.
- ...

## Mirror checks resolved
- S-3 mirror in bugs-host.md: scanned `handle_search` for same contract issue, NOT present.
- M-2 mirror in bugs-python.md: FOUND same issue → filed as M-7 in bugs-python.md, fixed at .

## Skipped
- L-3 (bugs-host.md): skipped — `` (e.g. user decision, deferred to next audit)

## Needs input
- N-1 (bugs-extension-options.md): awaiting user clarification on

## Final pass
- Files modified during fix phase:
- Independent re-audit of those files: , , found 0 new bugs / found new bugs (back to PHASE 1)
- All `bugs-*.md` files: zero entries with status `open`
- Extension manifest.json: version (bumped per shipped change)
- All existing tests pass:
- Fixture corpus runs:
```

---

## ANTI-HALLUCINATION RULES (enforced — not optional)

1. **No bug without file:line** — line range only acceptable if symptom is genuinely multi-line
2. **No bug without reproduction recipe** with concrete input / expected / actual
3. **Verifier MUST be fresh-context** — same agent re-reading the claim is not independent
4. **Verifier reads only cited contract docs**, not the whole project memory pile — bounded context preserves independence
5. **One bug per fix session** — no batch fixes even for "obviously similar" findings
6. **DO NOT FIX banners + intentional patterns are untouchable** — listed in this doc + AGENTS.md / mockups
7. **Severity is criteria-based, not vibes-based** — Severe = data loss/crash/silent-wrong; Moderate = degraded observable; Light = misleading message / minor
8. **Forbidden hedge language in confirmed bugs:** "could be", "might", "potentially", "in theory", "may cause", "possibly". If you can't trace it concretely, demote to Needs Input or candidate scratch.
9. **No speculative race conditions** — race must have observable user-visible repro recipe, not just "concurrent code path exists"
10. **Reference contracts, not preferences** — bugs cite what code SHOULD do per a doc/comment/test, not what auditor thinks would be nicer
11. **No bug for missing feature** — that's a TODO, goes in `TODO.md` not `bugs.md`
12. **Phase 1 is read-only except audit docs** — see allowed-write list above

---

## Final-pass readability checklist (run before any audit)

Before Phase 1 starts, re-read this doc and verify:

1. Every "intentional pattern" line has been verified against current code OR cites a current doc that exists right now
2. Any old memory/session claim that conflicts with current files has been removed or softened
3. Phase 1 allowed-write list is explicit and current
4. Candidates clearly separated from confirmed bugs (different files, different formats)
5. Verifier prompt includes `contract_refs:` and does NOT include auditor reasoning
6. Stop conditions are present (30% rejection, 50 candidates)
7. Mirror check scope is narrowly defined (contract boundaries only)
8. Excluded paths are current (no missing dirs, no dead refs)

If any check fails, fix this doc before starting audit.
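
Rule 8 of the anti-hallucination rules (forbidden hedge language) is mechanical enough to check with a script before a finding is promoted. This is a minimal sketch under stated assumptions: `find_hedges` is a hypothetical helper, the phrase list is copied from rule 8, and matching is case-insensitive on whole words.

```python
import re

# Phrase list taken from rule 8 of this plan.
HEDGES = ["could be", "might", "potentially", "in theory", "may cause", "possibly"]
HEDGE_RE = re.compile(
    r"\b(" + "|".join(re.escape(h) for h in HEDGES) + r")\b",
    re.IGNORECASE,
)


def find_hedges(text: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) for each forbidden hedge phrase found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HEDGE_RE.finditer(line):
            hits.append((lineno, match.group(1).lower()))
    return hits
```

Run it on individual bug entries rather than the whole file: the `bugs-.md` template quotes the forbidden phrases in its own "Why it's a bug" instructions, which would otherwise show up as hits.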

---

## NOTES

- Run audit goals from the CLI project root: `cd D:\DEV\Project\rclone-jav && claude` — even when auditing extension files, output stays in this folder
- Extension folder and CLI folder are separate git repos — verify with `git status` in each before audit so you're auditing a known snapshot
- Per-project memory at `C:\Users\admin\.claude\projects\D--DEV-Project-rclone-jav\memory\` carries feedback rules — read those at audit start, they override default audit behavior
- The extension repo currently has uncommitted modifications (hybrid state from codex's roadmap work + later edits). Snapshot captures this state; option (b) accepts dirty + records what was dirty. No auto-stash.

---

## Appendix — Recommended agent topology (Claude Code / multi-agent runners)

This appendix is OPTIONAL — the plan above is portable to any `/goal`-style runner. If you're running it in Claude Code or a similar multi-agent tool, this section describes how to map the independence + parallelism requirements onto explicit agent calls. Operators using a different runner can ignore this appendix without losing the plan's structure.

### Role map

**Main Coordinator** (the session you start the audit from)
- Owns the snapshot file (`audit-snapshot-.md`)
- Launches Chunk Auditor agents (parallel allowed)
- Collects produced `bugs-candidates-.md` files
- Launches Verifier agents per candidate (or small batch)
- Promotes CONFIRMED / PARTIAL findings into `bugs-.md`
- Drives Phase 2 fix loop one bug at a time
- Launches Final Re-Audit agents in Phase 3
- The only role with write access to multiple files

**Chunk Auditor Agents** (one per scope chunk)
- Canonical agent type: `Explore` (read-only, fast)
- Parallel allowed once snapshot is written
- Inputs: chunk file list, snapshot ID, required-reading docs, this plan's "Known intentional patterns" + "Not on this list — let auditor surface" sections
- Output: `bugs-candidates-.md` ONLY (no confirmed-bug writes; coordinator promotes)
- Must cite file:line + candidate repro; hedge language permitted in candidates
- **Must NOT:** edit product code, edit another chunk's candidate file, write to confirmed bug files

**Verifier Agents** (fresh context per candidate, or small candidate batch from same file)
- Canonical agent type: `Explore` (read-only, blind)
- Fresh context — NO prior audit-history visibility
- Inputs (and ONLY these):
  - `file:line` of the claim
  - Symptom (one sentence)
  - Reproduction recipe
  - `contract_refs:` list (max 3 docs)
- **Must NOT see:** auditor reasoning, the candidate file as a whole, other candidates, other chunks' findings, this plan's hedge-language rules (verifier only verifies the specific claim)
- Output: one of `CONFIRMED` / `PARTIAL` (with revised repro) / `REFUTED` (with code trace) / `NEEDS-INFO` (with what's missing)

**Fix Phase Agent** (Phase 2)
- Canonical agent type: main coordinator context OR a single write-capable `general-purpose` agent
- Serial — one bug at a time
- No parallel fixes even for "obviously similar" bugs
- Inputs: the one bug entry being fixed, full file context, project memory
- Outputs: code edits, bug entry status update, test additions if needed
- Re-runs the bug's repro recipe and per-file tests before marking fixed

**Final Re-Audit Agents** (Phase 3)
- Canonical agent type: `Explore` (read-only)
- One per modified-file group or per chunk that had fixes
- Inputs: list of files modified during Phase 2, this plan
- Output: confirmation of no new bugs introduced, OR new bug entries if found (which loop back to Phase 1)

### File-ownership rules (prevent merge collisions)

- Each Chunk Auditor owns ONLY its own `bugs-candidates-.md`
- Each Verifier writes nothing to disk — returns a structured response to the coordinator
- Coordinator owns `bugs-.md`, `audit-snapshot-.md`, and `verification.md`
- Fix Phase Agent owns the code files being edited + the bug entry being marked fixed
- No two agents share write access to the same file at any time

### Parallelism rules

- **Phase 1:** chunks may be audited in parallel ONLY after the snapshot is written. Parallel auditors must not edit product code or each other's output files. Coordinator dispatches all 5 chunk Agent calls in a single message for max throughput.
- **Verifier dispatch:** within a chunk, verifiers for distinct candidates may run in parallel. Verifiers for candidates that cite the SAME file must run sequentially (avoids verifier-context cross-contamination if a verifier loads file context that affects another).
- **Phase 2:** strictly serial. One bug per Agent call. No parallelism.
- **Phase 3:** re-audit agents may run in parallel by file group.
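
The verifier dispatch rule (parallel across files, sequential within a file) can be sketched as a grouping step followed by one worker per file group. This is an illustration only: `group_by_file`, `run_verifiers`, and the `verify` callable are hypothetical stand-ins for whatever actually launches a fresh verifier agent, and file paths are compared as plain strings without case or separator normalization.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_file(candidates):
    """Group (candidate_id, file_path) pairs by file, preserving order."""
    groups = defaultdict(list)
    for candidate_id, file_path in candidates:
        groups[file_path].append(candidate_id)
    return dict(groups)


def run_verifiers(candidates, verify, max_workers=4):
    """Run verify(candidate_id) for every candidate.

    Candidates citing the same file run one after another inside a single
    worker; groups for different files run in parallel.
    """
    groups = group_by_file(candidates)

    def run_group(candidate_ids):
        # Sequential within the group: same-file verifiers never overlap.
        return [(cid, verify(cid)) for cid in candidate_ids]

    verdicts = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for group_result in pool.map(run_group, groups.values()):
            verdicts.update(group_result)
    return verdicts
```

Each call to `verify` should still start a fresh agent; grouping only controls scheduling, not context sharing.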

### Canonical Agent tool calls (Claude Code specific)

Coordinator-level pseudocode:

```
# Phase 1 — parallel chunk audit
Agent(subagent_type="Explore", description="Audit chunk 1 Python CLI", prompt="")
Agent(subagent_type="Explore", description="Audit chunk 2 native host", prompt="<...>")
Agent(subagent_type="Explore", description="Audit chunk 3 ext SW+content", prompt="<...>")
Agent(subagent_type="Explore", description="Audit chunk 4 ext options", prompt="<...>")
Agent(subagent_type="Explore", description="Audit chunk 5 ext popup+bulk", prompt="<...>")
# all 5 dispatched in one message → run in parallel

# Phase 1 — verifier per candidate
for candidate in bugs-candidates-.md:
    Agent(subagent_type="Explore", description=f"Verify {candidate.id}", prompt="")

# Phase 2 — serial fix loop
for bug in confirmed_bugs_sorted_by_severity:
    Agent(subagent_type="general-purpose", description=f"Fix {bug.id}", prompt="")
    # wait for completion, verify repro now passes, mark fixed

# Phase 3 — final re-audit
for group in fix_phase_diff:
    Agent(subagent_type="Explore", description=f"Re-audit {group}", prompt="<...>")
```

### Anti-correlation rules (preserve verifier independence)

- Coordinator must NOT pass auditor reasoning to verifier — only the structured claim
- Coordinator must NOT pass the candidate file's full text to verifier — only the one candidate's fields
- Each verifier call is a fresh `Agent` invocation — never reuse a verifier agent across candidates
- If a verifier rejects a claim, do NOT immediately re-verify with another agent hoping for CONFIRMED — that's correlation-chasing. Demote the candidate to REFUTED, log in candidates file, move on.
- Track verifier rejection rate per chunk (see Stop Conditions). If rejection >30%, the auditor's threshold is wrong, not the verifiers'.
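
Tracking the rejection rate is simple arithmetic over the verdicts collected for a chunk. The sketch below assumes only `REFUTED` counts as a rejection (whether `NEEDS-INFO` should count is not specified in this plan) and reuses the 30% and 50-candidate thresholds from the stop conditions. `rejection_rate` and `should_restart_chunk` are hypothetical names.

```python
def rejection_rate(verdicts):
    """Fraction of verifier verdicts that are REFUTED (0.0 if none yet)."""
    if not verdicts:
        return 0.0
    rejected = sum(1 for verdict in verdicts if verdict == "REFUTED")
    return rejected / len(verdicts)


def should_restart_chunk(verdicts, candidate_count,
                         max_rate=0.30, max_candidates=50):
    """Apply the two numeric stop conditions for a chunk.

    Both thresholds are strict: exactly 30% rejections or exactly
    50 candidates does not trigger a restart.
    """
    return rejection_rate(verdicts) > max_rate or candidate_count > max_candidates
```

For example, 3 `REFUTED` verdicts out of 10 is exactly 30% and does not trigger a restart, while 4 out of 10 does.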