Includes: - cli.py path fix (parents[1]) for config/catalog resolution - Library cleanup feature design docs (TODO.md, mockup) - Audit + bug-queue markdowns from May 2026 reliability pass - .gitignore expanded for transient artifacts
Candidate Findings — Python CLI — audit-snapshot-2026-05-24T15-55Z.md
Scope: rc-jav.py, rcjav/*.py, tests/test_rules.py, fixtures/run.py
Required-reading docs read: AGENTS.md / TODO.md / bug-audit-plan.md (Note: CACHE_CONTRACT.md does not exist; docs/ folder is absent.)
Auditor: fresh Explore agent
Candidate C-1
- File: D:\DEV\Project\rclone-jav\rcjav\rclone_io.py:66
- Hunch: Accessing item["Path"] on rclone lsjson output may raise KeyError.
- Trace: quick_search_remote() at line 66 uses direct dict access item["Path"] without .get() fallback. If rclone output is malformed or omits Path, KeyError crashes the scan.
- Question for verifier: Should line 66 use item.get("Path") like line 77 does for Size/ModTime?
- Suggested severity: M
- Contract refs needed: none
Candidate C-2
- File: D:\DEV\Project\rclone-jav\rcjav\library.py:257
- Hunch: Direct dictionary access f["path"] in find_library_issues() may raise KeyError on corrupted cache.
- Trace: find_library_issues() accesses f["path"] without .get(). Cache is written with path/size/mod_time/jav_id keys but no validation ensures all entries have these keys. Corrupted/legacy caches could be missing path.
- Question for verifier: Should line 257 use f.get("path") to handle missing keys gracefully like --reextract does at line 524?
- Suggested severity: M
- Contract refs needed: none
Candidate C-3
- File: D:\DEV\Project\rclone-jav\rcjav\library.py:328-330
- Hunch: Direct dict access f["path"] and f["jav_id"] assumes cache entries are well-formed without validation.
- Trace: rename_file_in_remote() at lines 328-330 uses direct key access. Line 330 falls back to f["jav_id"] via an `or` expression, but line 328 would already have crashed if f["path"] were missing. Corrupted cache entries could cause KeyError.
- Question for verifier: Should these lines use f.get() with fallback instead of direct bracket access?
- Suggested severity: M
- Contract refs needed: none
Candidate C-4
- File: D:\DEV\Project\rclone-jav\rcjav\cli.py:186-189
- Hunch: save_config() lacks Windows file-locking retry logic that save_cache() has.
- Trace: save_config() calls os.replace() without PermissionError handling. If Windows has config.json locked, the replace fails. save_cache() (lines 142-147) has explicit PermissionError handling with a 0.5s retry. --save could report success while the file write fails silently on Windows.
- Question for verifier: Should save_config() include the same PermissionError + retry as save_cache()?
- Suggested severity: M
- Contract refs needed: none
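
The retry asymmetry described in C-4 can be sketched as a small wrapper around os.replace(). This is a minimal sketch, not the code in rcjav/cli.py: the helper name `replace_with_retry`, the retry count, and the injectable `_replace` parameter are assumptions for illustration; only the 0.5s delay comes from the trace above.

```python
import os
import time


def replace_with_retry(tmp_path, final_path, retries=3, delay=0.5, _replace=os.replace):
    """Move tmp_path over final_path, retrying when the target is locked.

    Hypothetical helper: name, retry count, and the _replace hook are
    assumptions, not taken from the audited code.
    """
    for attempt in range(retries):
        try:
            _replace(tmp_path, final_path)
            return
        except PermissionError:
            # Windows raises PermissionError while another process holds the file.
            if attempt == retries - 1:
                raise  # surface the failure so the caller cannot report success
            time.sleep(delay)
```

Re-raising on the last attempt is the design point: the caller either sees the replace succeed or gets an exception, so a locked config.json cannot be reported as saved.
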
Candidate C-5
- File: D:\DEV\Project\rclone-jav\rcjav\cli.py:131
- Hunch: DEFAULT_CATALOG path computed at module-load time; could resolve incorrectly if cwd differs.
- Trace: DEFAULT_CATALOG is set on line 131 using Path(__file__).resolve().parents[1] at import time. If rc-jav.py is invoked from a different cwd (Task Scheduler, cron), path resolution might be affected by symlinks or relative-path assumptions.
- Question for verifier: Does DEFAULT_CATALOG resolve to correct wincatalog/ across all invocation contexts?
- Suggested severity: L
- Contract refs needed: AGENTS.md
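
To help a verifier reason about C-5, the path expression can be isolated in a function that takes the module file as an argument. This is a sketch under assumptions: `default_catalog` is a hypothetical helper, and the `wincatalog` folder name is taken from the verifier question, not from the code.

```python
from pathlib import Path


def default_catalog(module_file, folder="wincatalog"):
    """Return <parent of the package directory>/<folder> for a module file.

    Hypothetical helper used only to illustrate the parents[1] expression.
    """
    # parents[0] is the package directory; parents[1] is the directory above it.
    return Path(module_file).resolve().parents[1] / folder
```

When `module_file` is already absolute, the result does not depend on the current working directory. resolve() does follow symlinks, so a symlinked checkout resolves to its real location.
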
Candidate C-6
- File: D:\DEV\Project\rclone-jav\rcjav\dupes.py:105-107
- Hunch: best_priority could be None if no entries match priority folders, masking misconfiguration.
- Trace: Line 105 builds prioritized list. Line 106 sets best_priority=None if empty. Line 107 filters for rank==None which yields empty list. Falls through to fallback, but absence of warning could hide config error.
- Question for verifier: Should a warning be logged when no duplicates match configured priority_folders?
- Suggested severity: L
- Contract refs needed: AGENTS.md
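
The warning asked about in C-6 could look roughly like the following. This is an illustrative sketch, not the logic in dupes.py: the function name, the prefix-matching rule, and the logger name are all assumptions.

```python
import logging

log = logging.getLogger("dupes_sketch")  # hypothetical logger name


def pick_by_priority(paths, priority_folders):
    """Return the paths in the best-ranked priority folder, or all paths as a fallback."""
    ranked = []
    for path in paths:
        for rank, folder in enumerate(priority_folders):
            # Assumed matching rule: the path lives under the priority folder.
            if path.startswith(folder.rstrip("/") + "/"):
                ranked.append((rank, path))
                break
    if not ranked:
        # Make the fallback visible instead of taking it silently.
        log.warning("no duplicate matched priority_folders=%r", priority_folders)
        return list(paths)
    best = min(rank for rank, _ in ranked)
    return [path for rank, path in ranked if rank == best]
```
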
Candidate C-7
- File: D:\DEV\Project\rclone-jav\rcjav\cli.py:797
- Hunch: Global mutation of DEFAULT_CATALOG/DEFAULT_SOURCE/DEFAULT_TARGET could cause reference bugs.
- Trace: Lines 438-440 reassign global DEFAULT_* from config.json. Line 797 passes mutated DEFAULT_CATALOG to _expand_catalog_paths(). Works correctly but the global-mutation pattern is fragile and could break if code is refactored.
- Question for verifier: Is the global reassignment pattern intentional, or should these be passed as parameters instead?
- Suggested severity: L
- Contract refs needed: AGENTS.md
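
One alternative to reassigning module globals, as raised in C-7, is to build an immutable settings object and pass it down. The sketch below is an assumption about shape only: the `Defaults` class, the `apply_config` function, and the config key names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Defaults:
    catalog: str
    source: str
    target: str


def apply_config(defaults, config):
    """Return a new Defaults where keys present in config override the built-ins."""
    # Key names are assumed for illustration; the real config.json may differ.
    overrides = {k: config[k] for k in ("catalog", "source", "target") if k in config}
    return replace(defaults, **overrides)
```

Because `Defaults` is frozen, the built-in values stay intact and every function that needs a path receives it explicitly.
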
Candidate C-8
- File: D:\DEV\Project\rclone-jav\rcjav\ids.py:206-207
- Hunch: normalize_id() appends dummy extension; could fail on input with embedded dots.
- Trace: normalize_id() adds ".x" to call extract_id(). If input is "ABC-001.backup", stem operation treats .backup as extension, breaking the ID. Unlikely in practice but contract not clearly documented.
- Question for verifier: Should normalize_id() validate input format or handle embedded-dot cases?
- Suggested severity: L
- Contract refs needed: AGENTS.md
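
The embedded-dot case in C-8 can be checked in isolation, assuming the stem is taken with pathlib semantics. The function below is a hypothetical stand-in for the dummy-extension step only; it does not reproduce extract_id().

```python
from pathlib import PurePath


def stem_after_dummy_ext(raw_id):
    # Mirrors the described approach: append a dummy ".x", then take the stem.
    return PurePath(raw_id + ".x").stem


plain = stem_after_dummy_ext("ABC-001")           # 'ABC-001'
dotted = stem_after_dummy_ext("ABC-001.backup")   # 'ABC-001.backup'
```

Under this assumption only the dummy ".x" is stripped, so the ".backup" part remains in the string handed to the ID matcher. Whether that breaks the ID depends on how extract_id() parses the stem.
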
Candidate C-9
- File: D:\DEV\Project\rclone-jav\rcjav\rclone_io.py:293
- Hunch: _stderr_thread.join() has no timeout; could hang if stderr thread deadlocks.
- Trace: Daemon thread reads stderr on lines 231-235. Line 293 calls join() without a timeout. If the thread hangs, the main thread blocks indefinitely. By contrast, the cancel logic (lines 270, 284) bounds its wait with proc.wait(timeout=3).
- Question for verifier: Should _stderr_thread.join() include a timeout?
- Suggested severity: L
- Contract refs needed: none
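
A bounded join for C-9 might look like this. It is a minimal sketch: `join_with_timeout` is a hypothetical helper, and the 3-second default simply mirrors the proc.wait(timeout=3) value quoted in the trace.

```python
import threading


def join_with_timeout(thread, timeout=3.0):
    """Wait up to `timeout` seconds; return True if the thread has finished."""
    thread.join(timeout)
    # join() returns None whether or not it timed out, so check is_alive().
    return not thread.is_alive()
```

A caller could log a warning when this returns False and continue, since a daemon thread does not block interpreter exit.
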
Summary by Severity
- Moderate (M): 4 candidates — KeyError risks in cache/rclone access, Windows file-locking issue
- Light (L): 5 candidates — Path resolution edge case, missing priority-folder warning, global mutation, normalize_id contract, thread join timeout
- Severe (S): 0
- Needs Input (N): 0
Top 3 by risk:
- C-1: KeyError on rclone output could crash scan in quick mode
- C-2: KeyError on cache.path could crash library-issues scan
- C-4: Config write failure on Windows could silently corrupt config.json
VERIFIER NOTES (Phase 1 Moderate verification, stricter prompt)
- C-1 (rclone KeyError on Path) — REFUTED. rclone lsjson contract guarantees Path. Direct access appropriate fail-fast.
- C-2 (library cache KeyError) — REFUTED. CACHE_CONTRACT.md + load_cache validation + FileEntry dataclass triple-guarantee path key. cli.py:526 .get pattern is for un-validated --reextract direct read.
- C-3 (rename_file KeyError) — REFUTED. Auditor conflated scalar caller args with iterated dict entries. f comes from cache (contract-guaranteed).
- C-4 (save_config no retry) — CONFIRMED M, high confidence. Promoted as M-1 in bugs-python.md. Real asymmetry vs save_cache.
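
The refutations of C-2 and C-3 rest on entries being validated once at load time, which is what makes later bracket access safe. A sketch of that boundary check, assuming the key names listed in the C-2 trace; `validate_entries` is a hypothetical function, not the project's load_cache().

```python
REQUIRED_KEYS = ("path", "size", "mod_time", "jav_id")  # key names from the C-2 trace


def validate_entries(entries):
    """Raise ValueError naming the first entry that lacks a required key."""
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"cache entry {index} is missing {missing}")
    return entries
```

With a check like this at the boundary, code downstream can use f["path"] directly, and a corrupted cache fails once at load with a clear message.
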
CHUNK 1 CALIBRATION:
- Severe: 0 (none flagged)
- Moderate rejection: 3/4 = 75%
- Combined: 3/4 = 75% (stop condition >30% triggered)
- Auditor weakness: KeyError pattern-matching without upstream contract check
- L candidates NOT verified per stop condition. Same auditor weakness likely affects L list. Revisit only if needed.
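
The calibration arithmetic above can be reproduced from the verdicts. This is a sketch: the dictionary layout and function name are assumptions, and the 30% threshold is the stop condition quoted in the notes.

```python
def rejection_rate(verdicts):
    """Fraction of verified candidates that were refuted."""
    refuted = sum(1 for verdict in verdicts.values() if verdict == "REFUTED")
    return refuted / len(verdicts)


verdicts = {"C-1": "REFUTED", "C-2": "REFUTED", "C-3": "REFUTED", "C-4": "CONFIRMED"}
rate = rejection_rate(verdicts)   # 3 / 4 = 0.75
stop_triggered = rate > 0.30      # True, so the L candidates are left unverified
```
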