rc-jav (Python CLI)
Session memory for Codex. Read before making changes here.
What this is
A read-only rclone library comparison + search CLI. Compares cq:JAV remote (rclone crypt) against itself (dupe detection) or against external WinCatalog CSV/XML exports. Powers the rclone-jav Brave extension via native messaging.
Architecture
rc-jav.py
├── reads config.json (default_target etc.)
├── reads cache.json (per-remote file index, written by --scan)
├── shells out to: rclone lsf / rclone lsjson / rclone size --json
├── extract_id() per filename → normalized ID with optional #partN / variant suffix
├── two query modes: --quick (live rclone --include glob) and cached (uses cache.json)
└── output: rich tables (default) | --basic plain | --format json (for extension)
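A minimal sketch of how the two query modes might build their rclone commands. The function names are hypothetical and the --files-only flag is an assumption; these notes only confirm lsf -R for scans and lsjson -R with an --include glob for quick mode.

```python
import json
import subprocess


def build_quick_cmd(remote: str, glob: str) -> list:
    """Quick mode: live recursive listing filtered by a single glob."""
    # --files-only is an assumption, not confirmed by these notes.
    return ["rclone", "lsjson", "-R", "--files-only", "--include", glob, remote]


def build_scan_cmd(remote: str) -> list:
    """Scan mode: full recursive listing used to rebuild cache.json."""
    return ["rclone", "lsf", "-R", "--files-only", remote]


def parse_lsjson(stdout: str) -> list:
    """rclone lsjson prints a JSON array of objects (Path, Name, Size, ...)."""
    return json.loads(stdout)


def run_lsjson(cmd: list) -> list:
    """Run a prepared lsjson command and return the parsed entries."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_lsjson(proc.stdout)
```

Keeping command construction separate from subprocess.run makes the flag list easy to check without touching the remote.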
Files
D:\DEV\Project\rclone-jav\
├── rc-jav.py      single-file CLI
├── config.json    default_source/target/catalog (user-editable via --save)
├── cache.json     scanned remote file index (written by --scan)
├── wincatalog\    drop WinCatalog CSV/XML exports here (auto-loaded)
├── TODO.md        deferred work
└── README.md
Companion project
D:\DEV\Extensions\Production\rclone-jav\ (PC 1) / D:\DEV\Extensions\Staging\rclone-jav\ (PC 2) — Brave extension + native messaging host that shells out to rc-jav.py for searches.
ID normalization
- extract_id() chops trailing single letters (e.g. IBW-902z.mp4 → IBW-902). Decision is intentional; see extension's AGENTS.md "Decision log".
- JAV IDs are canonicalized with at least 3 digits (ABC-27 → ABC-027); 4+ digit IDs keep their width (ABCD-1294). User expects real JAV IDs to be ABC-027, never ABC-27 or ABC-0027.
- Part suffix detection: _1, -pt1, (1) → appended as #partN for distinctness.
- Compound prefixes (FC2-PPV-123) handled via secondary regex.
- Search matcher does prefix lookup so IBW-902 finds both IBW-902 and IBW-902#part1 etc.
- Quick search must emit only canonical padded uppercase globs (ABC-027*, ABCDE-1167*). Do not add --ignore-case; user never uses lowercase filenames and it caused noticeable delay.
Defaults from earlier sessions
- cq:JAV is the current remote root (after the rclone crypt config change moved it down a level)
- default_target in config.json = ["cq:JAV"]
- human_size() formats to 2 decimals (e.g. 6.94 GiB)
- After the 3-digit ID canonicalization change, run python rc-jav.py --scan to rebuild cache.json under the new padded keys.
- Duplicate KEEP ranking uses configurable VIP folders before source/size/format ranking. Default VIP folder is ClearJAV; video files there are treated as the trusted direct-rip copy.
- Duplicate KEEP ranking treats .ts as the lowest-priority video container when any non-.ts duplicate is available.
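A rough sketch of the size formatting and KEEP ranking described above. The Copy class, keep_rank, and pick_keep are hypothetical names. The ordering after the VIP check (non-.ts first, then larger size) is an assumption, and source ranking is left out because these notes do not describe it.

```python
from dataclasses import dataclass

# Default VIP folder per the notes; the real tool makes this configurable.
VIP_FOLDERS = ("ClearJAV",)


@dataclass
class Copy:
    path: str   # path inside the remote, forward slashes
    size: int   # bytes


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units and 2 decimals."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024
    return f"{value:.2f} {units[-1]}"  # not reached; keeps type checkers happy


def keep_rank(copy: Copy) -> tuple:
    """Lower tuples win: VIP folder first, then non-.ts, then larger size."""
    folders = copy.path.split("/")[:-1]
    in_vip = any(folder in VIP_FOLDERS for folder in folders)
    is_ts = copy.path.lower().endswith(".ts")
    return (not in_vip, is_ts, -copy.size)


def pick_keep(copies: list) -> Copy:
    """Choose the duplicate to mark as KEEP."""
    return min(copies, key=keep_rank)
```

Sorting on a tuple keeps each rule a separate, reorderable field, which fits a ranking that has already changed more than once.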
Recent decisions / bug fixes
- --format json should keep stdout as clean JSON. Status/progress text belongs on stderr in JSON mode.
- Catalog rows are informational. CSV exports mark them as CATALOG; JSON exports put them under catalog, not delete_candidates.
- Cache loading validates the top-level shape and falls back to an empty cache when remotes is missing or malformed.
- The old --recursive / -R flag was removed because scans are always recursive (rclone lsf -R / quick lsjson -R).
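Two of these decisions can be sketched as below. The function names are hypothetical. Treating remotes as a dict keyed by remote name, and falling back on a missing or unparsable file as well, are assumptions; the notes only say the top-level shape is validated.

```python
import json
import sys
from pathlib import Path


def load_cache(path) -> dict:
    """Load cache.json, falling back to an empty cache on a bad shape."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON (assumed fallback).
        return {"remotes": {}}
    if not isinstance(data, dict) or not isinstance(data.get("remotes"), dict):
        # remotes is missing or malformed.
        return {"remotes": {}}
    return data


def status(message: str, json_mode: bool) -> None:
    """Progress text goes to stderr in JSON mode so stdout stays parseable."""
    print(message, file=sys.stderr if json_mode else sys.stdout)


def emit_json(payload: dict) -> None:
    """Write exactly one JSON document to stdout."""
    json.dump(payload, sys.stdout)
    sys.stdout.write("\n")
```

With this split, the native messaging host can parse stdout directly while progress text still reaches a terminal through stderr.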
TODO
See TODO.md for deferred work.
When making changes
- Adding CLI flags: also update host invocation in D:\DEV\Extensions\Production\rclone-jav\host\rcjav-host.py if the flag matters to the extension
- Changing extract_id() semantics: forces a --scan to rebuild cache under new keys, and may need a parallel change in extension's normalizeId()
- JSON output format changes: extension's popup.js / overlay rendering reads structured array; keep field names stable (source, remote, path, full_path, size, size_human, mod_time, jav_id)
- Config schema: update --save writer and any defaults
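One way to guard the field-name rule is a small check run against JSON output before changing the format. The helper below is hypothetical and not part of rc-jav.py; only the field names come from these notes.

```python
# Field names the extension reads from each entry of the structured array.
STABLE_FIELDS = (
    "source", "remote", "path", "full_path",
    "size", "size_human", "mod_time", "jav_id",
)


def check_structured(rows: list) -> list:
    """Return one message per row that is missing a stable field."""
    problems = []
    for index, row in enumerate(rows):
        missing = [name for name in STABLE_FIELDS if name not in row]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
    return problems
```

An empty result means every row still carries the names the popup and overlay depend on; extra fields are allowed.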