# rclone-jav (Brave extension + native messaging host)
Session memory for Claude. Read before making changes here.
## Architecture
```
Brave tab title -> content script extracts JAV ID
-> background.js connectNative("com.rcjav.host")
-> host/rcjav-host.bat (portable: py launcher or python on PATH)
-> host/rcjav-host.py
-> subprocess python rc-jav.py --search ID --basic --no-color --format json
-> structured hits back through native port
-> popup or in-page overlay
```
Two separate codebases:
- This repo: Brave extension + native messaging host.
- `D:\DEV\Project\rclone-jav\` — Python rc-jav CLI. The host shells out to `rc-jav.py` here.
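
The host-to-CLI hop in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the actual `rcjav-host.py` code: the function names `build_search_command` and `run_search`, the timeout, and the use of `cwd` are assumptions. Only the CLI flags come from the diagram.

```python
import json
import subprocess
import sys
from pathlib import Path


def build_search_command(rcjav_dir: str, jav_id: str) -> list[str]:
    """Build the argv for one rc-jav search, using the flags shown in the diagram."""
    script = str(Path(rcjav_dir) / "rc-jav.py")
    return [sys.executable, script, "--search", jav_id,
            "--basic", "--no-color", "--format", "json"]


def run_search(rcjav_dir: str, jav_id: str, timeout_s: float = 30.0):
    """Run the CLI and parse stdout as JSON (hypothetical helper).

    The shape of the returned JSON is not assumed here; the caller decides
    how to turn it into structured hits for the native port.
    """
    proc = subprocess.run(
        build_search_command(rcjav_dir, jav_id),
        capture_output=True, text=True, timeout=timeout_s, cwd=rcjav_dir,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"rc-jav exited {proc.returncode}: {proc.stderr.strip()}")
    return json.loads(proc.stdout)
```

Passing the arguments as a list (rather than a shell string) keeps an ID taken from a page title from being interpreted by a shell.
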
## Folder layout (post-rename)
```
D:\DEV\Extensions\Production\rclone-jav\ (PC 1)
D:\DEV\Extensions\Staging\rclone-jav\ (PC 2)
├── manifest.json
├── background.js
├── content.js
├── popup.{html,js,css}
├── options.{html,js}
├── host\
│   ├── rcjav-host.py
│   ├── rcjav-host.bat        (portable: py launcher fallback)
│   ├── install-host.ps1      (self-elevates to HKLM)
│   ├── register-host.bat     (prompts for ID, calls install-host.ps1)
│   ├── com.rcjav.host.json   (generated; UTF-8 NO BOM)
│   └── (logs)
└── docs\
    ├── INSTALL.md            (gotcha table at the bottom)
    └── README.md
```
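
The layout above notes that `com.rcjav.host.json` must be UTF-8 without a BOM. A quick way to check or produce such bytes from Python is sketched below. The helper names are hypothetical, and the real manifest is generated by `install-host.ps1`, not by this code.

```python
import codecs
import json


def has_utf8_bom(data: bytes) -> bool:
    """True if the byte string starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def encode_manifest(manifest: dict) -> bytes:
    """Serialize a manifest dict to UTF-8 bytes.

    The "utf-8" codec never emits a BOM; "utf-8-sig" would prepend one.
    """
    return json.dumps(manifest, indent=2).encode("utf-8")
```

To audit an existing file, read it in binary mode and pass the bytes to `has_utf8_bom`.
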
## Critical gotchas (learned the hard way)
| Symptom | Cause | Fix |
|---|---|---|
| "Specified native messaging host not found" | UTF-8 BOM in com.rcjav.host.json | `WriteAllText` with `UTF8Encoding($false)` |
| Same error after registering HKCU | Brave on Windows ignores HKCU on some installs | Register HKLM too. `install-host.ps1` does both. |
| Host launches then disconnects | Python text-mode stdio mangles 4-byte length prefix | `msvcrt.setmode(stdin/stdout, O_BINARY)` at host startup |
| Host log says "stdin closed, exiting" immediately | bat-side stderr leak corrupts protocol | `python -u` + redirect stderr to log file |
| `Missing closing '}'` in install-host.ps1 | Em-dashes in comments + LF endings + Windows PS 5.1 (cp1252 fallback) | Strip em-dashes from .ps1 files, or save with BOM, or use pwsh |
| Brave reload != Brave restart | NM cache survives extension reload | Kill all brave.exe processes then reopen |
| `IBW-902z` page title fails to parse | `\b` after `\d` blocked by following word char | Extension regex uses `[a-zA-Z]?\b` trailing — captured but discarded |
| Delete safety too broad | Allowlist reduced `cq:JAV` to `cq:` | Match full configured prefixes, not remote roots |
| Overlay feels ~1.5s late on SPA pages | `SPA_SETTLE_MS` waits before auto-check | Current value is 800ms; tune carefully if detection gets flaky |
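
For the length-prefix row in the table above, here is a minimal sketch of native messaging framing in Python: each message is a 4-byte length in native byte order followed by UTF-8 JSON. The function names and the `{"action": "ping"}` payload are illustrative assumptions, not the host's real code or schema.

```python
import io
import json
import struct
import sys


def enable_binary_stdio() -> None:
    """On Windows, put stdin/stdout in binary mode so the prefix bytes pass through unchanged."""
    if sys.platform == "win32":
        import msvcrt
        import os
        msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)


def read_message(stream):
    """Read one framed message; return None when the stream is closed."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)  # "=I": native byte order, 4 bytes
    return json.loads(stream.read(length).decode("utf-8"))


def write_message(stream, message) -> None:
    """Write one framed message and flush so the browser sees it immediately."""
    body = json.dumps(message).encode("utf-8")
    stream.write(struct.pack("=I", len(body)))
    stream.write(body)
    stream.flush()


# Round-trip through an in-memory buffer instead of real stdio.
buf = io.BytesIO()
write_message(buf, {"action": "ping"})
buf.seek(0)
echoed = read_message(buf)
```

In a real host the streams would be `sys.stdin.buffer` and `sys.stdout.buffer`, with `enable_binary_stdio()` called once at startup. Anything else written to stdout (stray prints, leaked stderr) corrupts the framing, which is why the table routes stderr to a log file.
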
## Internal names — keep as-is
- Native messaging host: `com.rcjav.host` (NOT renamed despite extension rename)
- Window flag in content.js: `__rclonex_loaded__` (idempotency guard for content script re-injection)
- CSS IDs starting with `rclonex-` (overlay)
- Host logs: `host/logs/rcjav-host.log`, `host/logs/rcjav-host-events.log`, `host/logs/rcjav-host-stderr.log`, `host/logs/deletes.log`
- Host scan progress state: `host/state/scan-state.json`
Don't rename these unless there's a real reason. They're orthogonal to the user-facing extension name.
## Settings
Stored in `chrome.storage.sync` under key `settings`. Storage is namespaced per extension ID, and an unpacked extension loaded from a different path gets a different ID, so its settings appear wiped.
**Backup/restore lives in Options → Setup & Transfer** — JSON export/import to survive reloads or PC migrations. Use it before renaming or relocating the extension.
DEFAULT_SETTINGS lives in background.js. Keep in sync with options.html defaults.
## Decision log
### Deletion allowlist uses full prefixes (2026-05-20)
**Decision:** host delete allowlist must use full configured path prefixes (`cq:JAV`, trash dir, etc.), not only remote roots like `cq:`.
**Reasoning:** Reducing `cq:JAV` to `cq:` lets any path on the same rclone remote pass the safety check. Deletion is opt-in but must be tightly scoped.
**Important:** extension delete calls must forward `rcjav_path`, or the host may read the wrong `config.json` and derive the wrong allowlist.
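
The full-prefix rule can be sketched as below. This is an assumption-laden illustration rather than the host's actual check: `is_delete_allowed` and `normalize` are hypothetical names, and the rejection of `..` segments is an extra precaution not described in the decision.

```python
def normalize(path: str) -> str:
    """Use forward slashes and drop any trailing slash."""
    return path.replace("\\", "/").rstrip("/")


def is_delete_allowed(target: str, allowed_prefixes: list[str]) -> bool:
    """Allow a delete only for paths strictly beneath a full configured prefix."""
    target_n = normalize(target)
    if ".." in target_n.split("/"):
        return False  # refuse parent-directory segments outright
    for prefix in allowed_prefixes:
        prefix_n = normalize(prefix)
        # Require a "/" boundary so "cq:JAV" does not match "cq:JAVX/...".
        if prefix_n and target_n.startswith(prefix_n + "/"):
            return True
    return False
```

With the boundary check, a bare remote root such as `cq:` in the allowlist matches nothing, so an accidental reduction to the root fails closed instead of opening the whole remote.
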
### Toolbar popup setting gates auto-check (2026-05-20)
**Decision:** `triggers.toolbarClick` does not remove the MV3 popup, but it does gate whether the popup auto-runs `checkTab` on open. If disabled, popup stays idle until user clicks Re-Scan.
### Quick search and ID padding (2026-05-20)
**Decision:** rc-jav canonical JAV IDs use at least 3 digits (`ABC-027`) and preserve 4+ digit IDs (`ABCD-1294`). Quick search emits canonical uppercase globs only.
**Reasoning:** user clarified real JAV filenames are never `ABC-27` or `ABC-0027`; they are `ABC-027`. User also never uses lowercase filenames, so quick search should not use rclone `--ignore-case` because it added noticeable delay.
**Operational note:** this changes cache keys. Run `python rc-jav.py --scan` in `D:\DEV\Project\rclone-jav` after this change.
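
A minimal sketch of the padding rule, assuming a simple `LETTERS-DIGITS` input: `canonical_id` is a hypothetical name, and how rc-jav treats over-padded input such as `ABC-0027` is not stated here, so this sketch leaves any digit run of three or more characters untouched.

```python
import re

# Letters, an optional hyphen, then digits. Deliberately simple for illustration.
_ID_RE = re.compile(r"^([A-Za-z]+)-?(\d+)$")


def canonical_id(raw: str) -> str:
    """Uppercase the prefix and left-pad the number to at least 3 digits."""
    match = _ID_RE.match(raw.strip())
    if match is None:
        raise ValueError(f"not a recognizable ID: {raw!r}")
    prefix, digits = match.groups()
    return f"{prefix.upper()}-{digits.zfill(3)}"
```

Because the canonical form feeds cache keys, any change to a rule like this is what makes the rescan in the operational note necessary.
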
### No-match overlay metadata (2026-05-20)
**Decision:** host search response includes `cache_meta` and `scanned_remotes` from rc-jav JSON so no-match overlays can show what was scanned instead of falling back to "library".
### IBW-902z trailing letter (2026-05-20)
**Decision:** minimal regex fix in extension only. NOT a full variant-suffix rewrite of the index.
**Reasoning:** User's library uses one ID per number (either `IBW-902` OR `IBW-902z`, not both). Page titles failing on `IBW-902z` is the real bug. Extension regex now matches optional trailing letter and discards it. rc-jav's index continues to strip trailing letters at extract_id time. Effective: extension queries `IBW-902` for any title `IBW-902` or `IBW-902z`, finds the file regardless of how it's named on rclone.
**Revisit if:** both `IBW-902.mp4` and `IBW-902z.mp4` ever coexist in library — they'd collide on the same ID. Then implement variant suffix (#var_Z) end-to-end.
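
The trailing-letter behavior can be illustrated with Python's `re`. The pattern below is an assumed stand-in for the extension's JavaScript regex, which is not reproduced in these notes. Only the `[a-zA-Z]?\b` tail and the discard-the-letter behavior come from the decision above.

```python
import re

# Optional single trailing letter is matched but sits outside the capture groups.
TITLE_ID_RE = re.compile(r"\b([A-Za-z]{2,6})-(\d{3,5})[a-zA-Z]?\b")


def extract_query_id(title: str):
    """Return the ID to query (without any trailing letter), or None."""
    match = TITLE_ID_RE.search(title)
    if match is None:
        return None
    return f"{match.group(1).upper()}-{match.group(2)}"
```

Without the optional letter, `\b` cannot match between `2` and `z` because both are word characters, which is why titles containing `IBW-902z` previously failed to parse.
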
### Native messaging host name stayed `com.rcjav.host`
When the extension was renamed `rclonex` → `rclone-jav`, the NM host name was NOT renamed. Reason: zero user impact (it's an internal identifier in registry/manifest), but every rename costs registry rewrites + script churn. Not worth it.
### WinCatalog backslash normalization
Done in rc-jav catalog loading. Catalog CSV/XML paths are normalized from Windows `\` to rclone-style `/` before the extension sees them.
## When making changes
- Extension settings schema change → update `DEFAULT_SETTINGS` in background.js AND defaults in options.html + options.js load()
- New native messaging action → handler in rcjav-host.py + DISPATCH map + extension code that sends it
- New options pane → sidebar item in options.html + new `.pane` div + load/save bindings in options.js
- Any rc-jav.py CLI change → host invocation in rcjav-host.py handle_search must keep pace
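
For the "new native messaging action" item above, a dispatch map along these lines is one common shape. This is a sketch only: the message schema (`action` key, `ok`/`error` fields), the `ping` action, and the handler bodies are assumptions, not the contents of `rcjav-host.py`.

```python
def handle_search(message: dict) -> dict:
    # Placeholder body: the real handler shells out to rc-jav.py.
    return {"ok": True, "action": "search", "id": message.get("id")}


def handle_ping(message: dict) -> dict:
    # Hypothetical new action added for illustration.
    return {"ok": True, "action": "ping"}


# Adding an action means writing a handler and registering it here.
DISPATCH = {
    "search": handle_search,
    "ping": handle_ping,
}


def dispatch(message: dict) -> dict:
    """Route one decoded message to its handler and always return a reply dict."""
    action = message.get("action")
    handler = DISPATCH.get(action)
    if handler is None:
        return {"ok": False, "error": f"unknown action: {action!r}"}
    try:
        return handler(message)
    except Exception as exc:  # keep the host alive on handler errors
        return {"ok": False, "error": str(exc)}
```

Returning an error reply for unknown actions, rather than raising, lets the extension surface a mismatch between its code and the host's map.
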
---
## Console consolidation refactor — execution status
**Spec / blueprint:**
- `D:\DEV\Project\rclone-jav\mockups\console-consolidation-claude.html` (refactor spec — decision table, sequence, acceptance criteria)
- `D:\DEV\Project\rclone-jav\mockups\console-consolidation-options.html` (Codex's visual annotation variant)
**Shipped (in execution order):**
1. **Sim Dupe deleted from popup.** Button + click handler removed from `popup.html` / `popup.js`. Payload preserved in `samples/sim-dupe.js` for future layout work.
2. **CSS extracted from options.html.** Embedded `<style>` block moved to `options.css`, linked via `<link rel="stylesheet">`. options.html went 1179 → 794 lines. Inline `style="..."` attributes intentionally left for later (step 6 territory).
3. **Transfer Assistant wizard deleted.** "Setup & Transfer" pane renamed to "Setup". Replacement: Extension ID display + Copy button added to Diagnostics → Native host registration fieldset (always visible, not failure-gated). Sidebar entry, fieldset, modal, and ~107 lines of JS removed.
5. **Recent Activity + Search Troubleshooting moved to new Debug Tools pane.** Verified Recent Activity is search-trigger-only by reading `background.js`: `recordActivity()` is NOT called from the `delete-file` handler. No audit-value split needed. New sidebar entry "Debug Tools" under System group; new `pane-debug` houses both fieldsets.
(Step 4 in the plan is a paired-extraction sub-task of step 6; not a separate ship.)
**Pending (in execution order):**
- **Step 6 — options.js split, Cache & Scans + Duplicate Review paired.** Biggest, riskiest step. `options.js` is currently 3133 lines. Pair these two together because Dup Review reads cache state — extracting one while the other stays in monolith creates cross-module gap. Continue with Debug Tools, Library Issues, Settings sub-tabs after the pair lands.
- **Step 7a — Bulk Check standalone window.** New `bulk-check.html` opened as detached `chrome.windows.create({ type: 'popup', width: 640, height: 540 })` from a "Bulk Check" launcher button in the popup. Single canonical entry path — NOT a Console sidebar tab. Window dedup via `chrome.storage.session`, last-paste persisted via `chrome.storage.local`.
- **Step 8 — Shared fixture corpus.** Top-level `D:\DEV\Project\rclone-jav\fixtures\` (neutral location, NOT inside Python or extension repo). JSON cases for query-ID extraction (extension), filename ID extraction (Python), shared normalization.
- **Step 9 — Cache contract design.** CACHE_VERSION already exists (currently 3). Add ID_RULES_VERSION concept: schema bump = force rebuild, rules bump = warn-and-mark-stale.
- **Step 10 — `rc-jav.py` module split** into `rcjav/` package (ids, cache, dupes, catalog, rclone_io, output, cli). Keep `rc-jav.py` as thin entrypoint that imports from `rcjav.cli.main`.
- **Step 11 — Host fast-path benchmark and decide.** Measure popup search latency under (a) idle Python and (b) Python actively scanning. If host fast path is the only thing keeping popup responsive under scan = narrow to dict lookup only and document. If not needed = delete entirely.
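
The step 9 contract (schema bump = force rebuild, rules bump = warn-and-mark-stale) could be sketched as below. This is a design illustration under assumptions: the metadata keys, the `CacheAction` enum, and the starting `ID_RULES_VERSION` value are all hypothetical. Only `CACHE_VERSION = 3` comes from the notes.

```python
from enum import Enum

CACHE_VERSION = 3      # current schema version per the notes
ID_RULES_VERSION = 1   # hypothetical starting value for the new concept


class CacheAction(Enum):
    USE = "use"
    MARK_STALE = "mark_stale"  # keep serving results, but warn the user
    REBUILD = "rebuild"        # cache cannot be trusted at all


def decide_cache_action(meta: dict) -> CacheAction:
    """Compare versions stored in the cache against the running code."""
    if meta.get("cache_version") != CACHE_VERSION:
        return CacheAction.REBUILD
    if meta.get("id_rules_version") != ID_RULES_VERSION:
        return CacheAction.MARK_STALE
    return CacheAction.USE
```

Checking the schema version first means a cache missing both fields is rebuilt rather than merely flagged.
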
**Architecture (locked — do not relitigate):**
- Sidebar = Console / Settings / Support tri-split. No dashboard pane. Status carried by badges on tab labels (`Duplicate Review [27]`, `Cache & Scans [28m]`, `Library Issues [4]`).
- Default landing = Duplicate Review.
- Bulk ID Check = detached `chrome.windows.create` popup, NOT a Console sidebar tab. Single canonical entry path = popup launcher button.
- Keep Ranking Rules nested INSIDE Duplicate Review as a sub-tab, NOT a separate Settings tab.
- Sim Dupe: deleted from extension. Repo HTML harness in `samples/` only.
- Transfer Assistant: deleted. Diagnostics' Native host registration fieldset is the replacement (Extension ID copy + Repair Registration + Verify Registration buttons).
- Vanilla JS + ordered `<script>` tags. No framework, no build system.
- Inline rule tests stay next to rule editors (Matching Rules, Site Extraction). Standalone benches go to Debug Tools.
**Notes:**
- Repo is NOT git-initialized. Rollback for shipped steps = manual restore from this conversation's diffs. Worth running `git init` in this folder before step 6 (the big one) for safer iteration.
- Three pre-execution handoffs from the original plan have been resolved:
  - Recent Activity scope test → settled by code read (single role, all to Debug).
  - Diagnostics replacement for Transfer wizard → present (Extension ID, Repair, Verify all visible in one fieldset).
  - Popup launcher button label → defer until step 7a; text + emoji currently in mockup.
If a future session wants to continue: read this status block + open the mockup HTML files for the full spec. Resume on step 6.