Sync working tree before initial Gitea push

Includes:
- cli.py path fix (parents[1]) for config/catalog resolution
- Library cleanup feature design docs (TODO.md, mockup)
- Audit + bug-queue markdowns from May 2026 reliability pass
- .gitignore expanded for transient artifacts
admin
2026-05-26 22:35:42 +02:00
parent 8d6bdb81af
commit f7fc15b17c
24 changed files with 2938 additions and 41 deletions
+114
@@ -0,0 +1,114 @@
# Bug Report — Extension SW + content + manifest — audit-snapshot-2026-05-24T15-55Z.md
Snapshot: audit-snapshot-2026-05-24T15-55Z.md
Required-reading docs read: ext AGENTS.md / mockup / bug-audit-plan.md / project memory
Auditor agent: fresh Explore agent (chunk 3 auditor)
Verifier agents: fresh Explore agents per candidate, blind context
This file contains CONFIRMED + PARTIAL findings only. Candidate scratch lives in `bugs-candidates-extension-bg.md`. REFUTED / NEEDS-INFO candidates stay in scratch with verifier response appended.
**Chunk 3 calibration note:** S+M verification yielded 4 confirmed bugs with 50% rejection rate. Auditor over-claimed by missing platform API contracts (Chrome connectNative keepalive, contextMenus contract, storage.local atomicity scope, Object.assign argument order, content.js function definitions). Light candidates were NOT verified per audit-plan stop condition. Revisit chunk 3 L only if needed; see `bugs-candidates-extension-bg.md`.
**Cross-chunk re-rank note:** Per `bugs-fix-queue.md`, this chunk's original severity labels were normalized against other chunks. Changes:
- Original S-1 (recordRpc race) → **M-6** in the queue. Demoted because it's diagnostic-log loss, not user-data loss.
- Original M-1 (maybeNotifyHostError rate-limit race) → **L-1** in the queue. Demoted because over-notification is annoying but recoverable and self-corrects after 10 min. The chunk's prior L-1 (Discord) is renumbered L-2 below to avoid colliding with it.
- M-2 (context menu after SW eviction) → unchanged, kept M (queue M-2).
---
## Severe (S)
Definition: data loss/corruption · wrong remote operation · persistent broken workflow no recovery · silent success when operation actually failed.
(none in this chunk after re-rank)
---
## Moderate (M)
Definition: operation fails/hangs but user can retry · wrong persisted settings · diagnostic loss that materially blocks investigation · modal/workflow stuck until manual recovery · race causing stale/wrong visible results.
### M-2 (queue) — Context menu missing after MV3 SW eviction
- **File:** `D:\DEV\Extensions\Production\rclone-jav\background.js:766-782` (ensureContextMenu) with callsites at `:1019` (settings-changed), `:1178` (onInstalled), `:1179` (onStartup)
- **Symptom (one sentence):** After the MV3 service worker evicts (~30s idle) and a new SW boots from a non-install/non-startup trigger (toolbar click, alarm, message), Chrome has no contextMenus registered and the user's "rclone-jav: Scan" / "rclone-jav: Search ..." entries silently disappear from right-click menus.
- **Why it's a bug:** Per Chrome MV3 contract, `chrome.contextMenus` entries DO NOT persist across SW lifecycle boundaries — they must be re-created on each SW boot. `ensureContextMenu` is only invoked from: `onInstalled` (install/update), `onStartup` (browser boot), and the `settings-changed` message handler. None of these fire on routine SW evict→wake cycles.
- **Reproduction:**
1. Install extension. Right-click any page → context menu items present ✓
2. Leave Brave idle for >30s with no extension activity. SW evicts.
3. Click anything that wakes the SW NOT via onInstalled/onStartup/settings-changed (toolbar icon, alarm, content-script message). New SW boots.
4. Expected: right-click context menu items still present
5. Actual: items missing — must reload extension OR change a setting to restore
- **Suggested fix sketch:** call `ensureContextMenu()` at top-level module init in background.js (runs every SW boot)
- **Verifier verdict:** CONFIRMED — very high confidence (99%)
- **Contract refs verifier read:** Chrome MV3 contextMenus lifecycle
- **Mirror check needed in:** any other Chrome API state that must be re-registered per SW boot. chrome.alarms are persistent and chrome.commands are manifest-declared, so contextMenus is the outlier.
- **Status:** fixed
- **Fix:** `D:\DEV\Extensions\Production\rclone-jav\background.js:1193` — added top-level `ensureContextMenu();` call at module init scope (NOT inside any addListener / event handler). This runs on every SW evaluation: install, browser startup, idle wake, alarm wake, message wake — covering all paths the prior listener-bound calls missed. Existing onInstalled/onStartup listeners kept as defensive backup; `ensureContextMenu` calls `chrome.contextMenus.removeAll` first, so duplicate invocation is idempotent. Manifest bumped 0.1.35 → 0.1.36. JS syntax verified via `node --check`. Code-trace proof of placement: line 1193 is at module scope (preceded only by other top-level statements like addListener registrations); fires unconditionally on every fresh SW evaluation before any user-event handler. Runtime repro requires user test (reload extension → verify context menu appears → wait 30+ s for SW idle → trigger SW wake via toolbar icon or content script message → right-click any page → expect context menu items still present without needing reload).
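
The placement described in the M-2 fix can be sketched as below. This is a minimal illustration and not the code in background.js: `makeFakeChrome` is a hypothetical stand-in for the real `chrome` global so the snippet runs under Node, and the menu ids are invented for the example (the titles are copied from the symptom text).

```javascript
// Hypothetical stand-in for chrome.contextMenus so the sketch runs under Node.
// In the extension the real `chrome` global is used instead.
function makeFakeChrome() {
  const items = new Map();
  return {
    items,
    contextMenus: {
      removeAll(callback) {
        items.clear();
        if (callback) callback();
      },
      create(props, callback) {
        if (items.has(props.id)) throw new Error(`duplicate id: ${props.id}`);
        items.set(props.id, props);
        if (callback) callback();
        return props.id;
      },
    },
  };
}

// Hypothetical ids; the real definitions live in background.js.
const MENU_ITEMS = [
  { id: "example-scan", title: "rclone-jav: Scan", contexts: ["page"] },
  { id: "example-search", title: "rclone-jav: Search ...", contexts: ["selection"] },
];

// removeAll first, then create: a second call replaces the items
// instead of adding duplicates.
function ensureContextMenu(chromeApi) {
  chromeApi.contextMenus.removeAll(() => {
    for (const item of MENU_ITEMS) chromeApi.contextMenus.create(item);
  });
}

// Module scope, outside any addListener callback: this statement runs
// every time the service worker script is evaluated.
const fakeChrome = makeFakeChrome();
ensureContextMenu(fakeChrome);
console.log(fakeChrome.items.size);
```

The fake API here is synchronous, so it only illustrates the remove-then-create ordering. It does not model overlapping asynchronous calls, which the report handles separately with a lock.
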
### M-6 (queue) — recordRpc read-modify-write race loses log entries
**Re-ranked from chunk S-1 to queue M-6 (diagnostic loss, not user data loss).**
- **File:** `D:\DEV\Extensions\Production\rclone-jav\background.js:155-169` (recordRpc), callsites at `:143`, `:318`, `:330`, `:343`, `:359`
- **Symptom:** When the native port disconnects with multiple inflight requests, the rolling RPC log loses entries because all pending rejects + the disconnect marker call `recordRpc` concurrently and each does non-atomic get-then-set on the same storage key.
- **Why it's a bug:** `recordRpc` is `async` but callers invoke it fire-and-forget. When `port.onDisconnect` rejects every pending entry in the same tick, each reject wrapper calls `recordRpc` concurrently. All read the same `old` array, all set `[newEntry, ...old]`, and the last set wins. Chrome storage.local has no atomicity guarantee.
- **Reproduction:**
1. Native port disconnects while 3+ requests are inflight (host killed by AV during Check Library batch)
2. Expected: all 3+ rejected requests + `__port_disconnect__` marker land in `chrome.storage.local[NATIVE_LOG_KEY]`
3. Actual: only one entry persists; the others silently disappear. Diagnostics → Native messaging log shows a misleading picture exactly when the user is investigating an outage.
- **Suggested fix sketch:** wrap recordRpc body in `let _rpcLogLock = Promise.resolve(); _rpcLogLock = _rpcLogLock.then(async () => { ... })` chain. Same pattern user already applied to `_rcjavTrace` in tabvault.
- **Verifier verdict:** CONFIRMED — high confidence
- **Contract refs verifier read:** Chrome storage.local API (no atomicity)
- **Mirror check needed in:** options.js settings save flow, options-library-issues.js cache writes, activity log buffer, tabvault caller log (out-of-scope)
- **Status:** fixed
- **Fix:** `D:\DEV\Extensions\Production\rclone-jav\background.js:155-180` — wrapped recordRpc body in promise-chain lock (`_rpcLogLock = _rpcLogLock.then(async () => { ... })`). Read-modify-write on `chrome.storage.local[NATIVE_LOG_KEY]` now serializes — concurrent callers chain instead of racing. Pattern mirrors tabvault `_rcjavTrace` lock and the M-2-follow-up ensureContextMenu lock for the same storage race class. `maybeNotifyHostError(entry)` still runs OUTSIDE the lock (its own rate-limit storage race is tracked separately as L-1 in the queue; not fixed here per one-bug-per-session rule). Manifest bumped 0.1.41 → 0.1.42. JS syntax verified via `node --check`. Lock mechanics smoke-tested in isolation with simulated chrome.storage.local (5 ms artificial latency on get/set, 5 concurrent writes): UNLOCKED variant stored only 1 of 5 entries (race confirmed); LOCKED variant stored all 5 entries in correct newest-first order. Mirror checks for options.js / options-library-issues.js storage writes deferred to Phase 3 final verification per audit plan.
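
The promise-chain lock from the M-6 fix, and the race it removes, can be reproduced in isolation roughly as follows. This is a sketch under assumptions, not the shipped code: `makeFakeStorage` is a hypothetical in-memory stand-in for `chrome.storage.local` with 5 ms latency, the value of `NATIVE_LOG_KEY` is invented, and the factory functions exist only so both variants can be compared side by side.

```javascript
// Hypothetical in-memory stand-in for chrome.storage.local with latency.
function makeFakeStorage(latencyMs = 5) {
  const data = {};
  const delay = () => new Promise((resolve) => setTimeout(resolve, latencyMs));
  return {
    async get(key) {
      await delay();
      return { [key]: data[key] };
    },
    async set(obj) {
      await delay();
      Object.assign(data, obj);
    },
  };
}

const NATIVE_LOG_KEY = "nativeLog"; // hypothetical value; the real key string is not shown

// Unlocked: concurrent callers all read the same `old` array, so the last set wins.
function makeUnlockedRecorder(storage) {
  return async function recordRpc(entry) {
    const got = await storage.get(NATIVE_LOG_KEY);
    const old = got[NATIVE_LOG_KEY] || [];
    await storage.set({ [NATIVE_LOG_KEY]: [entry, ...old] });
  };
}

// Locked: each call is appended to a promise chain, so the next
// read-modify-write starts only after the previous one has settled.
function makeLockedRecorder(storage) {
  let rpcLogLock = Promise.resolve();
  return function recordRpc(entry) {
    const run = rpcLogLock.then(async () => {
      const got = await storage.get(NATIVE_LOG_KEY);
      const old = got[NATIVE_LOG_KEY] || [];
      await storage.set({ [NATIVE_LOG_KEY]: [entry, ...old] });
    });
    rpcLogLock = run.catch(() => {}); // one failed write must not break the chain
    return run;
  };
}

// Fire `n` writes in the same tick and return the ids that were stored.
async function storedIds(makeRecorder, n = 5) {
  const storage = makeFakeStorage();
  const recordRpc = makeRecorder(storage);
  await Promise.all(Array.from({ length: n }, (_, i) => recordRpc({ id: i })));
  const got = await storage.get(NATIVE_LOG_KEY);
  return got[NATIVE_LOG_KEY].map((entry) => entry.id);
}

storedIds(makeUnlockedRecorder).then((ids) => console.log("unlocked kept", ids.length));
storedIds(makeLockedRecorder).then((ids) => console.log("locked kept", ids.length));
```

Reassigning the lock to `run.catch(() => {})` is a design choice of this sketch: the caller still sees a rejection from `run`, but later writes are not blocked by it.
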
---
## Light (L)
Definition: confusing UI · cosmetic stale state · diagnostic annoyance · non-blocking alert issue · two-click recoverable.
### L-1 (queue) — maybeNotifyHostError rate-limit get-then-set race
**Re-ranked from chunk M-1 to queue L-1.**
- **File:** `D:\DEV\Extensions\Production\rclone-jav\background.js:188-193`, callsites via `recordRpc` at `:173`
- **Symptom:** During a host outage burst (port disconnects with 2+ inflight requests), 2-3 Discord/notification alerts can fire within a single 10-minute rate-limit window because the get-then-set on `HOST_ALERT_KEY` is non-atomic.
- **Why it's a bug (demoted from M to L):** Same race pattern as M-6, but the impact is over-notification not data loss. User receives extra alerts during one outage event — annoying but informative. Self-corrects after 10-min window. Not blocking. Not stuck workflow.
- **Reproduction:**
1. Port disconnects with 3 inflight requests
2. Expected: 1 alert per 10-min window
3. Actual: 3 alerts for the same incident
- **Suggested fix sketch:** wrap get-then-set in Promise lock (same as M-6 fix; can share the lock)
- **Verifier verdict:** CONFIRMED — high confidence
- **Mirror check needed in:** same as M-6
- **Status:** fixed
- **Fix:** `D:\DEV\Extensions\Production\rclone-jav\background.js:191-247` — added dedicated `_hostAlertLock` Promise-chain (NOT shared with `_rpcLogLock` per codex's note — different storage key, different invariant). Entire maybeNotifyHostError body now runs inside the lock: rate-limit read/check/write of `HOST_ALERT_KEY`, plus the notification create and Discord post that follow. Concurrent calls in the same tick (5+ pending requests rejected on onDisconnect) now properly chain — first caller writes the new lastTs, subsequent callers see the fresh ts and bail at the check. Manifest bumped 0.1.42 → 0.1.43. JS syntax verified via `node --check`. Lock + rate-limit smoke-tested in isolation with simulated chrome.storage.local (5ms latency): UNLOCKED → 5 of 5 concurrent calls fire alerts (bug confirmed); LOCKED → 1 of 5 concurrent calls fires (correct); LOCKED + 5 sequential within rate-limit window → 1 alert (rate-limit still enforced after the lock change).
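
The effect of running the whole rate-limit body inside a dedicated lock can be sketched like this. It is an illustration under assumptions: `makeFakeStorage` is a hypothetical stand-in for `chrome.storage.local`, the value of `HOST_ALERT_KEY` is invented, and `sendAlert` is a placeholder for the notification create and Discord post.

```javascript
const HOST_ALERT_KEY = "hostAlertLastTs"; // hypothetical value; the real key string is not shown
const RATE_LIMIT_MS = 10 * 60 * 1000; // 10-minute window, as stated in the report

// Hypothetical in-memory stand-in for chrome.storage.local with latency.
function makeFakeStorage(latencyMs = 5) {
  const data = {};
  const delay = () => new Promise((resolve) => setTimeout(resolve, latencyMs));
  return {
    async get(key) {
      await delay();
      return { [key]: data[key] };
    },
    async set(obj) {
      await delay();
      Object.assign(data, obj);
    },
  };
}

// Rate-limit read, check, timestamp write, and alert delivery as one unit.
async function notifyOnce(storage, sendAlert, now) {
  const got = await storage.get(HOST_ALERT_KEY);
  const lastTs = got[HOST_ALERT_KEY] || 0;
  if (now - lastTs < RATE_LIMIT_MS) return false; // still inside the window
  await storage.set({ [HOST_ALERT_KEY]: now });
  await sendAlert();
  return true;
}

function makeNotifier(storage, sendAlert, locked) {
  let hostAlertLock = Promise.resolve();
  return function maybeNotifyHostError(now = Date.now()) {
    if (!locked) return notifyOnce(storage, sendAlert, now);
    const run = hostAlertLock.then(() => notifyOnce(storage, sendAlert, now));
    hostAlertLock = run.catch(() => {}); // a failed alert must not block later ones
    return run;
  };
}

// Fire `calls` notifications in the same tick and count delivered alerts.
async function countAlerts(locked, calls = 5) {
  const storage = makeFakeStorage();
  let alerts = 0;
  const notify = makeNotifier(storage, async () => { alerts += 1; }, locked);
  const now = Date.now();
  await Promise.all(Array.from({ length: calls }, () => notify(now)));
  return alerts;
}

countAlerts(false).then((n) => console.log("unlocked alerts:", n));
countAlerts(true).then((n) => console.log("locked alerts:", n));
```

In the locked variant the first caller writes the timestamp before the second caller reads it, so every later caller in the burst bails at the window check.
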
### L-2 (queue, was chunk L-1) — Discord post failures have no passive UI surface
- **File:** `D:\DEV\Extensions\Production\rclone-jav\background.js:230-273` (postDiscordAlert), status write at `:265-271`
- **Symptom:** Discord webhook failures are persisted to `chrome.storage.local.lastDiscordSend` but only visible by clicking Test buttons — no passive page-load display.
- **Why it's a bug (originally L):** Diagnostic data not lost, just not surfaced passively. UX visibility gap.
- **Suggested fix sketch:** on Setup pane render, read `lastDiscordSend` and show `Last alert: <ts> · ok|FAILED <reason>`
- **Verifier verdict:** PARTIAL — symptom real, original "silent failure" framing wrong
- **Status:** open
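
One possible shape for the status line in the L-2 fix sketch is shown below. The record layout `{ ts, ok, reason }` is an assumption made for this example, since the report does not show what `lastDiscordSend` actually contains, and `formatLastDiscordSend` is a hypothetical helper name.

```javascript
// Assumed record shape: { ts: number (epoch ms), ok: boolean, reason?: string }.
// The real structure of lastDiscordSend is not shown in the report.
function formatLastDiscordSend(last) {
  if (!last) return "Last alert: none recorded";
  const when = new Date(last.ts).toISOString();
  const outcome = last.ok ? "ok" : `FAILED ${last.reason || "unknown reason"}`;
  return `Last alert: ${when} · ${outcome}`;
}

console.log(formatLastDiscordSend({ ts: 0, ok: false, reason: "HTTP 404" }));
// Last alert: 1970-01-01T00:00:00.000Z · FAILED HTTP 404
```

On Setup pane render, the stored value would be read from `chrome.storage.local` and passed to a formatter like this one.
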
---
## Needs Input (N)
(none)
---
## False Positives (discarded)
- `background.js:90, 100-114, 120-148, 307-365`: flagged as Severe "pending Map orphaned on SW eviction mid-call". REFUTED via Chrome `connectNative` keepalive contract: an open port keeps the MV3 SW alive, and if the port closes, `onDisconnect` rejects all pending entries (line 139), so none are orphaned. `pulseKeepalive` is defensive redundancy. Caveat: if Brave is observed to diverge from this contract, this becomes a Brave-specific bug; that has not been verified.
- `background.js:62-76` (mergeSettings) — flagged as Moderate. REFUTED. Auditor misread `Object.assign({}, dv, sv)` — defaults go FIRST so missing keys fill from defaults.
- `background.js:895-905` (contextMenu tab.id null) — flagged as Moderate. REFUTED via Chrome contextMenus contract: registered contexts guarantee non-null tab.id. `extractIdFromTab` also has defensive null check.
- `content.js` (escapeOverlay undefined) — flagged as Moderate. REFUTED. Function IS defined at content.js:451. Auditor missed it.
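
The argument-order point behind the refuted mergeSettings finding can be checked with a few lines. The keys and values here are hypothetical; only the `Object.assign({}, dv, sv)` call shape comes from the report.

```javascript
// Hypothetical defaults (dv) and stored values (sv); the real keys are not shown in the report.
const dv = { theme: "dark", pageSize: 50 };
const sv = { pageSize: 100 };

// Sources are applied left to right: sv overrides dv where both define a key,
// and keys missing from sv keep their default. The empty target means dv is not mutated.
const merged = Object.assign({}, dv, sv);
console.log(merged); // { theme: 'dark', pageSize: 100 }
```
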