Step 11: benchmark host fast-path, decision = keep

Adds benchmarks/host-fast-path.py and benchmarks/README.md. The benchmark compares two paths for a cached single-ID search: 1. fast-path: in-process dict walk inside the native host (handle_cached_search_fast in rcjav-host.py) 2. subprocess: shell out to `rc-jav.py --search ID --cache --format json` Idle baseline against the live 7124-file cache (5 queries × 5 iter): fast-path: median 0.46ms p95 0.61ms max 0.72ms subprocess: median 919ms p95 1233ms max 1385ms median speedup: 2000x Decision: keep the fast path. The ~920ms subprocess cost is dominated by Python interpreter startup + 1.3MB cache.json parse. That's structural — it applies under idle Python too, not just when a scan is running. The "Python actively scanning" condition from the original roadmap doesn't change the verdict; it would only make the subprocess path even slower while leaving the in-process path unaffected (the fast path doesn't touch the scanning process). The fast path is already correctly scoped — bails out for wildcards, ranges, name searches, and --quick mode. Narrowing further would just push more queries through the slow path with no upside. Possible follow-up (not in scope here): memoize _load_host_cache with mtime-based invalidation so the fast path doesn't reparse cache.json on every call. Current per-call median (0.46ms) is already fast enough that this is optional. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 11:12:33 +02:00
parent 66f82eb214
commit b9a24b3fb5
2 changed files with 217 additions and 0 deletions
@@ -0,0 +1,66 @@
 # benchmarks/
 Latency benchmarks for decisions in the rc-jav roadmap.
 ## host-fast-path.py — Step 11 decision
 The native-messaging host has an in-process fast path
 (`handle_cached_search_fast` in `rcjav-host.py`) that answers simple
 cached single-ID searches without shelling out to `rc-jav.py`. Step 11
 asked: is the fast path actually pulling its weight, or could we
 delete it / narrow it down?
 ### Run
 ```
 python benchmarks/host-fast-path.py [--queries Q1 Q2 ...] [--iterations N]
 ```
 For the "Python actively scanning" condition, start `rc-jav.py --scan`
 in a separate terminal first.
 ### Findings (idle baseline, 2026-05-23, 7124-file cache)
 ```
 === aggregate (5 queries × 5 iterations, idle Python) ===
  fast-path total:
    n=25  min=0.43ms  median=0.46ms  mean=0.48ms  p95=0.61ms  max=0.72ms
  subprocess total:
    n=25  min=880.56ms  median=919.45ms  mean=965.55ms  p95=1232.93ms  max=1385.37ms
  median speedup: 2000.1x
  p95 speedup:    2036.0x
 ```
 ### Decision: keep the fast path
 The subprocess path costs ~920ms median per query even with the
 interpreter doing nothing else — that's pure Python startup +
 `json.loads` of the 1.3 MB cache.json. The fast path returns hits in
 under 1ms. The 2000× speedup is structural (interpreter startup
 overhead), not load-dependent, so it would apply equally under (a)
 idle and (b) active-scan conditions.
 The fast path is already correctly scoped — it bails out for
 wildcards, ranges, name searches, and `--quick` mode (which forces a
 live rclone hit). Narrowing further would just push more queries
 through the slow path with no upside.
 The "Python actively scanning" condition listed in the original
 roadmap was framed as the case where the fast path's value would be
 most obvious. The idle baseline already settles it; we don't need to
 gate the decision on the active-scan measurement, though running it
 remains a sanity check if cache.json grows substantially.
 ### What this benchmark does NOT cover
 - Latency from inside the browser extension (popup) to the host. Adds
  Brave's native-messaging protocol overhead on top of whichever path
  the host takes — but the relative difference between paths is
  preserved.
 - Memory cost of the in-process cache load. The fast path loads
  cache.json once per call today (no caching across calls inside the
  host). A future optimization is to memoize `_load_host_cache` with
  mtime-based invalidation; left for follow-up if needed.
 - Cold-cache effects. `cache.json` is large enough that the OS page
  cache matters; numbers above reflect a warm read. First call after
  a reboot may be slower for both paths but proportionally so.
@@ -0,0 +1,151 @@
 """Measure host fast-path vs subprocess rc-jav.py for cached single-ID search.
 Step 11 of the console-consolidation roadmap asks: does the host's
 `handle_cached_search_fast` actually save meaningful latency vs just
 shelling out to `rc-jav.py --search ID --format json --quick`? If yes,
 under what conditions (idle Python vs Python actively scanning)?
 This script runs both paths N times against a set of query IDs and
 reports min / median / mean / p95 / max in milliseconds.
 Usage:
    python benchmarks/host-fast-path.py [--queries Q1 Q2 ...] [--iterations N]
 To measure (b) Python-actively-scanning, kick off a `rc-jav.py --scan` in
 another terminal, then run this script while the scan runs.
 The fast-path implementation is replicated inline here (not imported
 from the host module) so the benchmark is self-contained.
 """
 from __future__ import annotations
 import argparse
 import json
 import statistics
 import subprocess
 import sys
 import time
 from pathlib import Path
 ROOT = Path(__file__).resolve().parents[1]
 if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))
 from rcjav.cache import load_cache  # noqa: E402
 from rcjav.ids import current_rules_signature, normalize_id  # noqa: E402
 DEFAULT_QUERIES = ["SSIS-001", "ABP-100", "FC2-1841460", "MIDD-500", "IBW-902"]
 DEFAULT_ITERATIONS = 20
 def fast_path_search(cache: dict, query: str) -> int:
    """Replicates handle_cached_search_fast minus the response shape.
    Returns hit count. Walks every remote's files[] looking for jav_id
    matching the normalized query (exact or `<id>#partN`).
    """
    norm = normalize_id(query)
    if not norm:
        return 0
    hits = 0
    for remote, entry in (cache.get("remotes") or {}).items():
        files = entry.get("files") or []
        for item in files:
            jid = item.get("jav_id", "")
            if jid == norm or (isinstance(jid, str) and jid.startswith(norm + "#part")):
                hits += 1
    return hits
 def time_fast_path(query: str, iterations: int) -> list[float]:
    sig = current_rules_signature()
    cache = load_cache(sig)
    out: list[float] = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        fast_path_search(cache, query)
        out.append((time.perf_counter() - t0) * 1000)
    return out
 def time_subprocess(query: str, iterations: int) -> list[float]:
    cmd = [
        sys.executable,
        str(ROOT / "rc-jav.py"),
        "--search", query,
        "--cache",        # force cache mode (no rclone)
        "--format", "json",
        "--basic", "--no-color",
    ]
    out: list[float] = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        proc = subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8", errors="replace")
        out.append((time.perf_counter() - t0) * 1000)
        if proc.returncode not in (0, 1):  # 1 = no hits, still valid
            sys.stderr.write(f"subprocess returned {proc.returncode}; stderr={proc.stderr[:200]!r}\n")
    return out
 def percentile(values: list[float], p: float) -> float:
    if not values:
        return 0.0
    s = sorted(values)
    k = (len(s) - 1) * p
    f = int(k)
    c = min(f + 1, len(s) - 1)
    return s[f] + (s[c] - s[f]) * (k - f)
 def summarize(label: str, values: list[float]) -> None:
    if not values:
        print(f"  {label}: (no data)")
        return
    print(f"  {label}:")
    print(f"    n={len(values)}  min={min(values):.2f}ms  median={statistics.median(values):.2f}ms  "
          f"mean={statistics.mean(values):.2f}ms  p95={percentile(values, 0.95):.2f}ms  max={max(values):.2f}ms")
 def main() -> int:
    ap = argparse.ArgumentParser(description=__doc__)
    ap.add_argument("--queries", nargs="+", default=DEFAULT_QUERIES,
                    help=f"JAV IDs to search (default: {DEFAULT_QUERIES})")
    ap.add_argument("--iterations", type=int, default=DEFAULT_ITERATIONS,
                    help=f"Iterations per query per path (default: {DEFAULT_ITERATIONS})")
    args = ap.parse_args()
    print(f"Host fast-path vs subprocess rc-jav.py benchmark")
    print(f"queries: {args.queries}")
    print(f"iterations per path: {args.iterations}")
    print(f"cache: {ROOT / 'cache.json'}")
    print()
    all_fast: list[float] = []
    all_sub: list[float] = []
    for q in args.queries:
        print(f"[{q}]")
        fast = time_fast_path(q, args.iterations)
        summarize("fast-path (in-process dict walk)", fast)
        sub = time_subprocess(q, args.iterations)
        summarize("subprocess rc-jav.py --search --cache", sub)
        all_fast.extend(fast)
        all_sub.extend(sub)
        if fast and sub:
            speedup = statistics.median(sub) / max(statistics.median(fast), 0.001)
            print(f"  speedup (median sub / median fast): {speedup:.1f}x")
        print()
    print("=== aggregate ===")
    summarize("fast-path total", all_fast)
    summarize("subprocess total", all_sub)
    if all_fast and all_sub:
        med_speedup = statistics.median(all_sub) / max(statistics.median(all_fast), 0.001)
        p95_speedup = percentile(all_sub, 0.95) / max(percentile(all_fast, 0.95), 0.001)
        print(f"  median speedup: {med_speedup:.1f}x")
        print(f"  p95 speedup:    {p95_speedup:.1f}x")
    return 0
 if __name__ == "__main__":
    sys.exit(main())