Step 10d: extract dupes/keep-ranking into rcjav/dupes.py

Pulls the duplicate-detection and keep-ranking surface out of
rc-jav.py:

  DEFAULT_KEEP_RANKING
  _KEEP_RANKING (module global)
  decide_keep_with_reason
  decide_keep
  find_dupes
  _SUSPICIOUS_MULTIPART_TAIL_RE
  describe_dupe_risks
  find_variant_alerts

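Same mutable-rebound pattern as PART_RES: `_KEEP_RANKING` is now
rebound through `set_keep_ranking(dict)` instead of a `global` write
in rc-jav.py's main(). The name is read only inside rcjav/dupes.py,
the module that owns the binding. A `from ... import _KEEP_RANKING`
elsewhere would copy the binding at import time and go stale after
the next rebind, so callers never hold such a snapshot.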
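
A minimal sketch of the rebound pattern described above, not the
actual contents of rcjav/dupes.py. The ranking keys, the `rank_score`
helper, and the exact signatures of `set_keep_ranking` and
`get_keep_ranking` are assumptions made for illustration. It shows why
a copied binding goes stale while reads inside the owning module do
not:

```python
# Hypothetical stand-in for the owning module; keys are made up.
DEFAULT_KEEP_RANKING = {"resolution": 3, "size": 2}
_KEEP_RANKING = dict(DEFAULT_KEEP_RANKING)


def set_keep_ranking(ranking):
    """Rebind the module-level ranking to a copy of `ranking`."""
    global _KEEP_RANKING
    _KEEP_RANKING = dict(ranking)


def get_keep_ranking():
    """Return whatever the module-level name is bound to right now."""
    return _KEEP_RANKING


def rank_score(attrs):
    """Hypothetical reader: looks the global up at call time."""
    return sum(_KEEP_RANKING.get(a, 0) for a in attrs)


# Equivalent to what `from module import _KEEP_RANKING` hands a caller:
# a second name bound to the object that existed at import time.
snapshot = _KEEP_RANKING

set_keep_ranking({"resolution": 10})

print(snapshot)            # still the old dict
print(get_keep_ranking())  # the new dict
print(rank_score(["resolution", "size"]))
```

After the rebind, `snapshot` still refers to the old dict, while
`get_keep_ranking()` and `rank_score()` see the new one, which is the
reason for routing configuration through a setter.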

rc-jav.py: 1972 → 1763 lines (209 extracted).
rcjav/dupes.py: 244 lines.

Verified:
  - python rc-jav.py --help              → ok
  - python fixtures/run.py               → 17/17 cases pass
  - python -m unittest tests.test_rules  → 5/5 OK

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
admin
2026-05-22 21:49:14 +02:00
parent f03d032336
commit 8d636ec633
3 changed files with 284 additions and 220 deletions
@@ -6,6 +6,16 @@ find at the top level. Adding a new submodule does not change the
public surface — only this file does.
"""
 from rcjav.model import FileEntry  # noqa: F401
+from rcjav.dupes import (  # noqa: F401
+    DEFAULT_KEEP_RANKING,
+    set_keep_ranking,
+    get_keep_ranking,
+    decide_keep_with_reason,
+    decide_keep,
+    find_dupes,
+    describe_dupe_risks,
+    find_variant_alerts,
+)
 from rcjav.cache import (  # noqa: F401
     CACHE_PATH,
     CACHE_VERSION,