Step 10a + 10b: scaffold rcjav/ package, extract ID rules

Carves the first slice out of the monolithic rc-jav.py (now 2017
lines, was 2230). Two new modules:

  rcjav/model.py    FileEntry dataclass — the one shared shape that
                    every other submodule will need.
  rcjav/ids.py      Single source of truth for everything that
                    influences a FileEntry.jav_id: PRIMARY_ID_RE,
                    FALLBACK_ID_RE, COMPOUND_ID_RE, BUILTIN_PART_RES,
                    configure_part_patterns, detect_part,
                    detect_part_from_stem, part_key, extract_id,
                    normalize_id, describe_id_match, expand_range,
                    plus the supporting "private" regexes
                    (_BRACKET_ID_RE, _RESOLUTION_TAG_RE, etc.) that
                    other code in rc-jav.py still reads.

rcjav/__init__.py re-exports the public surface so future external
consumers can `from rcjav import extract_id` without caring which
submodule it lives in.

rc-jav.py drops the inline ID block and pulls everything from
rcjav.ids via a single import statement. PART_RES is intentionally
NOT imported: configure_part_patterns rebinds it at runtime, so a name
captured by a top-level `from` import would go stale. A small helper
`_current_part_res()` reads it dynamically via `_rcjav_ids.PART_RES`.
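
Sketch of the stale-reference problem described above. It assumes
configure_part_patterns rebinds the module-level name rather than
mutating the list in place; the real rcjav/ids.py is not reproduced
here, so `ids`, its contents, and `current_part_res` are hypothetical
stand-ins.

```python
import types

# Hypothetical stand-in for rcjav/ids.py (not the real module).
ids = types.ModuleType("ids_demo")
ids.PART_RES = ["builtin"]


def configure_part_patterns(extra):
    # Assumption: the real function assigns a new object to the
    # module-level name instead of mutating the existing list.
    ids.PART_RES = ids.PART_RES + list(extra)


ids.configure_part_patterns = configure_part_patterns

# Same effect as `from ids import PART_RES`: copies the current binding.
captured = ids.PART_RES


def current_part_res():
    # Attribute lookup happens at call time, so it follows any rebinding.
    return ids.PART_RES


ids.configure_part_patterns(["custom"])

print(captured)            # ['builtin']
print(current_part_res())  # ['builtin', 'custom']
```

If the list were only mutated in place, the captured name would stay
current; the dynamic lookup is the safer choice either way.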

fixtures/run.py fix: synthesized importlib module name changed from
"rcjav" (which now collides with the real package directory) to
"rcjav_script". Also prepends ROOT to sys.path so rc-jav.py's
`from rcjav.model import …` resolves when run as
`python fixtures/run.py`.
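
Sketch of the loading pattern, not the actual fixtures/run.py code
(which is not shown in this commit view). It builds a throwaway
directory with a hyphenated script next to an rcjav package; the
MARKER constant and the temp layout are made up for the demonstration.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Throwaway layout: a hyphenated script beside a package named rcjav.
root = Path(tempfile.mkdtemp())
(root / "rcjav").mkdir()
(root / "rcjav" / "__init__.py").write_text("")
(root / "rcjav" / "model.py").write_text("MARKER = 'real package'\n")
(root / "rc-jav.py").write_text("from rcjav.model import MARKER\n")

# The script's own `from rcjav.model import ...` needs the project
# root on sys.path to resolve.
sys.path.insert(0, str(root))

# Load the script under a synthesized name that differs from the
# package name, so the two cannot shadow each other in sys.modules.
spec = importlib.util.spec_from_file_location(
    "rcjav_script", str(root / "rc-jav.py")
)
script = importlib.util.module_from_spec(spec)
sys.modules["rcjav_script"] = script
spec.loader.exec_module(script)

print(script.__name__)  # rcjav_script
print(script.MARKER)    # real package
```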

Verified:
  - python rc-jav.py --help              → usage banner prints
  - python fixtures/run.py               → 17/17 cases pass
  - python -m unittest tests.test_rules  → 5/5 OK

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
admin
2026-05-22 21:43:57 +02:00
parent e029e898e9
commit ba57b7fd21
5 changed files with 327 additions and 240 deletions
+24
@@ -0,0 +1,24 @@
"""Shared data shapes used by multiple submodules.

Kept tiny on purpose: only types whose definition is depended on
across module boundaries belong here. Behavior (find_dupes, decide_keep,
extract_id, etc.) lives in the module that owns it.
"""
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FileEntry:
    source: str    # "Source" (priority) or "Target"
    remote: str    # the rclone remote:path root supplied
    path: str      # relative path within remote
    size: int
    mod_time: str
    jav_id: str    # normalized, e.g. "SSIS-1"

    @property
    def full_path(self) -> str:
        sep = "" if self.remote.endswith("/") or not self.path else "/"
        return f"{self.remote}{sep}{self.path}"
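
Usage sketch for `full_path`, showing the three cases the separator
logic covers. The dataclass is repeated so the snippet runs on its own;
the `make` helper, the remote name, and all field values are
illustrative and do not come from the repository.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    # Copy of the dataclass from the diff, repeated for a standalone run.
    source: str
    remote: str
    path: str
    size: int
    mod_time: str
    jav_id: str

    @property
    def full_path(self) -> str:
        sep = "" if self.remote.endswith("/") or not self.path else "/"
        return f"{self.remote}{sep}{self.path}"


def make(remote: str, path: str) -> FileEntry:
    # Hypothetical helper; every value passed here is a placeholder.
    return FileEntry("Source", remote, path, 0, "2026-01-01T00:00:00Z", "SSIS-1")


plain = make("gdrive:jav", "SSIS-001.mp4").full_path
trailing = make("gdrive:jav/", "SSIS-001.mp4").full_path
root_only = make("gdrive:jav", "").full_path

print(plain)      # gdrive:jav/SSIS-001.mp4
print(trailing)   # gdrive:jav/SSIS-001.mp4 (no doubled slash)
print(root_only)  # gdrive:jav (empty path adds no separator)
```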