c72d19599a
Implements the §6.2 enrichment pipeline: embedded tags → Chromaprint
fingerprint → AcoustID lookup. Well-tagged files get correct
artist/album/title offline; the rest are identified via AcoustID
(which also yields a MusicBrainz recording id in one call).
- domain: AudioTags/Fingerprint/RecordingMatch value objects; ports
AudioTagReader, AudioFingerprinter, AcoustIdClient; TrackRepository
.apply_enrichment (gap-fill, never erases) + AlbumRepository.get_or_create
- infrastructure/metadata: MutagenTagReader, FpcalcFingerprinter,
AcoustIdHttpClient (rich meta=recordings+releasegroups, throttled)
- application: MetadataEnrichmentService — tags preferred, AcoustID fills
gaps; resolves artist/album; status enriched/failed; skips manual;
every external step wrapped (graceful degradation)
- workers: enrich_task registered; enqueue_enrich is best-effort and
deferred so the caller's txn commits before the worker reads the row
- wiring: upload enqueues after add; import returns imported_ids and
enqueues post-commit (mid-scan would race the worker); manual
POST /tracks/{id}/metadata/enrich endpoint
- deps: add mutagen (fpcalc/ffmpeg already in the image)
Tests: metadata service orchestration, AcoustID parser, tag helpers.
125 passed; mypy strict + ruff clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
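The "gap-fill, never erases" rule described for `TrackRepository.apply_enrichment` can be sketched as below. This is a minimal illustration under stated assumptions, not the repository code: the function name `gap_fill`, the dict-based row, and the sample values are all hypothetical stand-ins.

```python
from typing import Any


def gap_fill(current: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``current`` with empty fields filled from ``incoming``.

    Hypothetical helper: a field counts as empty when it is missing, ``None``
    or an empty string. Existing values are never overwritten, and a ``None``
    in ``incoming`` never erases anything.
    """
    merged = dict(current)
    for key, value in incoming.items():
        if value is None:
            continue  # the enrichment step has nothing to offer for this field
        if merged.get(key) in (None, ""):
            merged[key] = value  # fill the gap
    return merged


# Made-up sample data: the title is already set, artist and album are empty.
row = {"title": "Intro", "artist": None, "album": ""}
update = {"title": "Intro (Remastered)", "artist": "Some Artist", "album": None}
result = gap_fill(row, update)
```

Here the existing title survives, the missing artist is filled in, and the `None` album in the update leaves the row's album untouched.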
54 lines
1.6 KiB
Python
"""Value objects for the metadata-enrichment pipeline (plan §6.2).

Pure data carriers between the enrichment service and its adapters (tag reader,
fingerprinter, AcoustID). No framework imports: these cross the domain boundary.
"""

from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class AudioTags:
    """Embedded tags read from the file itself (ID3 / Vorbis / MP4 …).

    Every field is optional because files are tagged inconsistently. The reader
    fills what it can and leaves the rest ``None`` for downstream identification.
    """

    title: str | None = None
    artist: str | None = None
    album: str | None = None
    album_artist: str | None = None
    genre: str | None = None
    year: int | None = None
    track_number: int | None = None
    duration_seconds: int | None = None
    bitrate: int | None = None


@dataclass(frozen=True, slots=True)
class Fingerprint:
    """Chromaprint fingerprint plus the decoded duration (both needed by AcoustID)."""

    fingerprint: str
    duration_seconds: int


@dataclass(frozen=True, slots=True)
class RecordingMatch:
    """A single AcoustID result, flattened to the fields enrichment cares about.

    ``acoustid`` is the stable AcoustID identifier (a UUID), used as the
    dedup key persisted on ``track.acoustid_fingerprint`` (fits the 64-char
    column; the raw fingerprint does not). ``recording_mbid`` is the MusicBrainz
    recording id when present.
    """

    acoustid: str
    score: float
    recording_mbid: str | None = None
    title: str | None = None
    artist: str | None = None
    album: str | None = None
    year: int | None = None