feat(enrichment): tag-first metadata pipeline (§1D)

Implements the §6.2 enrichment pipeline: embedded tags → Chromaprint
fingerprint → AcoustID lookup. Well-tagged files get correct
artist/album/title offline; the rest are identified via AcoustID
(which also yields a MusicBrainz recording id in one call).

- domain: AudioTags/Fingerprint/RecordingMatch value objects; ports
  AudioTagReader, AudioFingerprinter, AcoustIdClient;
  TrackRepository.apply_enrichment (gap-fill, never erases) +
  AlbumRepository.get_or_create
- infrastructure/metadata: MutagenTagReader, FpcalcFingerprinter,
  AcoustIdHttpClient (rich meta=recordings+releasegroups, throttled)
- application: MetadataEnrichmentService: tags preferred, AcoustID fills
  gaps; resolves artist/album; status enriched/failed; skips manual;
  every external step wrapped (graceful degradation)
- workers: enrich_task registered; enqueue_enrich is best-effort and
  deferred so the caller's txn commits before the worker reads the row
- wiring: upload enqueues after add; import returns imported_ids and
  enqueues post-commit (mid-scan would race the worker); manual
  POST /tracks/{id}/metadata/enrich endpoint
- deps: add mutagen (fpcalc/ffmpeg already in the image)

Tests: metadata service orchestration, AcoustID parser, tag helpers.
125 passed; mypy strict + ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
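
The "tags preferred, AcoustID fills gaps" rule and the "every external step wrapped" rule from the message can be sketched as below. This is a minimal illustration, not the service in this commit: the field layout of AudioTags and RecordingMatch and the helper names safely and resolve are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class AudioTags:  # hypothetical shape, not the commit's value object
    title: Optional[str] = None
    artist: Optional[str] = None
    album: Optional[str] = None


@dataclass(frozen=True)
class RecordingMatch:  # hypothetical shape, not the commit's value object
    title: Optional[str] = None
    artist: Optional[str] = None
    album: Optional[str] = None
    musicbrainz_id: Optional[str] = None


def safely(step: Callable[[], T]) -> Optional[T]:
    """Run one external step; a failure yields None instead of aborting the run."""
    try:
        return step()
    except Exception:
        return None


def resolve(
    tags: Optional[AudioTags], match: Optional[RecordingMatch]
) -> dict[str, Optional[str]]:
    """Prefer embedded tags; the AcoustID match fills only what the tags lack."""
    tags = tags or AudioTags()
    match = match or RecordingMatch()
    return {
        "title": tags.title or match.title,
        "artist": tags.artist or match.artist,
        "album": tags.album or match.album,
        "musicbrainz_id": match.musicbrainz_id,
    }


def failing_lookup() -> RecordingMatch:
    # Stand-in for a network call that times out.
    raise TimeoutError("AcoustID unreachable")


tags = AudioTags(title="Song", artist=None, album="LP")
match = RecordingMatch(title="Song (Remaster)", artist="Band", musicbrainz_id="mbid-1")

merged = resolve(tags, match)                      # tags win, match fills artist
offline = resolve(tags, safely(failing_lookup))    # lookup failed, tags still apply
```

In this sketch a failed lookup degrades to tag-only metadata rather than failing the whole run, which is one plausible reading of "graceful degradation" in the message.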
@@ -173,3 +173,47 @@ class SqlAlchemyTrackRepository:
         await self._session.flush()
         await self._session.refresh(row)
         return _to_entity(row)
+
+    async def apply_enrichment(
+        self,
+        track_id: uuid.UUID,
+        *,
+        title: str,
+        artist_id: uuid.UUID,
+        album_id: uuid.UUID | None,
+        genre: str | None,
+        year: int | None,
+        track_number: int | None,
+        duration_seconds: int | None,
+        bitrate: int | None,
+        acoustid_fingerprint: str | None,
+        musicbrainz_id: str | None,
+        metadata_status: str,
+    ) -> Track:
+        row = await self._session.get(TrackModel, track_id)
+        if row is None:
+            raise NotFoundError(f"Track {track_id} not found.")
+        # Identity + status are authoritative for an enrichment run.
+        row.title = title
+        row.artist_id = artist_id
+        row.metadata_status = metadata_status
+        # Nullable extras: fill gaps only — never erase data a prior run found.
+        if album_id is not None:
+            row.album_id = album_id
+        if genre is not None:
+            row.genre = genre
+        if year is not None:
+            row.year = year
+        if track_number is not None:
+            row.track_number = track_number
+        if duration_seconds is not None:
+            row.duration_seconds = duration_seconds
+        if bitrate is not None:
+            row.bitrate = bitrate
+        if acoustid_fingerprint is not None:
+            row.acoustid_fingerprint = acoustid_fingerprint
+        if musicbrainz_id is not None:
+            row.musicbrainz_id = musicbrainz_id
+        await self._session.flush()
+        await self._session.refresh(row)
+        return _to_entity(row)
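
The update rule in apply_enrichment can be exercised without a database. The sketch below mirrors the same logic on an in-memory object; FakeTrackRow and apply_gap_fill are hypothetical names introduced only for this example, and only a few of the nullable columns are modelled.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeTrackRow:  # hypothetical stand-in for TrackModel
    title: str
    metadata_status: str
    genre: Optional[str] = None
    year: Optional[int] = None
    musicbrainz_id: Optional[str] = None


def apply_gap_fill(
    row: FakeTrackRow, *, title: str, metadata_status: str, **extras: Any
) -> FakeTrackRow:
    # Identity + status are always overwritten, as in the hunk above.
    row.title = title
    row.metadata_status = metadata_status
    # Nullable extras: None means "no new information", so the old value stays.
    for name, value in extras.items():
        if value is not None:
            setattr(row, name, value)
    return row


row = FakeTrackRow(title="untitled", metadata_status="pending", genre="Jazz", year=1959)
apply_gap_fill(
    row,
    title="So What",
    metadata_status="enriched",
    genre=None,               # keeps "Jazz"
    year=1960,                # non-None value replaces 1959
    musicbrainz_id="mbid-1",  # fills a previously empty column
)
```

One consequence worth noting: a None argument never erases a stored value, but a non-None argument does replace one, so a later run can still change a column that an earlier run populated.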