mcma-backend/app/workers/tasks/enrich_task.py
Senko-san 73d7da440f
feat(enrichment): record status/errors and trust high-confidence AcoustID
Two related gaps surfaced from "uploaded a track, nothing changed / no status":

- A track could stay stuck on `pending` forever (an unexpected worker error
  rolled back the run without recording anything), and `failed` carried no
  reason. Add `tracks.metadata_error` + `tracks.enriched_at` (migration), stamp
  the outcome in apply_enrichment, add TrackRepository.mark_enrichment_failed,
  wrap enrich_task to persist crashes as `failed` in a fresh session, and emit a
  human-readable no-match reason. Expose metadata_error/enriched_at in TrackOut.

- The tag-first merge let junk embedded tags (e.g. "Music Track"/"Sound_13958")
  override even a 0.99-confidence AcoustID match. Add acoustid_trust_score
  (default 0.85): above it the acoustic identity wins for title/artist/album/
  year, tags are fallback; below it, tag-first as before.
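
The merge rule in the second item can be sketched roughly as follows. This is a minimal illustration, not the code in `MetadataEnrichmentService`: the `Identity` dataclass, the `merge_identity` helper, and the strict `>` comparison at the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

FIELDS = ("title", "artist", "album", "year")


@dataclass(frozen=True)
class Identity:
    """Hypothetical container for the four identity fields named in the commit."""
    title: Optional[str] = None
    artist: Optional[str] = None
    album: Optional[str] = None
    year: Optional[int] = None


def merge_identity(
    tags: Identity,
    match: Optional[Identity],
    score: Optional[float],
    trust_score: float = 0.85,
) -> Identity:
    """Pick a primary source per the trust threshold, fall back field by field."""
    if match is None or score is None:
        return tags  # no acoustic match: embedded tags are all we have

    # Assumption: "above" is read as strictly greater than the threshold.
    acoustic_first = score > trust_score
    primary, fallback = (match, tags) if acoustic_first else (tags, match)

    merged = {}
    for field in FIELDS:
        value = getattr(primary, field)
        # An empty or missing primary value falls through to the other source.
        merged[field] = value if value not in (None, "") else getattr(fallback, field)
    return Identity(**merged)
```

With junk tags such as `Identity(title="Music Track", artist="Sound_13958")` and a 0.99 match, the acoustic title and artist win; with a 0.5 match the tags win and the match only fills fields the tags leave empty.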

Add a license-free real-file fixture (Scarlet Fire / Otis McDonald) whose junk
tags AcoustID overrides, with an always-on tag-reader test plus fpcalc/AcoustID/
network-gated identity + full-pipeline tests (skip on host, run in the container).
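
One way to express the "always-on vs. gated" split described above is with skip decorators. The sketch below uses the standard library's `unittest` so it is self-contained; the helper names, the `ACOUSTID_API_KEY` environment variable, and the test class names are hypothetical, and the repository's actual tests may be organised differently.

```python
import os
import shutil
import unittest
from typing import Mapping, Optional


def fpcalc_available(path: str = "fpcalc") -> bool:
    """True when the fingerprinter binary can be found on PATH."""
    return shutil.which(path) is not None


def acoustid_configured(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when an API key is present (variable name is an assumption)."""
    env = os.environ if env is None else env
    return bool(env.get("ACOUSTID_API_KEY"))


# Gated tests are skipped on a bare host and run where the tools exist.
requires_identity_stack = unittest.skipUnless(
    fpcalc_available() and acoustid_configured(),
    "needs fpcalc and an AcoustID API key (available in the container)",
)


class TagReaderTest(unittest.TestCase):
    """Always-on: only reads embedded tags from the fixture file."""

    def test_fixture_has_junk_tags(self) -> None:
        pass  # the real assertions on the fixture's tags would go here


@requires_identity_stack
class IdentityTest(unittest.TestCase):
    """Gated: needs fpcalc, the AcoustID service, and network access."""

    def test_acoustid_overrides_junk_tags(self) -> None:
        pass  # the real fingerprint + lookup assertions would go here
```

Evaluating the gate at import time keeps the skip reason visible in the test report instead of hiding the gated tests entirely.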

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 13:29:08 +03:00

77 lines
3.2 KiB
Python

"""arq task: enrich one track's metadata (plan §6.2, §1D).
Wires the §6.2 pipeline adapters to :class:`MetadataEnrichmentService` and runs
it in the worker's own transactional session. Enqueued (deferred) after upload
and after a local-folder import. Idempotent and best-effort — a missing track or
a ``manual`` one is a clean no-op.
"""
import uuid
from typing import Any

from app.application.metadata_service import MetadataEnrichmentService
from app.core.config import get_settings
from app.core.logging import get_logger
from app.infrastructure.db import session_scope
from app.infrastructure.db.repositories import (
    SqlAlchemyAlbumRepository,
    SqlAlchemyArtistRepository,
    SqlAlchemyTrackRepository,
)
from app.infrastructure.metadata.acoustid import AcoustIdHttpClient
from app.infrastructure.metadata.cover_extractor import MutagenCoverExtractor
from app.infrastructure.metadata.coverart import CoverArtArchiveClient
from app.infrastructure.metadata.fingerprint import FpcalcFingerprinter
from app.infrastructure.metadata.tags import MutagenTagReader
from app.infrastructure.storage.provider import get_file_storage

log = get_logger("worker.enrich")


async def enrich_track(_ctx: dict[str, Any], *, track_id: str) -> dict[str, Any]:
    settings = get_settings()
    api_key = settings.acoustid_api_key.get_secret_value() if settings.acoustid_api_key else None
    acoustid = AcoustIdHttpClient(
        api_key=api_key,
        user_agent=settings.musicbrainz_user_agent,
        api_url=settings.acoustid_api_url,
    )
    cover_provider = CoverArtArchiveClient(
        user_agent=settings.musicbrainz_user_agent,
        enabled=settings.coverart_enabled,
        base_url=settings.coverart_base_url,
    )
    tid = uuid.UUID(track_id)

    try:
        async with session_scope() as session:
            service = MetadataEnrichmentService(
                tracks=SqlAlchemyTrackRepository(session),
                artists=SqlAlchemyArtistRepository(session),
                albums=SqlAlchemyAlbumRepository(session),
                storage=get_file_storage(),
                tag_reader=MutagenTagReader(),
                fingerprinter=FpcalcFingerprinter(settings.fpcalc_path),
                acoustid=acoustid,
                cover_extractor=MutagenCoverExtractor(),
                cover_provider=cover_provider,
                acoustid_trust_score=settings.acoustid_trust_score,
            )
            result = await service.enrich(tid)
    except Exception as exc:
        # The run's own transaction rolled back, leaving the track stuck at
        # ``pending``. Record the failure in a fresh session so the UI shows a
        # ``failed`` status with a reason instead of a silent, endless spinner.
        log.exception("enrich_failed", track_id=track_id)
        async with session_scope() as session:
            await SqlAlchemyTrackRepository(session).mark_enrichment_failed(
                tid, error=f"Enrichment crashed: {type(exc).__name__}: {exc}"
            )
        return {"track_id": track_id, "status": "failed", "mbid": None}

    return {
        "track_id": str(result.track_id),
        "status": result.status,
        "mbid": result.matched_mbid,
    }
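
The "record the crash in a fresh session" pattern used by `enrich_track` can be illustrated in isolation. The sketch below is a stand-in under stated assumptions, not the project's code: `fake_session_scope`, `FakeSession`, the plain-dict "database", and the `matched`/`processing` status strings are all hypothetical, and the real `session_scope` is assumed (not confirmed here) to commit on success and roll back on error.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Awaitable, Callable, Dict, Optional, Tuple

Row = Tuple[str, Optional[str]]  # (status, error)


class FakeSession:
    """Buffers writes until the surrounding scope commits them."""

    def __init__(self) -> None:
        self.pending: Dict[str, Row] = {}


@asynccontextmanager
async def fake_session_scope(db: Dict[str, Row]):
    session = FakeSession()
    try:
        yield session
    except BaseException:
        session.pending.clear()  # rollback: nothing reaches the "database"
        raise
    else:
        db.update(session.pending)  # commit


async def enrich(db: Dict[str, Row], track_id: str,
                 work: Callable[[], Awaitable[str]]) -> str:
    try:
        async with fake_session_scope(db) as session:
            session.pending[track_id] = ("processing", None)
            status = await work()
            session.pending[track_id] = (status, None)
    except Exception as exc:
        # The first scope rolled back, so nothing was recorded. A second,
        # independent scope persists the failure and its reason.
        async with fake_session_scope(db) as session:
            session.pending[track_id] = (
                "failed",
                f"Enrichment crashed: {type(exc).__name__}: {exc}",
            )
        return "failed"
    return status
```

Writing the failure inside the first scope would not work, because that write would be discarded by the same rollback; the second scope is what makes the `failed` status durable.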