feat(enrichment): record status/errors and trust high-confidence AcoustID

Two related gaps surfaced from "uploaded a track, nothing changed / no status":

- A track could stay stuck on `pending` forever (an unexpected worker error
  rolled back the run without recording anything), and `failed` carried no
  reason. Add `tracks.metadata_error` + `tracks.enriched_at` (migration), stamp
  the outcome in apply_enrichment, add TrackRepository.mark_enrichment_failed,
  wrap enrich_task to persist crashes as `failed` in a fresh session, and emit a
  human-readable no-match reason. Expose metadata_error/enriched_at in TrackOut.

- The tag-first merge let junk embedded tags (e.g. "Music Track"/"Sound_13958")
  override even a 0.99-confidence AcoustID match. Add acoustid_trust_score
  (default 0.85): above it the acoustic identity wins for title/artist/album/
  year, tags are fallback; below it, tag-first as before.

Add a license-free real-file fixture (Scarlet Fire / Otis McDonald) whose junk
tags AcoustID overrides, with an always-on tag-reader test plus fpcalc/AcoustID/
network-gated identity + full-pipeline tests (skip on host, run in the container).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Senko-san
2026-06-13 13:29:08 +03:00
parent 30cb8901f2
commit 73d7da440f
17 changed files with 468 additions and 33 deletions
+12 -1
@@ -6,9 +6,10 @@
imports/downloads stay idempotent (plan §4, §6.1).
"""
import datetime as dt
import uuid
-from sqlalchemy import ForeignKey, Integer, String, UniqueConstraint
+from sqlalchemy import DateTime, ForeignKey, Integer, String, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column
from app.infrastructure.db.base import Base
@@ -63,6 +64,16 @@ class TrackModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
nullable=False,
default=MetadataStatus.PENDING.value,
)
# Human-readable reason the last enrichment run set ``failed`` (no match, or
# an unexpected worker error). ``None`` once a run succeeds. Surfaced in the
# UI so a stuck/failed track is diagnosable, not silent.
metadata_error: Mapped[str | None] = mapped_column(String(2048), nullable=True)
# When the last enrichment run finished (success or failure). ``None`` while
# still ``pending`` — lets the UI distinguish "queued/running" from "done".
enriched_at: Mapped[dt.datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
)
added_by: Mapped[uuid.UUID | None] = mapped_column(
ForeignKey("users.id", ondelete="SET NULL"),
@@ -39,6 +39,8 @@ def _track_to_entity(row: TrackModel) -> Track:
genre=row.genre,
year=row.year,
metadata_status=row.metadata_status,
metadata_error=row.metadata_error,
enriched_at=row.enriched_at,
created_at=row.created_at,
updated_at=row.updated_at,
)
@@ -38,6 +38,8 @@ def _track_to_entity(row: TrackModel) -> Track:
genre=row.genre,
year=row.year,
metadata_status=row.metadata_status,
metadata_error=row.metadata_error,
enriched_at=row.enriched_at,
created_at=row.created_at,
updated_at=row.updated_at,
)
@@ -1,5 +1,6 @@
"""Track repository — adapter over ``AsyncSession``."""
import datetime as dt
import uuid
from sqlalchemy import func, select
@@ -26,6 +27,8 @@ def _to_entity(row: TrackModel) -> Track:
genre=row.genre,
year=row.year,
metadata_status=row.metadata_status,
metadata_error=row.metadata_error,
enriched_at=row.enriched_at,
created_at=row.created_at,
updated_at=row.updated_at,
)
@@ -189,6 +192,7 @@ class SqlAlchemyTrackRepository:
acoustid_fingerprint: str | None,
musicbrainz_id: str | None,
metadata_status: str,
metadata_error: str | None = None,
) -> Track:
row = await self._session.get(TrackModel, track_id)
if row is None:
@@ -197,6 +201,10 @@ class SqlAlchemyTrackRepository:
row.title = title
row.artist_id = artist_id
row.metadata_status = metadata_status
# A finished run always stamps outcome: clear/set the reason and mark the
# completion time so the UI can tell "still pending" from "done/failed".
row.metadata_error = metadata_error
row.enriched_at = dt.datetime.now(dt.UTC)
# Nullable extras: fill gaps only — never erase data a prior run found.
if album_id is not None:
row.album_id = album_id
@@ -217,3 +225,16 @@ class SqlAlchemyTrackRepository:
await self._session.flush()
await self._session.refresh(row)
return _to_entity(row)
async def mark_enrichment_failed(self, track_id: uuid.UUID, *, error: str) -> None:
"""Record that an enrichment run crashed (unexpected exception). Runs in
its own session so the failure is persisted even though the run's own
transaction rolled back. Never overwrites ``manual`` (a no-op then), and
a missing track is a clean no-op."""
row = await self._session.get(TrackModel, track_id)
if row is None or row.metadata_status == "manual":
return
row.metadata_status = "failed"
row.metadata_error = error
row.enriched_at = dt.datetime.now(dt.UTC)
await self._session.flush()
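
The commit message says enrich_task is wrapped so that a crash is persisted as
`failed` in a fresh session. The wrapper itself is not shown in this part of
the diff, so the following is only a rough sketch of that pattern. The names
`enrich_safely`, `open_fresh_repo`, and `InMemoryTrackRepo` are hypothetical
stand-ins; only the `mark_enrichment_failed(track_id, *, error=...)` shape and
the 2048-character column limit are taken from the diff above. Whether the real
wrapper re-raises after recording the failure is not known; this sketch does not.

```python
import asyncio
import uuid
from collections.abc import Awaitable, Callable


class InMemoryTrackRepo:
    """Hypothetical stand-in for a repository bound to a fresh session."""

    def __init__(self, statuses: dict[uuid.UUID, str]) -> None:
        self.statuses = statuses
        self.errors: dict[uuid.UUID, str] = {}

    async def mark_enrichment_failed(self, track_id: uuid.UUID, *, error: str) -> None:
        status = self.statuses.get(track_id)
        # Mirrors the diff: a missing track or a ``manual`` track is a no-op.
        if status is None or status == "manual":
            return
        self.statuses[track_id] = "failed"
        self.errors[track_id] = error


async def enrich_safely(
    track_id: uuid.UUID,
    run: Callable[[uuid.UUID], Awaitable[None]],
    open_fresh_repo: Callable[[], InMemoryTrackRepo],
) -> None:
    """Run enrichment; on an unexpected error, record ``failed`` separately."""
    try:
        await run(track_id)
    except Exception as exc:
        # The run's own transaction is assumed to have rolled back, so the
        # failure is written through a repository on a new session.
        repo = open_fresh_repo()
        reason = f"{type(exc).__name__}: {exc}"[:2048]  # fits String(2048)
        await repo.mark_enrichment_failed(track_id, error=reason)
```

Writing the failure through a separate session is what keeps the track from
staying on `pending` after the original transaction is discarded.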