mcma-backend/app/workers/tasks/import_task.py
Senko-san c72d19599a
feat(enrichment): tag-first metadata pipeline (§1D)
Implements the §6.2 enrichment pipeline: embedded tags → Chromaprint
fingerprint → AcoustID lookup. Well-tagged files get correct
artist/album/title offline; the rest are identified via AcoustID
(which also yields a MusicBrainz recording id in one call).

- domain: AudioTags/Fingerprint/RecordingMatch value objects; ports
  AudioTagReader, AudioFingerprinter, AcoustIdClient; TrackRepository
  .apply_enrichment (gap-fill, never erases) + AlbumRepository.get_or_create
- infrastructure/metadata: MutagenTagReader, FpcalcFingerprinter,
  AcoustIdHttpClient (rich meta=recordings+releasegroups, throttled)
- application: MetadataEnrichmentService — tags preferred, AcoustID fills
  gaps; resolves artist/album; status enriched/failed; skips manual;
  every external step wrapped (graceful degradation)
- workers: enrich_task registered; enqueue_enrich is best-effort and
  deferred so the caller's txn commits before the worker reads the row
- wiring: upload enqueues after add; import returns imported_ids and
  enqueues post-commit (mid-scan would race the worker); manual
  POST /tracks/{id}/metadata/enrich endpoint
- deps: add mutagen (fpcalc/ffmpeg already in the image)
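
The tag-first order described above (embedded tags, then fingerprint, then lookup, with tags preferred and the remote match only filling gaps) can be sketched roughly as follows. This is a minimal illustration, not the commit's MetadataEnrichmentService: `Tags`, `enrich`, and the three callables are hypothetical stand-ins for the AudioTagReader, AudioFingerprinter, and AcoustIdClient ports, and the real value objects likely carry more fields.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Tags:
    """Hypothetical stand-in for the AudioTags / RecordingMatch value objects."""

    artist: Optional[str] = None
    album: Optional[str] = None
    title: Optional[str] = None

    def is_complete(self) -> bool:
        return bool(self.artist and self.album and self.title)


def enrich(
    path: str,
    read_tags: Callable[[str], Tags],
    fingerprint: Callable[[str], str],
    lookup: Callable[[str], Optional[Tags]],
) -> Tags:
    # Step 1: embedded tags. A reader failure degrades to "no tags".
    try:
        tags = read_tags(path)
    except Exception:
        tags = Tags()
    if tags.is_complete():
        return tags  # well-tagged file: resolved offline, no network call

    # Steps 2 and 3: fingerprint, then remote lookup. Any failure here
    # degrades to whatever the tags provided.
    try:
        match = lookup(fingerprint(path))
    except Exception:
        match = None
    if match is None:
        return tags

    # Gap-fill: a tag value always wins; the match only fills empty fields.
    return Tags(
        artist=tags.artist or match.artist,
        album=tags.album or match.album,
        title=tags.title or match.title,
    )


lookup_calls: list[str] = []


def fake_lookup(fp: str) -> Optional[Tags]:
    lookup_calls.append(fp)
    return Tags("Remote Artist", "Remote Album", "Remote Title")


def broken_lookup(fp: str) -> Optional[Tags]:
    raise RuntimeError("service unavailable")


full = enrich("a.flac", lambda p: Tags("A", "B", "C"), lambda p: "fp", fake_lookup)
partial = enrich("b.flac", lambda p: Tags(title="Local Title"), lambda p: "fp", fake_lookup)
degraded = enrich("c.flac", lambda p: Tags(title="Local Title"), lambda p: "fp", broken_lookup)
```

In this sketch the fully tagged file never reaches the lookup, the partially tagged file keeps its local title while the match supplies artist and album, and a failing lookup leaves the local tags untouched.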

Tests: metadata service orchestration, AcoustID parser, tag helpers.
125 passed; mypy strict + ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 13:04:02 +03:00


"""arq task: scan an indexable source and import its files into the library.
Heavy work (directory walk + file copies) belongs off the request cycle
(CLAUDE.md). The HTTP endpoint enqueues this; the worker runs it with its own
transactional session.
"""

import uuid
from typing import Any

from app.application.import_service import LibraryImportService
from app.core.config import get_settings
from app.core.logging import get_logger
from app.infrastructure.db import session_scope
from app.infrastructure.db.repositories import (
    SqlAlchemyArtistRepository,
    SqlAlchemyTrackRepository,
)
from app.infrastructure.sources.registry import build_source_registry
from app.infrastructure.storage.provider import get_file_storage
from app.workers.queue import enqueue_enrich

log = get_logger("worker.import")


async def scan_local_folder(
    _ctx: dict[str, Any], *, source: str = "local", added_by: str | None = None
) -> dict[str, Any]:
    registry = build_source_registry(get_settings())
    backend = registry.indexable(source)
    actor = uuid.UUID(added_by) if added_by else None

    async with session_scope() as session:
        service = LibraryImportService(
            tracks=SqlAlchemyTrackRepository(session),
            artists=SqlAlchemyArtistRepository(session),
            storage=get_file_storage(),
        )
        summary = await service.scan_and_import(backend, added_by=actor)

    # Enqueue enrichment only after the import transaction has committed above,
    # so the enrich worker is guaranteed to see the new rows.
    for track_id in summary.imported_ids:
        await enqueue_enrich(track_id)

    return {
        "source": summary.source,
        "seen": summary.seen,
        "imported": summary.imported,
        "skipped": summary.skipped,
        "failed": summary.failed,
    }
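
The ordering that the comment in `scan_local_folder` relies on (commit first, enqueue second) can be demonstrated in isolation. The sketch below is only an illustration under assumptions: `fake_session_scope` and `fake_enqueue` are hypothetical stand-ins, and it assumes, as the comment implies, that the real `session_scope` commits when the `async with` block exits cleanly.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator

events: list[str] = []


@asynccontextmanager
async def fake_session_scope() -> AsyncIterator[list[str]]:
    """Hypothetical stand-in: commit on clean exit, roll back on error."""
    pending: list[str] = []
    try:
        yield pending
    except Exception:
        events.append("rollback")
        raise
    else:
        events.append("commit:" + ",".join(pending))


async def fake_enqueue(track_id: str) -> None:
    # Hypothetical stand-in for enqueue_enrich; it only records the call.
    events.append("enqueue:" + track_id)


async def import_then_enqueue(track_ids: list[str]) -> None:
    async with fake_session_scope() as pending:
        pending.extend(track_ids)
        imported = list(pending)
    # The block above has exited, so the commit has already happened.
    for track_id in imported:
        await fake_enqueue(track_id)


asyncio.run(import_then_enqueue(["a", "b"]))
print(events)  # ['commit:a,b', 'enqueue:a', 'enqueue:b']
```

Moving the loop inside the `async with` block would record the enqueue events before the commit, which is the mid-scan race the commit message describes.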