Compare commits

...

11 Commits

Author SHA1 Message Date
Senko-san e45e578f54 feat(library): remote browse status + save/materialize API (§Phase2-3)
Docker Build & Publish / build (push) Successful in 1m11s
Docker Build & Publish / push (push) Failing after 6s
Docker Build & Publish / Prune old image versions (push) Has been skipped
Search results now report whether a hit is already saved (in_library,
track_id, availability). New RemoteLibraryService backs POST
/tracks/remote (idempotent placeholder save) and POST
/tracks/{id}/materialize (on-demand fetch via a new materialize_track
arq task, reusing in-flight jobs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-14 18:11:01 +03:00
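The idempotent save and in-flight job reuse described in this commit can be sketched in memory as below. This is an illustration only, not the repository's `RemoteLibraryService`: the `InMemoryLibrary` class, its method names, and the job bookkeeping are hypothetical. Only the `remote`/`local` availability values and the "reuse an in-flight job" behaviour come from the commit messages and schemas.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class Track:
    source: str
    source_id: str
    title: str
    availability: str = "remote"  # placeholder: no audio stored yet
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class InMemoryLibrary:
    """Hypothetical stand-in for the track and job repositories."""

    def __init__(self) -> None:
        self._by_remote_key: dict[tuple[str, str], Track] = {}
        self._active_jobs: dict[uuid.UUID, uuid.UUID] = {}

    def save_remote(self, source: str, source_id: str, title: str) -> Track:
        # Idempotent: the (source, source_id) pair identifies the placeholder,
        # so a repeated save returns the existing track.
        key = (source, source_id)
        track = self._by_remote_key.get(key)
        if track is None:
            track = Track(source, source_id, title)
            self._by_remote_key[key] = track
        return track

    def materialize(self, track: Track) -> uuid.UUID | None:
        # Already local: nothing to fetch, the caller can stream immediately.
        if track.availability == "local":
            return None
        # Reuse the in-flight job instead of enqueueing a second fetch.
        job_id = self._active_jobs.get(track.id)
        if job_id is None:
            job_id = uuid.uuid4()
            self._active_jobs[track.id] = job_id
        return job_id
```

Calling `save_remote` twice with the same `source`/`source_id` returns the same track, and two `materialize` calls on a still-remote track return the same job id.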
Senko-san 58b98ab5ed feat(library): lazy materialization foundation for remote tracks (§Phase1)
Docker Build & Publish / build (push) Successful in 1m10s
Docker Build & Publish / push (push) Failing after 7s
Docker Build & Publish / Prune old image versions (push) Has been skipped
Adds nullable storage fields + availability column on tracks, remote
source/source_id identity on albums/artists, TrackRepository.materialize()
and get_or_create_remote() repos — groundwork for on-demand YTM library
(placeholders saved without audio, materialized in-place on first play).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-14 17:51:43 +03:00
Senko-san 78007461e1 feat(sources): YouTube Music search + download pipeline (§1C/§1E)
Docker Build & Publish / build (push) Successful in 2m39s
Docker Build & Publish / push (push) Failing after 36s
Docker Build & Publish / Prune old image versions (push) Has been skipped
Pluggable fetch source: ytmusicapi search + yt-dlp download (cookies-file
guard), DownloadJob entity/repo + DownloadService, download_task worker
with exponential-backoff retries, and wired /search,
/sources/{source}/search, and /downloads endpoints. Adds
youtube_enabled/cookies config, yt-dlp+ytmusicapi deps, and the
download_jobs.track_id migration. Snapshot also bundles in-progress
storage/tracks/acoustid edits.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 14:04:33 +03:00
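The exponential-backoff retry schedule mentioned above might look roughly like this. The commit does not state the actual base delay or ceiling, so `base` and `cap` (and the `backoff_delay` name) are assumptions chosen for illustration.

```python
def backoff_delay(attempt: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Seconds to wait before retry number ``attempt`` (1-based).

    The delay doubles on each attempt and is clamped to ``cap``.
    ``base`` and ``cap`` are illustrative values, not the project's settings.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(cap, base * 2 ** (attempt - 1))
```

With these assumed defaults the first three retries wait 5, 10 and 20 seconds, and later retries are capped at 300 seconds.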
Senko-san ea880edd57 feat(tracks): filter track list by ingest source
Docker Build & Publish / build (push) Has been cancelled
Docker Build & Publish / push (push) Has been cancelled
Docker Build & Publish / Prune old image versions (push) Has been cancelled
Add an optional `source` filter to `GET /api/v1/tracks` (and the
`TrackRepository.list`/`count` port + SQLAlchemy adapter). Lets clients
query, e.g., only uploaded tracks (`?source=upload`) newest-first — the
backing for the webui's persistent "Recently uploaded" view.

- test: upload then list with `?source=upload` (hit) / `?source=youtube`
  (miss)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 01:35:51 +03:00
Senko-san fa23568214 feat(storage): library + disk statistics endpoint (§A6)
Docker Build & Publish / build (push) Has been cancelled
Docker Build & Publish / push (push) Has been cancelled
Docker Build & Publish / Prune old image versions (push) Has been cancelled
Implement `GET /api/v1/storage`, replacing the stub. Returns aggregate
library facts (track/artist/album counts, total footprint, playtime,
per-format / per-source / metadata-status breakdowns, top genres) plus
the real capacity of the backing volume.

- domain: `LibraryStats`, `FormatBreakdown`, `DiskUsage` value objects
- ports: `FileStorage.disk_usage()` (local = shutil.disk_usage walking up
  to the nearest existing ancestor; S3 returns None — no fixed disk)
- repo: `TrackRepository.library_stats()` (single set of GROUP BYs)
- tests: storage stats API (auth, empty library, upload counting)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 01:19:53 +03:00
Senko-san 636820afb8 fix: invalid Python 2 except syntax in AcoustID client
Docker Build & Publish / build (push) Has been cancelled
Docker Build & Publish / push (push) Has been cancelled
Docker Build & Publish / Prune old image versions (push) Has been cancelled
The except clause used the Python 2 multi-exception syntax, which is
a SyntaxError under Python 3.14 and broke import of this module.
2026-06-14 01:01:33 +03:00
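For reference, the parenthesized-tuple form of a multi-exception clause is accepted by every Python 3 release. A minimal sketch (the `parse_score` helper is hypothetical and not taken from the AcoustID client):

```python
def parse_score(raw: object, default: float = 0.0) -> float:
    """Coerce a raw score to float, falling back to ``default``."""
    try:
        return float(raw)  # type: ignore[arg-type]
    except (TypeError, ValueError):  # tuple form: catches either exception
        return default
```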
Senko-san 63c7d05eca feat(metadata): implement single-track metadata editor API (§A7/§1H)
Docker Build & Publish / Prune old image versions (push) Has been cancelled
Docker Build & Publish / build (push) Has been cancelled
Docker Build & Publish / push (push) Has been cancelled
Adds inline AcoustID match-finding (multiple ranked candidates via
lookup_all) and PUT /tracks/{id}/metadata for manual edits, resolving
artist/album and setting metadata_status=manual. Extends TrackOut with
genre/year/track_number.
2026-06-13 14:34:43 +03:00
Senko-san 73d7da440f feat(enrichment): record status/errors and trust high-confidence AcoustID
Docker Build & Publish / build (push) Has been cancelled
Docker Build & Publish / push (push) Has been cancelled
Docker Build & Publish / Prune old image versions (push) Has been cancelled
Two related gaps surfaced from "uploaded a track, nothing changed / no status":

- A track could stay stuck on `pending` forever (an unexpected worker error
  rolled back the run without recording anything), and `failed` carried no
  reason. Add `tracks.metadata_error` + `tracks.enriched_at` (migration), stamp
  the outcome in apply_enrichment, add TrackRepository.mark_enrichment_failed,
  wrap enrich_task to persist crashes as `failed` in a fresh session, and emit a
  human-readable no-match reason. Expose metadata_error/enriched_at in TrackOut.

- The tag-first merge let junk embedded tags (e.g. "Music Track"/"Sound_13958")
  override even a 0.99-confidence AcoustID match. Add acoustid_trust_score
  (default 0.85): above it the acoustic identity wins for title/artist/album/
  year, tags are fallback; below it, tag-first as before.

Add a license-free real-file fixture (Scarlet Fire / Otis McDonald) whose junk
tags AcoustID overrides, with an always-on tag-reader test plus fpcalc/AcoustID/
network-gated identity + full-pipeline tests (skip on host, run in the container).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 13:29:08 +03:00
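The trust-score rule from the second bullet can be sketched as a field-by-field merge. The `Identity` dataclass and `merge_identity` function are hypothetical names, and treating a score exactly equal to the threshold as trusted is an assumption (the commit only says "above" and "below").

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    title: str | None = None
    artist: str | None = None
    album: str | None = None
    year: int | None = None


_FIELDS = ("title", "artist", "album", "year")


def merge_identity(
    tags: Identity, match: Identity, score: float, trust_score: float = 0.85
) -> Identity:
    """Pick each field from the primary source, falling back to the other.

    At or above ``trust_score`` the AcoustID match is primary and embedded
    tags are the fallback; below it the order is reversed (tag-first).
    """
    primary, fallback = (match, tags) if score >= trust_score else (tags, match)
    merged = {}
    for name in _FIELDS:
        value = getattr(primary, name)
        merged[name] = value if value not in (None, "") else getattr(fallback, name)
    return Identity(**merged)
```

With junk tags such as "Music Track" and a 0.99 match, the match's title wins; with a 0.50 match the tag value is kept and the match only fills fields the tags leave empty.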
Senko-san 30cb8901f2 fix(tests): isolate suite to a dedicated *_test database
Integration fixtures call Base.metadata.drop_all/create_all on get_engine(),
whose DATABASE_URL points at the developer's real DB — localhost:5432/mcma for
host pytest, db:5432/mcma for `make test-api` (pytest runs inside the api
container). Every run silently wiped dev data: drop_all removes ORM tables but
leaves alembic_version (outside Base.metadata), the exact "tables keep
disappearing, version survives" symptom.

conftest now redirects the whole suite to a <db>_test database before settings
load and creates it on demand via asyncpg, so the dev DB is never opened.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 13:27:58 +03:00
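Deriving the `<db>_test` URL could be done along these lines. This sketch covers only the URL rewrite; creating the database on demand via asyncpg is not shown, and the `to_test_database_url` name is an assumption rather than the helper used in conftest.

```python
from urllib.parse import urlsplit, urlunsplit


def to_test_database_url(url: str, suffix: str = "_test") -> str:
    """Point a database URL at ``<db>_test`` so tests never open the dev DB."""
    parts = urlsplit(url)
    name = parts.path.lstrip("/")
    if not name:
        raise ValueError("database URL has no database name")
    if name.endswith(suffix):
        return url  # already redirected; keep the rewrite idempotent
    return urlunsplit(parts._replace(path=f"/{name}{suffix}"))
```

Applying the rewrite before settings are loaded is what keeps `drop_all`/`create_all` away from the developer's data.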
Senko-san 0bb752f582 feat: cover-art pipeline (§1D)
Docker Build & Publish / build (push) Has been cancelled
Docker Build & Publish / push (push) Has been cancelled
Docker Build & Publish / Prune old image versions (push) Has been cancelled
Resolve, store and serve album cover art.

Sources (tag-first, mirroring enrichment): embedded artwork extracted
offline via mutagen (ID3 APIC / FLAC+OGG Picture / MP4 covr), then Cover
Art Archive by release-group MBID as a network fallback. Resolution runs
inside MetadataEnrichmentService after album resolution, only when the
album has no cover yet (idempotent, never overwrites), and is best-effort
so a cover failure never affects enrichment status.

- CoverArt value object + CoverArtExtractor/CoverArtProvider ports
- MutagenCoverExtractor + CoverArtArchiveClient adapters
- AcoustID parser now captures release_group_mbid
- Covers stored via FileStorage at covers/{album_id}.{ext} (local + S3)
- AlbumRepository.set_cover_path
- Serve real covers: GET /api/v1/albums|tracks/{id}/cover (StreamUser,
  ?token=), Subsonic getCoverArt (placeholder fallback)
- has_cover flag on AlbumOut/TrackOut
- coverart_enabled / coverart_base_url settings
- tests: cover resolution units + release_group parse + DB-backed
  test_cover_api.py (139 green via make test-api)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 12:10:05 +03:00
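The resolution order described in this commit (embedded artwork first, network fallback second, never overwrite, best-effort) might be sketched as follows. The `resolve_cover` function and its callable parameters are hypothetical; the real code goes through the `CoverArtExtractor`/`CoverArtProvider` ports, whose signatures are not shown here.

```python
from __future__ import annotations

from collections.abc import Callable


def resolve_cover(
    existing_cover_path: str | None,
    extract_embedded: Callable[[], bytes | None],
    fetch_remote: Callable[[], bytes | None],
) -> bytes | None:
    """Return cover bytes to store, or ``None`` when there is nothing to do."""
    if existing_cover_path:
        return None  # idempotent: an album that has a cover is never overwritten
    for provider in (extract_embedded, fetch_remote):  # tag-first, then network
        try:
            cover = provider()
        except Exception:
            cover = None  # best-effort: a cover failure must not fail enrichment
        if cover:
            return cover
    return None
```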
Senko-san c7e078d758 feat(config): derive MusicBrainz/AcoustID User-Agent from app name+version
Docker Build & Publish / build (push) Successful in 1m8s
Docker Build & Publish / push (push) Failing after 6s
Docker Build & Publish / Prune old image versions (push) Has been skipped
Replace the placeholder MUSICBRAINZ_USER_AGENT env var with
MUSICBRAINZ_OWNER_EMAIL. The User-Agent ("MCMA/<version> ( <contact> )")
is now composed from the fixed app name, the installed package version,
and the operator's contact email — falling back to the project URL when
no email is configured. Also use the same version for the FastAPI app.
2026-06-11 00:39:24 +03:00
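Composing the User-Agent as described could look like the sketch below. The `build_user_agent` name and its parameters are assumptions; only the `MCMA/<version> ( <contact> )` shape and the fallback to the project URL come from the commit message.

```python
from __future__ import annotations


def build_user_agent(
    version: str,
    owner_email: str | None,
    project_url: str,
    app_name: str = "MCMA",
) -> str:
    """Build ``<app>/<version> ( <contact> )`` for MusicBrainz/AcoustID calls."""
    email = (owner_email or "").strip()
    contact = email if email else project_url  # no email configured: use the URL
    return f"{app_name}/{version} ( {contact} )"
```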
78 changed files with 6081 additions and 913 deletions
+2 -1
@@ -37,5 +37,6 @@ MAX_PARALLEL_DOWNLOADS=2
 # external services (all optional — backend degrades gracefully if unset)
 # ML_SERVICE_URL=http://ml:9000
 # ACOUSTID_API_KEY=
-MUSICBRAINZ_USER_AGENT=mcma-backend/0.1.0 ( https://github.com/your/repo )
+# Sent to MusicBrainz/AcoustID as part of the User-Agent (MCMA/<version> ( <email> )).
+# MUSICBRAINZ_OWNER_EMAIL=you@example.com
 # YOUTUBE_COOKIES_PATH=/data/cookies.txt
@@ -0,0 +1,39 @@
"""tracks: enrichment outcome (error reason + completion time)

Revision ID: 20260613_enrich_outcome
Revises: 20260608_subsonic_pw
Create Date: 2026-06-13 13:00:00.000000

Adds ``tracks.metadata_error`` and ``tracks.enriched_at`` so a finished
enrichment run records *why* it failed and *when* it completed. Lets the UI
distinguish a still-pending/running track from one that is done or failed, and
surface an actionable reason instead of a silent spinner (plan §6.2).
"""

from __future__ import annotations

from collections.abc import Sequence

import sqlalchemy as sa
from alembic import op

revision: str = "20260613_enrich_outcome"
down_revision: str | None = "20260608_subsonic_pw"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    op.add_column(
        "tracks",
        sa.Column("metadata_error", sa.String(length=2048), nullable=True),
    )
    op.add_column(
        "tracks",
        sa.Column("enriched_at", sa.DateTime(timezone=True), nullable=True),
    )


def downgrade() -> None:
    op.drop_column("tracks", "enriched_at")
    op.drop_column("tracks", "metadata_error")
@@ -0,0 +1,47 @@
"""download_jobs: link finished job to its imported track

Revision ID: 20260614_dl_track_id
Revises: 20260613_enrich_outcome
Create Date: 2026-06-14 10:00:00.000000

Adds ``download_jobs.track_id`` (nullable FK → ``tracks.id``) so a completed
download can point at the library track it produced — the §A5 download manager
links a "done" job to the track, and re-runs can tell a job already imported
(plan §6.1).
"""

from __future__ import annotations

from collections.abc import Sequence

import sqlalchemy as sa
from alembic import op

revision: str = "20260614_dl_track_id"
down_revision: str | None = "20260613_enrich_outcome"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    op.add_column(
        "download_jobs",
        sa.Column("track_id", sa.Uuid(), nullable=True),
    )
    op.create_foreign_key(
        op.f("fk_download_jobs_track_id_tracks"),
        "download_jobs",
        "tracks",
        ["track_id"],
        ["id"],
        ondelete="SET NULL",
    )


def downgrade() -> None:
    op.drop_constraint(
        op.f("fk_download_jobs_track_id_tracks"),
        "download_jobs",
        type_="foreignkey",
    )
    op.drop_column("download_jobs", "track_id")
@@ -0,0 +1,65 @@
"""remote placeholders: track availability, album/artist remote ids

Revision ID: dc126696f5a6
Revises: 20260614_dl_track_id
Create Date: 2026-06-14 11:25:30.643588

"""
from __future__ import annotations

from collections.abc import Sequence

import sqlalchemy as sa
from alembic import op

# revision identifiers, used by Alembic.
revision: str = 'dc126696f5a6'
down_revision: str | None = '20260614_dl_track_id'
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_column('albums', sa.Column('source', sa.String(length=32), nullable=True))
    op.add_column('albums', sa.Column('source_id', sa.String(length=512), nullable=True))
    op.create_unique_constraint('uq_albums_source_source_id', 'albums', ['source', 'source_id'])
    op.add_column('artists', sa.Column('source', sa.String(length=32), nullable=True))
    op.add_column('artists', sa.Column('source_id', sa.String(length=512), nullable=True))
    op.create_unique_constraint('uq_artists_source_source_id', 'artists', ['source', 'source_id'])
    op.add_column(
        'tracks',
        sa.Column('availability', sa.String(length=16), nullable=False, server_default='local'),
    )
    op.alter_column('tracks', 'availability', server_default=None)
    op.alter_column('tracks', 'storage_uri',
               existing_type=sa.VARCHAR(length=2048),
               nullable=True)
    op.alter_column('tracks', 'file_format',
               existing_type=sa.VARCHAR(length=32),
               nullable=True)
    op.alter_column('tracks', 'file_size',
               existing_type=sa.INTEGER(),
               nullable=True)
    # ### end Alembic commands ###


def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.alter_column('tracks', 'file_size',
               existing_type=sa.INTEGER(),
               nullable=False)
    op.alter_column('tracks', 'file_format',
               existing_type=sa.VARCHAR(length=32),
               nullable=False)
    op.alter_column('tracks', 'storage_uri',
               existing_type=sa.VARCHAR(length=2048),
               nullable=False)
    op.drop_column('tracks', 'availability')
    op.drop_constraint('uq_artists_source_source_id', 'artists', type_='unique')
    op.drop_column('artists', 'source_id')
    op.drop_column('artists', 'source')
    op.drop_constraint('uq_albums_source_source_id', 'albums', type_='unique')
    op.drop_column('albums', 'source_id')
    op.drop_column('albums', 'source')
    # ### end Alembic commands ###
+57
@@ -0,0 +1,57 @@
"""Shared cover-art serving helper (presentation).

Streams a stored cover image from the :class:`FileStorage` port. Used by the
native ``/api/v1`` cover endpoints and the Subsonic ``getCoverArt`` adapter so
the streaming/content-type logic lives in one place.
"""

import uuid

from fastapi.responses import StreamingResponse

from app.domain.entities.album import Album
from app.domain.errors import NotFoundError, StorageError
from app.domain.ports import AlbumRepository, FileStorage, TrackRepository

_CONTENT_TYPE_BY_EXT: dict[str, str] = {
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "png": "image/png",
    "webp": "image/webp",
    "gif": "image/gif",
}

# Covers are immutable for a given album (a new cover means a new key), so let
# clients cache aggressively.
_CACHE_CONTROL = "public, max-age=86400"


def _content_type_for(key: str) -> str:
    ext = key.rsplit(".", 1)[-1].lower() if "." in key else ""
    return _CONTENT_TYPE_BY_EXT.get(ext, "application/octet-stream")


async def stream_cover(storage: FileStorage, cover_path: str) -> StreamingResponse:
    """Stream a stored cover by its storage key. Raises ``NotFoundError`` if the
    object is missing (a dangling ``cover_path`` reads as "no cover")."""
    try:
        stream, total = await storage.open_range(cover_path, 0, None)
    except StorageError as exc:
        raise NotFoundError("Cover not found.") from exc
    return StreamingResponse(
        stream,
        media_type=_content_type_for(cover_path),
        headers={"Content-Length": str(total), "Cache-Control": _CACHE_CONTROL},
    )


async def resolve_album_for_track(
    track_repo: TrackRepository,
    album_repo: AlbumRepository,
    track_id: uuid.UUID,
) -> Album | None:
    """The album that owns a track (cover lives on the album), or ``None``."""
    track = await track_repo.get_by_id(track_id)
    if track is None or track.album_id is None:
        return None
    return await album_repo.get_by_id(track.album_id)
+54 -1
@@ -15,6 +15,9 @@ from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
 from sqlalchemy.ext.asyncio import AsyncSession

 from app.application.auth_service import AuthService
+from app.application.download_service import DownloadService
+from app.application.metadata_service import MetadataEnrichmentService
+from app.application.remote_library_service import RemoteLibraryService
 from app.application.streaming_service import StreamingService
 from app.application.subsonic_auth_service import SubsonicAuthService
 from app.application.upload_service import UploadService
@@ -28,6 +31,7 @@ from app.infrastructure.db import get_sessionmaker
 from app.infrastructure.db.repositories import (
     SqlAlchemyAlbumRepository,
     SqlAlchemyArtistRepository,
+    SqlAlchemyDownloadJobRepository,
     SqlAlchemyHistoryRepository,
     SqlAlchemyLikeRepository,
     SqlAlchemyPlaylistRepository,
@@ -35,9 +39,12 @@ from app.infrastructure.db.repositories import (
     SqlAlchemyTrackRepository,
     SqlAlchemyUserRepository,
 )
+from app.infrastructure.metadata.acoustid import AcoustIdHttpClient
+from app.infrastructure.metadata.fingerprint import FpcalcFingerprinter
+from app.infrastructure.metadata.tags import MutagenTagReader
 from app.infrastructure.sources.registry import SourceRegistry, build_source_registry
 from app.infrastructure.storage.provider import get_file_storage
-from app.workers.queue import enqueue_enrich
+from app.workers.queue import enqueue_download, enqueue_enrich, enqueue_materialize


 async def get_session() -> AsyncIterator[AsyncSession]:
@@ -132,8 +139,54 @@ def get_streaming_service(session: SessionDep, storage: FileStorageDep) -> Strea
     )


+def get_metadata_service(session: SessionDep, storage: FileStorageDep) -> MetadataEnrichmentService:
+    """Wires the §6.2 fingerprint/AcoustID adapters for read-only, inline use
+    (the metadata editor's "find matches" — §A7). The full pipeline (incl.
+    cover art) stays in the worker (`tasks/enrich_task.py`)."""
+    settings = get_settings()
+    api_key = settings.acoustid_api_key.get_secret_value() if settings.acoustid_api_key else None
+    acoustid = AcoustIdHttpClient(
+        api_key=api_key,
+        user_agent=settings.musicbrainz_user_agent,
+        api_url=settings.acoustid_api_url,
+    )
+    return MetadataEnrichmentService(
+        tracks=SqlAlchemyTrackRepository(session),
+        artists=SqlAlchemyArtistRepository(session),
+        albums=SqlAlchemyAlbumRepository(session),
+        storage=storage,
+        tag_reader=MutagenTagReader(),
+        fingerprinter=FpcalcFingerprinter(settings.fpcalc_path),
+        acoustid=acoustid,
+        acoustid_trust_score=settings.acoustid_trust_score,
+    )
+
+
+def get_download_service(session: SessionDep, storage: FileStorageDep) -> DownloadService:
+    return DownloadService(
+        jobs=SqlAlchemyDownloadJobRepository(session),
+        tracks=SqlAlchemyTrackRepository(session),
+        artists=SqlAlchemyArtistRepository(session),
+        storage=storage,
+        enqueue_download=enqueue_download,
+        enqueue_enrich=enqueue_enrich,
+    )
+
+
+def get_remote_library_service(session: SessionDep) -> RemoteLibraryService:
+    return RemoteLibraryService(
+        tracks=SqlAlchemyTrackRepository(session),
+        artists=SqlAlchemyArtistRepository(session),
+        jobs=SqlAlchemyDownloadJobRepository(session),
+        enqueue_materialize=enqueue_materialize,
+    )
+
+
 UploadServiceDep = Annotated[UploadService, Depends(get_upload_service)]
 StreamingServiceDep = Annotated[StreamingService, Depends(get_streaming_service)]
+MetadataServiceDep = Annotated[MetadataEnrichmentService, Depends(get_metadata_service)]
+DownloadServiceDep = Annotated[DownloadService, Depends(get_download_service)]
+RemoteLibraryServiceDep = Annotated[RemoteLibraryService, Depends(get_remote_library_service)]

 # -- library repository deps ---------------------------------------------------
+33 -7
@@ -13,9 +13,17 @@ from typing import Annotated
 from fastapi import APIRouter, Header, Query
 from fastapi.responses import Response, StreamingResponse

-from app.api.deps import StreamingServiceDep, SubsonicUser, TrackRepoDep
-from app.api.rest.ids import decode_track, parse
-from app.domain.errors import NotFoundError
+from app.api.covers import resolve_album_for_track, stream_cover
+from app.api.deps import (
+    AlbumRepoDep,
+    FileStorageDep,
+    StreamingServiceDep,
+    SubsonicUser,
+    TrackRepoDep,
+)
+from app.api.rest.ids import IdKind, decode_track, parse
+from app.domain.entities.album import Album
+from app.domain.errors import NotFoundError, StorageError

 router = APIRouter()
@@ -57,7 +65,7 @@ async def download(
     if track is None:
         raise NotFoundError("Song not found.")
     result = await service.open_stream(track_id, None)
-    filename = f"{track.title}.{track.file_format}"
+    filename = f"{track.title}.{track.file_format or 'bin'}"
     headers = {
         "Content-Length": str(result.content_length),
         "Content-Disposition": f'attachment; filename="{filename}"',
@@ -69,10 +77,28 @@ async def download(
 @router.api_route("/getCoverArt.view", methods=["GET", "POST"])
 async def get_cover_art(
     _user: SubsonicUser,
+    album_repo: AlbumRepoDep,
+    track_repo: TrackRepoDep,
+    storage: FileStorageDep,
     id: Annotated[str, Query()],
     size: Annotated[int | None, Query()] = None,
 ) -> Response:
-    # Validate the id shape so clients get a clean error on garbage, then serve a
-    # placeholder. TODO: stream real covers once the cover pipeline exists.
-    parse(id)
+    # Cover ids reuse the entity id: ``al-<uuid>`` (album) or ``tr-<uuid>``
+    # (track → its album). Unlike the native API, Subsonic clients expect an
+    # image either way, so a missing cover falls back to a placeholder rather
+    # than 404. ``size`` is accepted but ignored (we serve the stored image).
+    kind, value = parse(id)
+    album: Album | None
+    if kind is IdKind.ALBUM:
+        album = await album_repo.get_by_id(value)
+    elif kind is IdKind.TRACK:
+        album = await resolve_album_for_track(track_repo, album_repo, value)
+    else:
+        album = None
+    if album is not None and album.cover_path:
+        try:
+            return await stream_cover(storage, album.cover_path)
+        except (NotFoundError, StorageError):
+            pass
     return Response(content=_PLACEHOLDER_PNG, media_type="image/png")
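The prefixed cover ids used by this handler (`al-<uuid>`, `tr-<uuid>`) could be parsed roughly as below. The real `parse` and `IdKind` live in `app.api.rest.ids`, which this diff does not show, so `parse_entity_id`, the enum values, and the choice of `ValueError` are all assumptions made for illustration.

```python
from __future__ import annotations

import enum
import uuid


class IdKind(enum.Enum):
    # Hypothetical subset; the real enum may define more kinds.
    ALBUM = "al"
    TRACK = "tr"


def parse_entity_id(raw: str) -> tuple[IdKind, uuid.UUID]:
    """Split ``<prefix>-<uuid>`` into its kind and UUID.

    Raises ``ValueError`` on a missing separator, an unknown prefix, or a
    malformed UUID, so callers get a clean error on garbage input.
    """
    prefix, sep, rest = raw.partition("-")
    if not sep:
        raise ValueError(f"malformed id: {raw!r}")
    return IdKind(prefix), uuid.UUID(rest)
```

Because `str.partition` splits on the first hyphen only, the hyphens inside the UUID itself are left intact.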
+2 -2
@@ -80,8 +80,8 @@ def song_dict(
         "albumId": encode_album(track.album_id) if track.album_id is not None else None,
         "artistId": encode_artist(track.artist_id),
         "coverArt": cover,
-        "size": track.file_size,
-        "contentType": content_type_for(track.file_format),
+        "size": track.file_size or 0,
+        "contentType": content_type_for(track.file_format or ""),
         "suffix": track.file_format,
         "duration": track.duration_seconds,
         "year": track.year,
+1
@@ -13,4 +13,5 @@ class AlbumOut(BaseModel):
     artist_name: str
     year: int | None
     track_count: int
+    has_cover: bool
     created_at: dt.datetime
+59
@@ -0,0 +1,59 @@
"""Schemas for the download job endpoints (§A5 download manager)."""

import datetime as dt
import uuid

from pydantic import BaseModel, Field

from app.domain.entities.download import DownloadJob


class DownloadCreate(BaseModel):
    """Request to download an item discovered on a fetch source."""

    source: str
    source_id: str = Field(min_length=1)
    # Optional free-text the result came from — stored for display only.
    query: str | None = None


class DownloadJobOut(BaseModel):
    id: uuid.UUID
    source: str
    source_id: str | None
    query: str | None
    status: str
    progress: float
    error_message: str | None
    retry_count: int
    track_id: uuid.UUID | None
    created_at: dt.datetime
    updated_at: dt.datetime

    @classmethod
    def from_entity(cls, job: DownloadJob) -> DownloadJobOut:
        return cls(
            id=job.id,
            source=job.source,
            source_id=job.source_id,
            query=job.query,
            status=job.status,
            progress=job.progress,
            error_message=job.error_message,
            retry_count=job.retry_count,
            track_id=job.track_id,
            created_at=job.created_at,
            updated_at=job.updated_at,
        )


class DownloadCreateResponse(BaseModel):
    """Result of requesting a download.

    ``already_in_library`` → the item was already imported (``track_id`` set, no
    job). Otherwise ``job`` describes the queued (or already in-flight) download.
    """

    already_in_library: bool
    track_id: uuid.UUID | None
    job: DownloadJobOut | None
+48
@@ -0,0 +1,48 @@
"""Schemas for searching external (fetch) sources — the §A4 discover screen."""

import uuid

from pydantic import BaseModel

from app.domain.entities.track import Track
from app.domain.sources import SearchResult


class ExternalSearchResultOut(BaseModel):
    source: str
    source_id: str
    title: str
    artist: str | None
    album: str | None
    duration_seconds: int | None
    thumbnail_url: str | None
    # Remote browse (plan: Model C) — set when this hit is already saved in the
    # library, so the UI can show "Play"/"Saved" instead of "Save to library".
    in_library: bool
    track_id: uuid.UUID | None
    availability: str | None

    @classmethod
    def from_entity(
        cls, r: SearchResult, *, existing: Track | None = None
    ) -> ExternalSearchResultOut:
        return cls(
            source=r.source,
            source_id=r.source_id,
            title=r.title,
            artist=r.artist,
            album=r.album,
            duration_seconds=r.duration_seconds,
            thumbnail_url=r.thumbnail_url,
            in_library=existing is not None,
            track_id=existing.id if existing is not None else None,
            availability=existing.availability if existing is not None else None,
        )


class ExternalSearchResponse(BaseModel):
    """Flat list of hits across one or more searchable sources, plus the names of
    sources that were unavailable (so the UI can show a soft warning)."""

    results: list[ExternalSearchResultOut]
    searched_sources: list[str]
+45
@@ -0,0 +1,45 @@
"""Storage / library statistics response schemas (§A6)."""

import datetime as dt

from pydantic import BaseModel


class DiskUsageOut(BaseModel):
    total: int
    used: int
    free: int


class FormatBreakdownOut(BaseModel):
    file_format: str
    track_count: int
    total_size: int


class GenreCountOut(BaseModel):
    genre: str
    track_count: int


class StorageStatsOut(BaseModel):
    """Everything the Storage screen needs in a single call."""

    # library catalogue
    total_tracks: int
    total_artists: int
    total_albums: int
    total_size: int
    total_duration_seconds: int
    largest_track_size: int
    earliest_added: dt.datetime | None
    latest_added: dt.datetime | None
    # breakdowns
    by_format: list[FormatBreakdownOut]
    by_metadata_status: dict[str, int]
    by_source: dict[str, int]
    top_genres: list[GenreCountOut]
    # backing volume (``None`` for object-store backends)
    disk: DiskUsageOut | None
+63 -3
@@ -3,7 +3,9 @@
 import datetime as dt
 import uuid

-from pydantic import BaseModel
+from pydantic import BaseModel, Field
+
+from app.api.schemas.download import DownloadJobOut


 class TrackOut(BaseModel):
@@ -14,10 +16,17 @@ class TrackOut(BaseModel):
     album_id: uuid.UUID | None
     album_title: str | None
     duration_seconds: int | None
-    file_format: str
-    file_size: int
+    file_format: str | None
+    file_size: int | None
+    genre: str | None
+    year: int | None
+    track_number: int | None
     metadata_status: str
+    metadata_error: str | None
+    enriched_at: dt.datetime | None
+    availability: str
     source: str
+    has_cover: bool
     created_at: dt.datetime
@@ -25,3 +34,54 @@ class TrackUpdate(BaseModel):
     title: str | None = None
     genre: str | None = None
     year: int | None = None
+
+
+class MetadataMatch(BaseModel):
+    """One AcoustID candidate for the metadata editor's match picker (§A7)."""
+
+    acoustid: str
+    score: float
+    recording_mbid: str | None
+    release_group_mbid: str | None
+    title: str | None
+    artist: str | None
+    album: str | None
+    year: int | None
+
+
+class MetadataMatchesOut(BaseModel):
+    items: list[MetadataMatch]
+
+
+class MetadataApply(BaseModel):
+    """Manual edits / accepted match applied via ``PUT /tracks/{id}/metadata``.
+    Sets ``metadata_status = manual`` (never overwritten by auto-enrichment)."""
+
+    title: str | None = None
+    artist_name: str | None = None
+    album_title: str | None = None
+    year: int | None = None
+    genre: str | None = None
+    track_number: int | None = None
+
+
+class RemoteTrackSave(BaseModel):
+    """Save a remote browse hit (§A4 discover) as a library placeholder —
+    ``availability="remote"``, no audio until first play (plan: Model C)."""
+
+    source: str
+    source_id: str = Field(min_length=1)
+    title: str
+    artist: str | None = None
+
+
+class MaterializeResponse(BaseModel):
+    """Result of requesting that a placeholder track's audio be fetched.
+
+    ``job`` is ``None`` when the track is already ``local`` — nothing to wait
+    for, the caller can stream immediately. Otherwise it's the (new or
+    already in-flight) job; poll ``GET /downloads/{job.id}`` until ``done``."""
+
+    track: TrackOut
+    job: DownloadJobOut | None
+22 -3
@@ -1,11 +1,19 @@
 """Album endpoints."""

 import uuid
-from typing import Any

 from fastapi import APIRouter, Query
+from fastapi.responses import StreamingResponse

-from app.api.deps import AlbumRepoDep, ArtistRepoDep, CurrentUser, TrackRepoDep
+from app.api.covers import stream_cover
+from app.api.deps import (
+    AlbumRepoDep,
+    ArtistRepoDep,
+    CurrentUser,
+    FileStorageDep,
+    StreamUser,
+    TrackRepoDep,
+)
 from app.api.schemas.album import AlbumOut
 from app.api.schemas.pagination import PagedResponse
 from app.api.schemas.track import TrackOut
@@ -30,6 +38,7 @@ async def _build_album_out(
             artist_name=artists[a.artist_id].name if a.artist_id in artists else "Unknown Artist",
             year=a.year,
             track_count=track_counts.get(a.id, 0),
+            has_cover=bool(a.cover_path),
             created_at=a.created_at,
         )
         for a in albums
@@ -109,4 +118,14 @@ async def get_album_tracks(
@router.get("/{album_id}/cover") @router.get("/{album_id}/cover")
async def get_album_cover(album_id: uuid.UUID, _: CurrentUser) -> Any: ... async def get_album_cover(
album_id: uuid.UUID,
album_repo: AlbumRepoDep,
storage: FileStorageDep,
_: StreamUser,
) -> StreamingResponse:
# ``<img>`` can't send a bearer header → StreamUser accepts ``?token=``.
album = await album_repo.get_by_id(album_id)
if album is None or not album.cover_path:
raise NotFoundError("Cover not found.")
return await stream_cover(storage, album.cover_path)
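Review note: since the cover endpoint accepts `?token=` for `<img>` tags, a client has to build that URL itself. A small sketch, assuming the albums router is mounted under `<api_base>/albums` (the mount point is not shown in this diff) and using a hypothetical helper name `album_cover_url`:

```python
import uuid
from urllib.parse import urlencode


def album_cover_url(api_base: str, album_id: uuid.UUID, token: str) -> str:
    # Assumption: the albums router is mounted at "<api_base>/albums".
    # urlencode escapes characters that would otherwise break the query string.
    query = urlencode({"token": token})
    return f"{api_base.rstrip('/')}/albums/{album_id}/cover?{query}"
```

Query-string tokens tend to end up in access logs and browser history, so a short-lived token is probably the safer choice for this path.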
+60 -18
@@ -1,36 +1,78 @@
"""Download job endpoints. Heavy work is dispatched to arq workers.""" """Download job endpoints (§A5). Heavy work is dispatched to arq workers — these
handlers only create/inspect/cancel/retry job records."""
import uuid import uuid
from typing import Any
from fastapi import APIRouter from fastapi import APIRouter, Query, Response
from app.api.deps import CurrentUser, DownloadServiceDep
from app.api.schemas.download import DownloadCreate, DownloadCreateResponse, DownloadJobOut
from app.api.schemas.pagination import PagedResponse
router = APIRouter(prefix="/downloads", tags=["downloads"]) router = APIRouter(prefix="/downloads", tags=["downloads"])
@router.get("") @router.get("")
async def list_downloads() -> Any: ... async def list_downloads(
service: DownloadServiceDep,
user: CurrentUser,
status: str | None = Query(default=None),
mine: bool = Query(default=False),
limit: int = Query(50, ge=1, le=200),
offset: int = Query(0, ge=0),
) -> PagedResponse[DownloadJobOut]:
jobs, total = await service.list(
requested_by=user.id if mine else None,
status=status,
limit=limit,
offset=offset,
)
return PagedResponse(
items=[DownloadJobOut.from_entity(j) for j in jobs],
total=total,
limit=limit,
offset=offset,
)
@router.post("") @router.post("", status_code=202)
async def create_download() -> Any: ... async def create_download(
body: DownloadCreate,
service: DownloadServiceDep,
user: CurrentUser,
) -> DownloadCreateResponse:
result = await service.request(
source=body.source,
source_id=body.source_id,
query=body.query,
requested_by=user.id,
)
return DownloadCreateResponse(
already_in_library=result.already_in_library,
track_id=result.track_id,
job=DownloadJobOut.from_entity(result.job) if result.job is not None else None,
)
@router.get("/{job_id}") @router.get("/{job_id}")
async def get_download(job_id: uuid.UUID) -> Any: ... async def get_download(
job_id: uuid.UUID, service: DownloadServiceDep, _: CurrentUser
) -> DownloadJobOut:
job = await service.get(job_id)
return DownloadJobOut.from_entity(job)
@router.delete("/{job_id}") @router.delete("/{job_id}", status_code=204)
async def cancel_download(job_id: uuid.UUID) -> Any: ... async def cancel_download(
job_id: uuid.UUID, service: DownloadServiceDep, _: CurrentUser
) -> Response:
await service.cancel(job_id)
return Response(status_code=204)
@router.post("/{job_id}/retry") @router.post("/{job_id}/retry")
async def retry_download(job_id: uuid.UUID) -> Any: ... async def retry_download(
job_id: uuid.UUID, service: DownloadServiceDep, _: CurrentUser
) -> DownloadJobOut:
@router.post("/pause") job = await service.retry(job_id)
async def pause_downloads() -> Any: ... return DownloadJobOut.from_entity(job)
@router.post("/resume")
async def resume_downloads() -> Any: ...
+27 -4
@@ -1,12 +1,11 @@
"""Search endpoints: global and library-scoped.""" """Search endpoints: global and library-scoped."""
from typing import Any
from fastapi import APIRouter, Query from fastapi import APIRouter, Query
from app.api.deps import AlbumRepoDep, ArtistRepoDep, CurrentUser, TrackRepoDep from app.api.deps import AlbumRepoDep, ArtistRepoDep, CurrentUser, SourceRegistryDep, TrackRepoDep
from app.api.schemas.album import AlbumOut from app.api.schemas.album import AlbumOut
from app.api.schemas.artist import ArtistOut from app.api.schemas.artist import ArtistOut
from app.api.schemas.external_search import ExternalSearchResponse, ExternalSearchResultOut
from app.api.schemas.search import LibrarySearchResponse from app.api.schemas.search import LibrarySearchResponse
from app.api.schemas.track import TrackOut from app.api.schemas.track import TrackOut
from app.api.v1.albums import _build_album_out from app.api.v1.albums import _build_album_out
@@ -16,7 +15,31 @@ router = APIRouter(prefix="/search", tags=["search"])
@router.get("") @router.get("")
async def search(_: CurrentUser) -> Any: ... async def search(
_: CurrentUser,
registry: SourceRegistryDep,
track_repo: TrackRepoDep,
q: str = Query(min_length=1),
limit: int = Query(20, ge=1, le=50),
) -> ExternalSearchResponse:
"""Search every available fetch source and merge the hits (§A4 discover).
A source that is down contributes nothing rather than failing the whole
request (graceful degradation); only available sources are reported as
searched. Each hit is checked against the library by ``(source,
source_id)`` so the UI can show "Saved"/"Play" instead of "Save to
library" without a separate round-trip (remote browse, plan: Model C)."""
results: list[ExternalSearchResultOut] = []
searched: list[str] = []
for backend in registry.searchables():
if not backend.is_available():
continue
searched.append(backend.name)
hits = await backend.search(q, limit=limit)
for h in hits:
existing = await track_repo.get_by_source(h.source, h.source_id)
results.append(ExternalSearchResultOut.from_entity(h, existing=existing))
return ExternalSearchResponse(results=results, searched_sources=searched)
@router.get("/library") @router.get("/library")
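Review note: the graceful-degradation rule in the `search` docstring (an unavailable source contributes nothing and is not reported as searched) can be checked in isolation. The sketch below mirrors the handler's loop with a hypothetical `FakeBackend`; it is not the registry's real interface, and it omits the per-hit library lookup.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeBackend:
    """Hypothetical stand-in for a searchable source backend."""

    name: str
    available: bool
    hits: list[str]

    def is_available(self) -> bool:
        return self.available

    async def search(self, q: str, *, limit: int) -> list[str]:
        return [h for h in self.hits if q in h][:limit]


async def search_all(
    backends: list[FakeBackend], q: str, limit: int = 20
) -> tuple[list[str], list[str]]:
    results: list[str] = []
    searched: list[str] = []
    for backend in backends:
        if not backend.is_available():
            continue  # a source that is down contributes nothing
        searched.append(backend.name)
        results.extend(await backend.search(q, limit=limit))
    return results, searched
```

Note that `limit` applies per backend here, as in the handler, so the merged list can exceed `limit` once more than one source is available.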
+23 -10
@@ -1,14 +1,13 @@
"""External source endpoints: enumerate sources and trigger imports. """External source endpoints: enumerate sources, search, and trigger imports.
Listing/health are read-only (any authenticated user). Scanning a source is an Listing/health/search are read-only (any authenticated user). Scanning a source
admin action and runs in a worker — the endpoint only enqueues it. is an admin action and runs in a worker — the endpoint only enqueues it.
""" """
from typing import Any from fastapi import APIRouter, Query
from fastapi import APIRouter from app.api.deps import CurrentUser, SourceRegistryDep, SuperUser, TrackRepoDep
from app.api.schemas.external_search import ExternalSearchResponse, ExternalSearchResultOut
from app.api.deps import CurrentUser, SourceRegistryDep, SuperUser
from app.api.schemas.source import ScanResponse, SourceHealthOut, SourceInfoOut from app.api.schemas.source import ScanResponse, SourceHealthOut, SourceInfoOut
from app.domain.errors import DependencyUnavailableError from app.domain.errors import DependencyUnavailableError
from app.workers.queue import enqueue from app.workers.queue import enqueue
@@ -39,6 +38,20 @@ async def source_health(
@router.get("/{source}/search") @router.get("/{source}/search")
async def search_source(source: str, _: CurrentUser) -> Any: async def search_source(
# Search is for fetch-style sources (youtube, …) — not yet implemented. source: str,
... _: CurrentUser,
registry: SourceRegistryDep,
track_repo: TrackRepoDep,
q: str = Query(min_length=1),
limit: int = Query(20, ge=1, le=50),
) -> ExternalSearchResponse:
backend = registry.searchable(source) # 404 if unknown, 422 if not searchable
if not backend.is_available():
raise DependencyUnavailableError(f"Source {source!r} is not available.")
results = await backend.search(q, limit=limit)
out: list[ExternalSearchResultOut] = []
for r in results:
existing = await track_repo.get_by_source(r.source, r.source_id)
out.append(ExternalSearchResultOut.from_entity(r, existing=existing))
return ExternalSearchResponse(results=out, searched_sources=[source])
+59 -1
@@ -4,11 +4,69 @@ from typing import Any
from fastapi import APIRouter from fastapi import APIRouter
from app.api.deps import (
AlbumRepoDep,
ArtistRepoDep,
CurrentUser,
FileStorageDep,
TrackRepoDep,
)
from app.api.schemas.storage import (
DiskUsageOut,
FormatBreakdownOut,
GenreCountOut,
StorageStatsOut,
)
router = APIRouter(prefix="/storage", tags=["storage"]) router = APIRouter(prefix="/storage", tags=["storage"])
# How many of the most common genres the dashboard surfaces.
_TOP_GENRES = 8
@router.get("") @router.get("")
async def get_storage_stats() -> Any: ... async def get_storage_stats(
track_repo: TrackRepoDep,
artist_repo: ArtistRepoDep,
album_repo: AlbumRepoDep,
storage: FileStorageDep,
_: CurrentUser,
) -> StorageStatsOut:
"""Library + disk statistics for the Storage dashboard (§A6).
Aggregates come from the catalogue (cheap GROUP BYs); ``disk`` reflects the
real backing volume and is ``None`` for backends without a fixed-capacity
disk (e.g. object stores)."""
stats = await track_repo.library_stats()
total_artists = await artist_repo.count(q=None)
total_albums = await album_repo.count(artist_id=None, q=None)
genres = await track_repo.genres()
disk = await storage.disk_usage()
return StorageStatsOut(
total_tracks=stats.total_tracks,
total_artists=total_artists,
total_albums=total_albums,
total_size=stats.total_size,
total_duration_seconds=stats.total_duration_seconds,
largest_track_size=stats.largest_track_size,
earliest_added=stats.earliest_added,
latest_added=stats.latest_added,
by_format=[
FormatBreakdownOut(
file_format=f.file_format,
track_count=f.track_count,
total_size=f.total_size,
)
for f in stats.by_format
],
by_metadata_status=stats.by_metadata_status,
by_source=stats.by_source,
top_genres=[
GenreCountOut(genre=genre, track_count=count) for genre, count in genres[:_TOP_GENRES]
],
disk=DiskUsageOut(total=disk.total, used=disk.used, free=disk.free) if disk else None,
)
@router.get("/duplicates") @router.get("/duplicates")
+175 -7
@@ -4,10 +4,30 @@ import uuid
from typing import Any from typing import Any
from fastapi import APIRouter, Query, Response from fastapi import APIRouter, Query, Response
from fastapi.responses import StreamingResponse
from app.api.deps import AlbumRepoDep, ArtistRepoDep, CurrentUser, FileStorageDep, TrackRepoDep from app.api.covers import resolve_album_for_track, stream_cover
from app.api.deps import (
AlbumRepoDep,
ArtistRepoDep,
CurrentUser,
FileStorageDep,
MetadataServiceDep,
RemoteLibraryServiceDep,
StreamUser,
TrackRepoDep,
)
from app.api.schemas.download import DownloadJobOut
from app.api.schemas.pagination import PagedResponse from app.api.schemas.pagination import PagedResponse
from app.api.schemas.track import TrackOut, TrackUpdate from app.api.schemas.track import (
MaterializeResponse,
MetadataApply,
MetadataMatch,
MetadataMatchesOut,
RemoteTrackSave,
TrackOut,
TrackUpdate,
)
from app.domain.entities.album import Album from app.domain.entities.album import Album
from app.domain.entities.track import Artist, Track from app.domain.entities.track import Artist, Track
from app.domain.errors import NotFoundError from app.domain.errors import NotFoundError
@@ -32,8 +52,15 @@ async def _build_track_out(
duration_seconds=t.duration_seconds, duration_seconds=t.duration_seconds,
file_format=t.file_format, file_format=t.file_format,
file_size=t.file_size, file_size=t.file_size,
genre=t.genre,
year=t.year,
track_number=t.track_number,
metadata_status=t.metadata_status, metadata_status=t.metadata_status,
metadata_error=t.metadata_error,
enriched_at=t.enriched_at,
availability=t.availability,
source=t.source, source=t.source,
has_cover=bool(t.album_id and albums.get(t.album_id) and albums[t.album_id].cover_path),
created_at=t.created_at, created_at=t.created_at,
) )
for t in tracks for t in tracks
@@ -49,6 +76,7 @@ async def list_tracks(
artist_id: uuid.UUID | None = None, artist_id: uuid.UUID | None = None,
album_id: uuid.UUID | None = None, album_id: uuid.UUID | None = None,
q: str | None = None, q: str | None = None,
source: str | None = Query(None, max_length=32),
sort_by: str = Query("created_at", pattern="^(title|created_at|artist)$"), sort_by: str = Query("created_at", pattern="^(title|created_at|artist)$"),
order: str = Query("desc", pattern="^(asc|desc)$"), order: str = Query("desc", pattern="^(asc|desc)$"),
limit: int = Query(50, ge=1, le=200), limit: int = Query(50, ge=1, le=200),
@@ -58,12 +86,13 @@ async def list_tracks(
artist_id=artist_id, artist_id=artist_id,
album_id=album_id, album_id=album_id,
q=q, q=q,
source=source,
sort_by=sort_by, sort_by=sort_by,
order=order, order=order,
limit=limit, limit=limit,
offset=offset, offset=offset,
) )
total = await track_repo.count(artist_id=artist_id, album_id=album_id, q=q) total = await track_repo.count(artist_id=artist_id, album_id=album_id, q=q, source=source)
artist_ids = list({t.artist_id for t in tracks}) artist_ids = list({t.artist_id for t in tracks})
album_ids = list({t.album_id for t in tracks if t.album_id is not None}) album_ids = list({t.album_id for t in tracks if t.album_id is not None})
@@ -74,6 +103,57 @@ async def list_tracks(
return PagedResponse(items=items, total=total, limit=limit, offset=offset) return PagedResponse(items=items, total=total, limit=limit, offset=offset)
@router.post("/remote", status_code=201)
async def save_remote_track(
body: RemoteTrackSave,
service: RemoteLibraryServiceDep,
artist_repo: ArtistRepoDep,
album_repo: AlbumRepoDep,
user: CurrentUser,
) -> TrackOut:
"""Save a remote browse hit (§A4 discover) as a library placeholder —
no audio is fetched yet (plan: Model C). Idempotent on ``(source,
source_id)``: saving an already-saved hit returns the existing track."""
track = await service.save_remote(
source=body.source,
source_id=body.source_id,
title=body.title,
artist=body.artist,
added_by=user.id,
)
artists = {a.id: a for a in await artist_repo.get_many([track.artist_id])}
album_ids = [track.album_id] if track.album_id else []
albums = {a.id: a for a in await album_repo.get_many(album_ids)}
items = await _build_track_out([track], artists, albums)
return items[0]
@router.post("/{track_id}/materialize")
async def materialize_track(
track_id: uuid.UUID,
service: RemoteLibraryServiceDep,
artist_repo: ArtistRepoDep,
album_repo: AlbumRepoDep,
user: CurrentUser,
) -> MaterializeResponse:
"""Fetch a placeholder track's audio on demand (plan: Model C lazy
materialization). Already-local tracks return ``job=None`` — nothing to
wait for. Otherwise poll ``GET /downloads/{job.id}`` until ``done``, then
stream as usual."""
outcome = await service.request_materialize(track_id, requested_by=user.id)
artists = {a.id: a for a in await artist_repo.get_many([outcome.track.artist_id])}
album_ids = [outcome.track.album_id] if outcome.track.album_id else []
albums = {a.id: a for a in await album_repo.get_many(album_ids)}
track_out = (await _build_track_out([outcome.track], artists, albums))[0]
return MaterializeResponse(
track=track_out,
job=DownloadJobOut.from_entity(outcome.job) if outcome.job is not None else None,
)
@router.get("/{track_id}") @router.get("/{track_id}")
async def get_track( async def get_track(
track_id: uuid.UUID, track_id: uuid.UUID,
@@ -131,7 +211,8 @@ async def delete_track(
if track is None: if track is None:
raise NotFoundError(f"Track {track_id} not found.") raise NotFoundError(f"Track {track_id} not found.")
await track_repo.delete(track_id) await track_repo.delete(track_id)
await storage.delete(track.storage_uri) if track.storage_uri is not None:
await storage.delete(track.storage_uri)
return Response(status_code=204) return Response(status_code=204)
@@ -144,7 +225,19 @@ async def optimize_track(track_id: uuid.UUID, _: CurrentUser) -> Any: ...
@router.get("/{track_id}/cover") @router.get("/{track_id}/cover")
async def get_track_cover(track_id: uuid.UUID, _: CurrentUser) -> Any: ... async def get_track_cover(
track_id: uuid.UUID,
track_repo: TrackRepoDep,
album_repo: AlbumRepoDep,
storage: FileStorageDep,
_: StreamUser,
) -> StreamingResponse:
# A track's cover is its album's cover. ``<img>`` can't send a bearer
# header → StreamUser accepts ``?token=``.
album = await resolve_album_for_track(track_repo, album_repo, track_id)
if album is None or not album.cover_path:
raise NotFoundError("Cover not found.")
return await stream_cover(storage, album.cover_path)
@router.post("/{track_id}/metadata/enrich") @router.post("/{track_id}/metadata/enrich")
@@ -163,8 +256,83 @@ async def enrich_metadata(
@router.get("/{track_id}/metadata/matches") @router.get("/{track_id}/metadata/matches")
async def get_metadata_matches(track_id: uuid.UUID, _: CurrentUser) -> Any: ... async def get_metadata_matches(
track_id: uuid.UUID,
track_repo: TrackRepoDep,
metadata_service: MetadataServiceDep,
_: CurrentUser,
) -> MetadataMatchesOut:
"""AcoustID candidates for the metadata editor's match picker (§A7).
Runs the fingerprint lookup inline (single track, user-triggered) and
never mutates the track. Degrades to an empty list if fpcalc/AcoustID are
unavailable or no match is found.
"""
track = await track_repo.get_by_id(track_id)
if track is None:
raise NotFoundError(f"Track {track_id} not found.")
matches = await metadata_service.find_matches(track_id)
return MetadataMatchesOut(
items=[
MetadataMatch(
acoustid=m.acoustid,
score=m.score,
recording_mbid=m.recording_mbid,
release_group_mbid=m.release_group_mbid,
title=m.title,
artist=m.artist,
album=m.album,
year=m.year,
)
for m in matches
]
)
@router.put("/{track_id}/metadata") @router.put("/{track_id}/metadata")
async def set_metadata(track_id: uuid.UUID, _: CurrentUser) -> Any: ... async def set_metadata(
track_id: uuid.UUID,
body: MetadataApply,
track_repo: TrackRepoDep,
artist_repo: ArtistRepoDep,
album_repo: AlbumRepoDep,
_: CurrentUser,
) -> TrackOut:
"""Apply manual edits or an accepted AcoustID match (§A7). Sets
``metadata_status = manual`` — never overwritten by auto-enrichment."""
track = await track_repo.get_by_id(track_id)
if track is None:
raise NotFoundError(f"Track {track_id} not found.")
artist_id: uuid.UUID | None = None
if body.artist_name:
artist = await artist_repo.get_or_create(body.artist_name)
artist_id = artist.id
album_id: uuid.UUID | None = None
if body.album_title:
album = await album_repo.get_or_create(
title=body.album_title,
artist_id=artist_id or track.artist_id,
year=body.year,
musicbrainz_id=None,
)
album_id = album.id
track = await track_repo.update(
track_id,
title=body.title,
genre=body.genre,
year=body.year,
artist_id=artist_id,
album_id=album_id,
track_number=body.track_number,
)
artist_ids = [track.artist_id]
album_ids = [track.album_id] if track.album_id else []
artists = {a.id: a for a in await artist_repo.get_many(artist_ids)}
albums = {a.id: a for a in await album_repo.get_many(album_ids)}
items = await _build_track_out([track], artists, albums)
return items[0]
+183
@@ -0,0 +1,183 @@
"""DownloadService — request external downloads and import their results.
Two roles (plan §6.1):
* **Request side** (HTTP): validate + dedup a download request, create a
``queued`` job, and enqueue the worker. Dedup is on ``(source, source_id)``
against both the library (already imported) and in-flight jobs (a double-click
must not queue twice) — idempotency per CLAUDE.md.
* **Worker side**: ``store_result`` turns a backend's :class:`DownloadResult`
into a managed file + minimal ``pending`` track (sibling of
:class:`~app.application.import_service.LibraryImportService`); enrichment
(§6.2) fills the rest.
The fingerprint-level dedup (a different id that turns out to be the same audio)
happens later in enrichment, where the fingerprint is computed.
"""
import contextlib
import uuid
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
import anyio
from app.core.logging import get_logger
from app.domain.entities.download import DownloadJob
from app.domain.errors import NotFoundError, ValidationError
from app.domain.ports import (
ArtistRepository,
DownloadJobRepository,
FileStorage,
TrackRepository,
)
from app.domain.sources import DownloadResult
log = get_logger(__name__)
_UNKNOWN_ARTIST = "Unknown Artist"
# Callables that enqueue worker tasks: DownloadEnqueuer takes a job id,
# EnrichEnqueuer takes a track id. Enqueueing is deferred so the row is
# committed before the worker reads it.
DownloadEnqueuer = Callable[[uuid.UUID], Awaitable[None]]
EnrichEnqueuer = Callable[[uuid.UUID], Awaitable[None]]
@dataclass(frozen=True)
class DownloadRequest:
"""Outcome of asking for a download.
Exactly one of three states holds: the item is already in the library
(``track_id`` set, ``already_in_library`` true, ``job`` is ``None``); an
in-flight job already covers it (``job`` set to that job); or a new job was
just created (``job`` set to it). In the last two cases the UI can route to
the download manager.
"""
job: DownloadJob | None
track_id: uuid.UUID | None
already_in_library: bool
class DownloadService:
def __init__(
self,
*,
jobs: DownloadJobRepository,
tracks: TrackRepository,
artists: ArtistRepository,
storage: FileStorage,
enqueue_download: DownloadEnqueuer | None = None,
enqueue_enrich: EnrichEnqueuer | None = None,
) -> None:
self._jobs = jobs
self._tracks = tracks
self._artists = artists
self._storage = storage
self._enqueue_download = enqueue_download
self._enqueue_enrich = enqueue_enrich
# -- request side ---------------------------------------------------------
async def request(
self,
*,
source: str,
source_id: str,
query: str | None,
requested_by: uuid.UUID | None,
) -> DownloadRequest:
source_id = source_id.strip()
if not source_id:
raise ValidationError("A source_id is required to download.")
existing = await self._tracks.get_by_source(source, source_id)
if existing is not None:
return DownloadRequest(job=None, track_id=existing.id, already_in_library=True)
active = await self._jobs.get_active_for_source(source, source_id)
if active is not None:
return DownloadRequest(job=active, track_id=None, already_in_library=False)
job = await self._jobs.add(
source=source,
source_id=source_id,
query=query,
requested_by=requested_by,
)
if self._enqueue_download is not None:
await self._enqueue_download(job.id)
return DownloadRequest(job=job, track_id=None, already_in_library=False)
async def list(
self,
*,
requested_by: uuid.UUID | None,
status: str | None,
limit: int,
offset: int,
) -> tuple[list[DownloadJob], int]:
jobs = await self._jobs.list(
requested_by=requested_by, status=status, limit=limit, offset=offset
)
total = await self._jobs.count(requested_by=requested_by, status=status)
return jobs, total
async def get(self, job_id: uuid.UUID) -> DownloadJob:
job = await self._jobs.get_by_id(job_id)
if job is None:
raise NotFoundError(f"Download job {job_id} not found.")
return job
async def cancel(self, job_id: uuid.UUID) -> None:
"""Remove the job record. True mid-flight cancellation of an in-progress
yt-dlp download is out of scope (MVP); the worker tolerates a vanished
job row (its status writes become no-ops)."""
job = await self._jobs.get_by_id(job_id)
if job is None:
raise NotFoundError(f"Download job {job_id} not found.")
await self._jobs.delete(job_id)
async def retry(self, job_id: uuid.UUID) -> DownloadJob:
job = await self.get(job_id)
await self._jobs.set_status(job_id, status="queued", error_message=None)
if self._enqueue_download is not None:
await self._enqueue_download(job_id)
refreshed = await self._jobs.get_by_id(job_id)
return refreshed if refreshed is not None else job
# -- worker side ----------------------------------------------------------
async def store_result(
self,
*,
source: str,
result: DownloadResult,
requested_by: uuid.UUID | None,
) -> uuid.UUID:
"""Store a freshly downloaded file and create a minimal ``pending`` track.
Returns the new track id (the caller enqueues enrichment after commit).
The temp file produced by the backend is always removed."""
track_id = uuid.uuid4()
key = f"tracks/{str(track_id)[:2]}/{track_id}.{result.file_format}"
try:
await self._storage.save_file(key, result.path)
try:
artist = await self._artists.get_or_create(_UNKNOWN_ARTIST)
await self._tracks.add(
id=track_id,
title=result.suggested_title,
artist_id=artist.id,
storage_uri=key,
file_format=result.file_format,
file_size=result.file_size,
source=source,
source_id=result.source_id,
metadata_status="pending",
added_by=requested_by,
)
except Exception:
with contextlib.suppress(Exception):
await self._storage.delete(key)
raise
finally:
with contextlib.suppress(Exception):
await anyio.Path(result.path).unlink(missing_ok=True)
return track_id
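Review note: the request-side dedup order (library first, then in-flight jobs, then create and enqueue) is the core of the idempotency claim in the module docstring. A minimal in-memory sketch of that decision order follows. `InMemoryDownloads` and its string outcomes are hypothetical and stand in for the real repositories and `DownloadRequest`.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryDownloads:
    """Hypothetical stand-in for the track and job repositories."""

    library: dict[tuple[str, str], uuid.UUID] = field(default_factory=dict)
    active: dict[tuple[str, str], uuid.UUID] = field(default_factory=dict)
    enqueued: list[uuid.UUID] = field(default_factory=list)

    def request(self, source: str, source_id: str) -> tuple[str, uuid.UUID]:
        source_id = source_id.strip()
        if not source_id:
            raise ValueError("A source_id is required to download.")
        key = (source, source_id)
        if key in self.library:  # already imported
            return "in_library", self.library[key]
        if key in self.active:  # a double-click must not queue twice
            return "existing_job", self.active[key]
        job_id = uuid.uuid4()
        self.active[key] = job_id
        self.enqueued.append(job_id)
        return "new_job", job_id
```

In the real service the two lookups and the insert are separate awaits, so two concurrent requests could both pass the checks. A unique constraint on active `(source, source_id)` jobs would be one way to close that gap, if the schema does not already have one.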
+145 -13
@@ -12,10 +12,14 @@ Invariants (plan §6.2, CLAUDE.md):
- **Idempotent** — re-running only fills gaps; ``apply_enrichment`` never erases. - **Idempotent** — re-running only fills gaps; ``apply_enrichment`` never erases.
""" """
import tempfile
import uuid import uuid
from dataclasses import dataclass from dataclasses import dataclass
from pathlib import Path
from app.core.logging import get_logger from app.core.logging import get_logger
from app.domain.entities.album import Album
from app.domain.entities.cover import CoverArt
from app.domain.entities.metadata import AudioTags, RecordingMatch from app.domain.entities.metadata import AudioTags, RecordingMatch
from app.domain.ports import ( from app.domain.ports import (
AcoustIdClient, AcoustIdClient,
@@ -23,6 +27,8 @@ from app.domain.ports import (
ArtistRepository, ArtistRepository,
AudioFingerprinter, AudioFingerprinter,
AudioTagReader, AudioTagReader,
CoverArtExtractor,
CoverArtProvider,
FileStorage, FileStorage,
TrackRepository, TrackRepository,
) )
@@ -50,6 +56,9 @@ class MetadataEnrichmentService:
tag_reader: AudioTagReader, tag_reader: AudioTagReader,
fingerprinter: AudioFingerprinter, fingerprinter: AudioFingerprinter,
acoustid: AcoustIdClient, acoustid: AcoustIdClient,
cover_extractor: CoverArtExtractor | None = None,
cover_provider: CoverArtProvider | None = None,
acoustid_trust_score: float = 0.85,
) -> None: ) -> None:
self._tracks = tracks self._tracks = tracks
self._artists = artists self._artists = artists
@@ -58,6 +67,9 @@ class MetadataEnrichmentService:
self._tag_reader = tag_reader self._tag_reader = tag_reader
self._fingerprinter = fingerprinter self._fingerprinter = fingerprinter
self._acoustid = acoustid self._acoustid = acoustid
self._cover_extractor = cover_extractor
self._cover_provider = cover_provider
self._acoustid_trust_score = acoustid_trust_score
async def enrich(self, track_id: uuid.UUID) -> EnrichmentResult: async def enrich(self, track_id: uuid.UUID) -> EnrichmentResult:
track = await self._tracks.get_by_id(track_id) track = await self._tracks.get_by_id(track_id)
@@ -67,20 +79,39 @@ class MetadataEnrichmentService:
if track.metadata_status == "manual": if track.metadata_status == "manual":
log.info("enrich_skip_manual", track_id=str(track_id)) log.info("enrich_skip_manual", track_id=str(track_id))
return EnrichmentResult(track_id=track_id, status="skipped") return EnrichmentResult(track_id=track_id, status="skipped")
storage_uri = track.storage_uri
if storage_uri is None:
log.info("enrich_skip_remote", track_id=str(track_id))
return EnrichmentResult(track_id=track_id, status="skipped")
tags = await self._read_local(track.storage_uri) tags = await self._read_local(storage_uri)
match = await self._identify(track.storage_uri) match = await self._identify(storage_uri)
# Merge sources: prefer embedded tags, fall back to the AcoustID match. # Merge order is tag-first by default — embedded tags fix the common
# ``title`` is guaranteed non-None by the existing track title; the rest # well-tagged offline case. But a *high-confidence* AcoustID match is the
# stay None when neither source has them. # more trustworthy identity (downloaded files routinely carry junk tags
# like "Music Track"/"Sound_12345"), so above the trust threshold the
# acoustic match wins for the identity fields and tags become fallback.
tag_title = tags.title if tags else None tag_title = tags.title if tags else None
tag_artist = tags.artist if tags else None tag_artist = tags.artist if tags else None
tag_album = tags.album if tags else None tag_album = tags.album if tags else None
title = _opt_str(tag_title, match.title if match else None) or track.title match_title = match.title if match else None
artist_name = _opt_str(tag_artist, match.artist if match else None) match_artist = match.artist if match else None
album_title = _opt_str(tag_album, match.album if match else None) match_album = match.album if match else None
year = _first_int(tags.year if tags else None, match.year if match else None) match_year = match.year if match else None
tag_year = tags.year if tags else None
trust_match = match is not None and match.score >= self._acoustid_trust_score
if trust_match:
title = _opt_str(match_title, tag_title) or track.title
artist_name = _opt_str(match_artist, tag_artist)
album_title = _opt_str(match_album, tag_album)
year = _first_int(match_year, tag_year)
else:
title = _opt_str(tag_title, match_title) or track.title
artist_name = _opt_str(tag_artist, match_artist)
album_title = _opt_str(tag_album, match_album)
year = _first_int(tag_year, match_year)
genre = tags.genre if tags else None genre = tags.genre if tags else None
track_number = tags.track_number if tags else None track_number = tags.track_number if tags else None
duration = _first_int( duration = _first_int(
@@ -92,10 +123,21 @@ class MetadataEnrichmentService:
acoustid_id = match.acoustid if match else None acoustid_id = match.acoustid if match else None
artist_id = await self._resolve_artist(artist_name, fallback=track.artist_id) artist_id = await self._resolve_artist(artist_name, fallback=track.artist_id)
album_id = await self._resolve_album(album_title, artist_id=artist_id, year=year, mbid=mbid) album = await self._resolve_album(album_title, artist_id=artist_id, year=year, mbid=mbid)
album_id = album.id if album is not None else None
if album is not None:
await self._resolve_cover(
album,
storage_uri=storage_uri,
release_group_mbid=match.release_group_mbid if match else None,
)
identified = bool(artist_name) or album_id is not None or mbid is not None identified = bool(artist_name) or album_id is not None or mbid is not None
status = "enriched" if identified else "failed" status = "enriched" if identified else "failed"
# On a clean "no identity" outcome, record *why* so the UI shows a reason
# rather than a bare "failed". A successful run clears any prior error.
metadata_error = None if identified else self._no_match_reason()
await self._tracks.apply_enrichment( await self._tracks.apply_enrichment(
track_id, track_id,
@@ -110,10 +152,45 @@ class MetadataEnrichmentService:
acoustid_fingerprint=acoustid_id, acoustid_fingerprint=acoustid_id,
musicbrainz_id=mbid, musicbrainz_id=mbid,
metadata_status=status, metadata_status=status,
metadata_error=metadata_error,
) )
log.info("enrich_complete", track_id=str(track_id), status=status, mbid=mbid) log.info("enrich_complete", track_id=str(track_id), status=status, mbid=mbid)
return EnrichmentResult(track_id=track_id, status=status, matched_mbid=mbid) return EnrichmentResult(track_id=track_id, status=status, matched_mbid=mbid)
def _no_match_reason(self) -> str:
"""Explain a ``failed`` (no-identity) run in terms a user can act on:
which optional identification step was unavailable, if any."""
if not self._fingerprinter.is_available():
return "No metadata match: audio fingerprinting (fpcalc) is unavailable."
if not self._acoustid.is_available():
return "No metadata match: AcoustID lookup is unavailable (no API key)."
return "No metadata match found in tags or AcoustID."
async def find_matches(self, track_id: uuid.UUID) -> list[RecordingMatch]:
"""AcoustID candidates for the metadata editor's match picker (§A7).
Read-only — unlike :meth:`enrich`, never touches the track. Runs
inline (single track, user-triggered) rather than via the worker.
Degrades to ``[]`` whenever fingerprinting/AcoustID is unavailable or
the file can't be read, same as the enrichment pipeline.
"""
track = await self._tracks.get_by_id(track_id)
if track is None:
return []
if not self._acoustid.is_available() or not self._fingerprinter.is_available():
return []
if track.storage_uri is None:
return []
try:
async with self._storage.as_local_path(track.storage_uri) as path:
fingerprint = await self._fingerprinter.calculate(path)
if fingerprint is None:
return []
return await self._acoustid.lookup_all(fingerprint)
except Exception:
log.warning("find_matches_failed", track_id=str(track_id))
return []
async def _read_local(self, storage_uri: str) -> AudioTags | None: async def _read_local(self, storage_uri: str) -> AudioTags | None:
try: try:
async with self._storage.as_local_path(storage_uri) as path: async with self._storage.as_local_path(storage_uri) as path:
@@ -148,16 +225,71 @@ class MetadataEnrichmentService:
artist_id: uuid.UUID,
year: int | None,
mbid: str | None,
- ) -> uuid.UUID | None:
+ ) -> Album | None:
if not title:
return None
- album = await self._albums.get_or_create(
+ return await self._albums.get_or_create(
title=title,
artist_id=artist_id,
year=year,
musicbrainz_id=mbid,
)
- return album.id
async def _resolve_cover(
self,
album: Album,
*,
storage_uri: str,
release_group_mbid: str | None,
) -> None:
"""Fill in an album cover when it has none. Source order mirrors the
tag-first pipeline: embedded artwork (offline) → Cover Art Archive
(network, by release-group). Best-effort — any failure is swallowed so a
missing cover never affects enrichment status."""
if album.cover_path:
return # already has one — never overwrite (idempotent)
cover = await self._extract_cover(storage_uri)
if cover is None:
cover = await self._fetch_cover(release_group_mbid)
if cover is None:
return
try:
key = await self._save_cover(album.id, cover)
await self._albums.set_cover_path(album.id, key)
log.info("cover_resolved", album_id=str(album.id), content_type=cover.content_type)
except Exception:
log.warning("cover_save_failed", album_id=str(album.id))
async def _extract_cover(self, storage_uri: str) -> CoverArt | None:
if self._cover_extractor is None:
return None
try:
async with self._storage.as_local_path(storage_uri) as path:
return await self._cover_extractor.extract(path)
except Exception:
log.warning("cover_extract_step_failed", storage_uri=storage_uri)
return None
async def _fetch_cover(self, release_group_mbid: str | None) -> CoverArt | None:
if self._cover_provider is None or not release_group_mbid:
return None
if not self._cover_provider.is_available():
return None
try:
return await self._cover_provider.fetch_release_group(release_group_mbid)
except Exception:
log.warning("cover_fetch_step_failed", release_group=release_group_mbid)
return None
async def _save_cover(self, album_id: uuid.UUID, cover: CoverArt) -> str:
key = f"covers/{album_id}.{cover.extension}"
with tempfile.NamedTemporaryFile(suffix=f".{cover.extension}") as tmp:
tmp.write(cover.data)
tmp.flush()
await self._storage.save_file(key, Path(tmp.name))
return key
def _opt_str(*values: str | None) -> str | None:
+122
@@ -0,0 +1,122 @@
"""RemoteLibraryService — save-to-library + materialize for remote browse hits
(plan: Model C, on-demand YTM library).
Two operations:
* ``save_remote`` persists a placeholder ``Track`` (``availability="remote"``,
``storage_uri=None``) for a remote browse hit. Idempotent on
``(source, source_id)`` — CLAUDE.md dedup.
* ``request_materialize`` lazily fills a placeholder's audio in place: it
creates (or reuses) a ``DownloadJob`` pointing at the existing track and
enqueues the materialize worker, which calls ``TrackRepository.materialize``
on completion. ``track.id`` never changes (CLAUDE.md), so likes/playlists/
queue entries referencing the placeholder keep working once it's filled in.
"""
import uuid
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from app.domain.entities.download import DownloadJob
from app.domain.entities.track import Track
from app.domain.errors import NotFoundError, ValidationError
from app.domain.ports import ArtistRepository, DownloadJobRepository, TrackRepository
_UNKNOWN_ARTIST = "Unknown Artist"
# (job_id) -> None — enqueue the materialize worker, same deferred pattern as
# download/enrich enqueuers.
MaterializeEnqueuer = Callable[[uuid.UUID], Awaitable[None]]
@dataclass(frozen=True)
class MaterializeOutcome:
"""Result of requesting materialization.
``job`` is ``None`` when the track is already ``local`` — nothing to do,
the caller can stream immediately. Otherwise it's the (new or already
in-flight) job filling the placeholder."""
track: Track
job: DownloadJob | None
class RemoteLibraryService:
def __init__(
self,
*,
tracks: TrackRepository,
artists: ArtistRepository,
jobs: DownloadJobRepository,
enqueue_materialize: MaterializeEnqueuer | None = None,
) -> None:
self._tracks = tracks
self._artists = artists
self._jobs = jobs
self._enqueue_materialize = enqueue_materialize
async def save_remote(
self,
*,
source: str,
source_id: str,
title: str,
artist: str | None,
added_by: uuid.UUID | None,
) -> Track:
"""Persist a placeholder for a remote browse hit. Idempotent: a hit
already saved (by ``(source, source_id)``) is returned as-is."""
source_id = source_id.strip()
if not source_id:
raise ValidationError("A source_id is required to save.")
existing = await self._tracks.get_by_source(source, source_id)
if existing is not None:
return existing
artist_entity = await self._artists.get_or_create(artist or _UNKNOWN_ARTIST)
return await self._tracks.add(
id=uuid.uuid4(),
title=title,
artist_id=artist_entity.id,
storage_uri=None,
file_format=None,
file_size=None,
source=source,
source_id=source_id,
metadata_status="pending",
added_by=added_by,
availability="remote",
)
async def request_materialize(
self, track_id: uuid.UUID, *, requested_by: uuid.UUID | None
) -> MaterializeOutcome:
"""Kick off (or report on) materializing a placeholder track.
Already-local tracks are a no-op (``job=None``). A track with no
remote ``source_id`` (e.g. a deleted upload row reused for something
else) can't be materialized."""
track = await self._tracks.get_by_id(track_id)
if track is None:
raise NotFoundError(f"Track {track_id} not found.")
if track.availability == "local":
return MaterializeOutcome(track=track, job=None)
if track.source_id is None:
raise ValidationError("Track has no remote source to materialize from.")
active = await self._jobs.get_active_for_source(track.source, track.source_id)
if active is not None:
return MaterializeOutcome(track=track, job=active)
job = await self._jobs.add(
source=track.source,
source_id=track.source_id,
query=None,
requested_by=requested_by,
)
await self._jobs.set_status(job.id, status="queued", track_id=track.id)
if self._enqueue_materialize is not None:
await self._enqueue_materialize(job.id)
refreshed = await self._jobs.get_by_id(job.id)
return MaterializeOutcome(track=track, job=refreshed if refreshed is not None else job)
+6 -3
@@ -72,16 +72,19 @@ class StreamingService:
track = await self._tracks.get_by_id(track_id)
if track is None:
raise NotFoundError("Track not found.")
storage_uri = track.storage_uri
if storage_uri is None:
raise NotFoundError("Track is not yet downloaded.")
- stat = await self._storage.stat(track.storage_uri)
+ stat = await self._storage.stat(storage_uri)
total_size = stat.size
content_type = stat.content_type or _FORMAT_CONTENT_TYPE.get(
- track.file_format.lower(), "application/octet-stream"
+ (track.file_format or "").lower(), "application/octet-stream"
)
start, end, is_partial = _parse_range(range_header, total_size)
- stream, _ = await self._storage.open_range(track.storage_uri, start, end)
+ stream, _ = await self._storage.open_range(storage_uri, start, end)
actual_end = end if end is not None else total_size - 1
content_length = actual_end - start + 1
+46 -1
@@ -5,12 +5,25 @@ development). Access the cached singleton via :func:`get_settings`.
"""
from functools import lru_cache
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path
from typing import Literal
from pydantic import Field, SecretStr, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict
# App identity for outbound API calls (e.g. the MusicBrainz/AcoustID
# User-Agent). Name is fixed; version comes from the installed package.
APP_NAME = "MCMA"
_PROJECT_URL = "https://git.ollyhearn.ru/olly/mcma-backend"
def app_version() -> str:
try:
return version("mcma-backend")
except PackageNotFoundError:
return "0.0.0"
class Settings(BaseSettings):
model_config = SettingsConfigDict(
@@ -58,6 +71,9 @@ class Settings(BaseSettings):
media_path: Path = Path("/data/media")
transcode_cache_path: Path = Path("/data/transcode-cache")
max_parallel_downloads: int = 2
# How many times the download worker retries a failed fetch (yt-dlp fails
# often) before marking the job ``failed`` — exponential backoff between tries.
download_max_retries: int = 3
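The comment above mentions exponential backoff between retries, but the diff shows neither the base delay nor the exact formula. The sketch below is therefore an assumption throughout: `backoff_delays`, the 2-second base and the 60-second cap are hypothetical choices that only illustrate how `download_max_retries` could bound such a schedule.

```python
def backoff_delays(
    max_retries: int, *, base_seconds: float = 2.0, cap_seconds: float = 60.0
) -> list[float]:
    """Delay before each retry: doubles per attempt, never above the cap.

    base_seconds and cap_seconds are illustrative values, not project settings.
    """
    return [min(cap_seconds, base_seconds * 2**attempt) for attempt in range(max_retries)]
```

With the default of three retries this gives waits of 2, 4 and 8 seconds before the job would be marked `failed`.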
storage_backend: Literal["local", "s3"] = "local"
upload_tmp_dir: Path | None = None
@@ -77,7 +93,21 @@ class Settings(BaseSettings):
ml_service_url: str | None = None
acoustid_api_key: SecretStr | None = None
acoustid_api_url: str = "https://api.acoustid.org/v2/lookup"
- musicbrainz_user_agent: str = "mcma-backend/0.1.0 ( https://github.com/your/repo )"
# Above this AcoustID match score, trust the acoustic identification over
# embedded file tags (which are frequently junk on downloaded files —
# e.g. "Music Track" / "Sound_12345"). Below it, keep the tag-first merge.
acoustid_trust_score: float = 0.85
# MusicBrainz/AcoustID require a meaningful User-Agent identifying the
# application and a way to contact its maintainer (see
# https://musicbrainz.org/doc/XML_Web_Service/Rate_Limiting). Self-hosted
# deployments should set their own contact email; see
# ``musicbrainz_user_agent`` below for how it's used.
musicbrainz_owner_email: str | None = None
# ``youtube`` fetch source (search + download via ytmusicapi/yt-dlp). Enabled
# by default; the source still reports unavailable if the libs aren't present.
youtube_enabled: bool = True
# Optional cookies file (Netscape format) for yt-dlp — lets it fetch
# age-restricted / region-locked items via an authenticated session.
youtube_cookies_path: Path | None = None
# -- enrichment -------------------------------------------------------
@@ -85,6 +115,11 @@ class Settings(BaseSettings):
# image installs it via libchromaprint-tools.
fpcalc_path: str = "fpcalc"
# Cover Art Archive — network fallback for album covers (after embedded art).
# Disable to keep enrichment fully offline; embedded artwork still works.
coverart_enabled: bool = True
coverart_base_url: str = "https://coverartarchive.org"
@field_validator("database_url")
@classmethod
def _require_async_driver(cls, v: str) -> str:
@@ -96,6 +131,16 @@ class Settings(BaseSettings):
def is_prod(self) -> bool:
return self.environment == "prod"
@property
def musicbrainz_user_agent(self) -> str:
"""User-Agent sent to MusicBrainz/AcoustID: ``MCMA/<version> ( <contact> )``.
Falls back to the project URL if the deployment hasn't set
``musicbrainz_owner_email``.
"""
contact = self.musicbrainz_owner_email or _PROJECT_URL
return f"{APP_NAME}/{app_version()} ( {contact} )"
@lru_cache
def get_settings() -> Settings:
+13 -1
@@ -1,11 +1,18 @@
"""Domain entities and value objects — pure, framework-free."""
from app.domain.entities.album import Album
from app.domain.entities.cover import CoverArt
from app.domain.entities.download import DownloadJob
from app.domain.entities.history import PlayHistoryEntry
from app.domain.entities.like import Like
from app.domain.entities.metadata import AudioTags, Fingerprint, RecordingMatch
from app.domain.entities.playlist import Playlist
- from app.domain.entities.storage import ObjectStat
+ from app.domain.entities.storage import (
DiskUsage,
FormatBreakdown,
LibraryStats,
ObjectStat,
)
from app.domain.entities.track import Artist, Track
from app.domain.entities.user import Credentials, SubsonicCredentials, User
@@ -13,8 +20,13 @@ __all__ = [
"Album",
"Artist",
"AudioTags",
"CoverArt",
"Credentials",
"DiskUsage",
"DownloadJob",
"Fingerprint",
"FormatBreakdown",
"LibraryStats",
"Like",
"ObjectStat",
"PlayHistoryEntry",
+2
@@ -13,5 +13,7 @@ class Album:
year: int | None
cover_path: str | None
musicbrainz_id: str | None
source: str | None
source_id: str | None
created_at: dt.datetime
updated_at: dt.datetime
+28
@@ -0,0 +1,28 @@
"""Cover-art value object — raw image bytes plus their MIME type.
Crosses the domain boundary between the cover sources (embedded extractor,
Cover Art Archive) and the storage/serving layers. The bytes are the encoded
image as-is; we never decode/resize in Phase 1.
"""
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class CoverArt:
data: bytes
content_type: str # "image/jpeg" | "image/png" | …
@property
def extension(self) -> str:
"""File extension for the content type (no leading dot)."""
return _EXT_BY_TYPE.get(self.content_type.lower(), "jpg")
_EXT_BY_TYPE: dict[str, str] = {
"image/jpeg": "jpg",
"image/jpg": "jpg",
"image/png": "png",
"image/webp": "webp",
"image/gif": "gif",
}
+26
@@ -0,0 +1,26 @@
"""Download job domain entity (plan §6.1).
A queued fetch from an external source, tracked through its lifecycle so the UI
download manager (screen §A5) can show progress, errors, and retries. The
``status`` strings mirror :class:`~app.infrastructure.db.models.enums.DownloadStatus`.
"""
import datetime as dt
import uuid
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class DownloadJob:
id: uuid.UUID
source: str
source_id: str | None
query: str | None
requested_by: uuid.UUID | None
status: str
progress: float
error_message: str | None
retry_count: int
track_id: uuid.UUID | None
created_at: dt.datetime
updated_at: dt.datetime
+1
@@ -47,6 +47,7 @@ class RecordingMatch:
acoustid: str
score: float
recording_mbid: str | None = None
release_group_mbid: str | None = None
title: str | None = None
artist: str | None = None
album: str | None = None
+37
@@ -1,5 +1,6 @@
"""Value objects for file storage."""
import datetime as dt
from dataclasses import dataclass
@@ -7,3 +8,39 @@ from dataclasses import dataclass
class ObjectStat:
size: int
content_type: str | None
@dataclass(frozen=True, slots=True)
class DiskUsage:
"""Capacity of the volume backing the media store. ``FileStorage.disk_usage``
returns ``None`` instead of a ``DiskUsage`` for backends (e.g. object stores)
that expose no notion of total disk capacity."""
total: int
used: int
free: int
@dataclass(frozen=True, slots=True)
class FormatBreakdown:
"""Per-container-format slice of the library (e.g. ``flac`` → 312 tracks)."""
file_format: str
track_count: int
total_size: int
@dataclass(frozen=True, slots=True)
class LibraryStats:
"""Aggregate facts about everything the instance has stored. Computed from
the catalogue (DB), not the filesystem — ``total_size`` is the sum of the
recorded ``file_size`` of every track."""
total_tracks: int
total_size: int
total_duration_seconds: int
by_format: list[FormatBreakdown]
by_metadata_status: dict[str, int]
by_source: dict[str, int]
largest_track_size: int
earliest_added: dt.datetime | None
latest_added: dt.datetime | None
+9 -3
@@ -9,6 +9,8 @@ from dataclasses import dataclass
class Artist:
id: uuid.UUID
name: str
source: str | None
source_id: str | None
created_at: dt.datetime
updated_at: dt.datetime
@@ -19,14 +21,18 @@ class Track:
title: str
artist_id: uuid.UUID
album_id: uuid.UUID | None
- storage_uri: str
- file_format: str
- file_size: int
+ storage_uri: str | None
+ file_format: str | None
+ file_size: int | None
source: str
source_id: str
duration_seconds: int | None
genre: str | None
year: int | None
track_number: int | None
metadata_status: str
metadata_error: str | None
enriched_at: dt.datetime | None
availability: str
created_at: dt.datetime
updated_at: dt.datetime
+158 -10
@@ -7,7 +7,7 @@ are bound to these ports at the composition root (``app.api.deps``).
import datetime as dt
import uuid
- from collections.abc import AsyncIterator, Iterator
+ from collections.abc import AsyncIterator, Awaitable, Callable, Iterator
from contextlib import AbstractAsyncContextManager
from pathlib import Path
from typing import Protocol
@@ -15,8 +15,12 @@ from typing import Protocol
from app.domain.entities import (
Album,
AudioTags,
CoverArt,
Credentials,
DiskUsage,
DownloadJob,
Fingerprint,
LibraryStats,
Like,
ObjectStat,
PlayHistoryEntry,
@@ -26,9 +30,14 @@ from app.domain.entities import (
User,
)
from app.domain.entities.track import Artist, Track
- from app.domain.sources import SourceFile, SourceInfo
+ from app.domain.sources import DownloadResult, RawMetadata, SearchResult, SourceFile, SourceInfo
from app.domain.tokens import IssuedToken, TokenClaims, TokenType
# A fetch source reports download progress as a fraction in [0.0, 1.0]. It's a
# plain callback (not a port) because it's an inversion of control supplied per
# call by the worker, which persists it to the download job.
ProgressCallback = Callable[[float], Awaitable[None]]
class UserRepository(Protocol):
async def get_by_id(self, user_id: uuid.UUID) -> User | None: ...
@@ -97,10 +106,19 @@ class FileStorage(Protocol):
async def exists(self, key: str) -> bool: ...
async def delete(self, key: str) -> None: ...
def as_local_path(self, key: str) -> AbstractAsyncContextManager[Path]: ...
async def disk_usage(self) -> DiskUsage | None:
"""Capacity of the volume backing the store, or ``None`` when the
backend has no addressable disk (e.g. an object store)."""
...
class ArtistRepository(Protocol):
async def get_or_create(self, name: str) -> Artist: ...
async def get_or_create_remote(self, *, name: str, source: str, source_id: str) -> Artist:
"""Resolve/create an artist bound to a remote ``(source, source_id)``
(lazy materialization save-to-library)."""
...
async def get_by_id(self, artist_id: uuid.UUID) -> Artist | None: ...
async def get_many(self, ids: list[uuid.UUID]) -> list[Artist]: ...
async def list(self, *, q: str | None, limit: int, offset: int) -> list[Artist]: ...
@@ -118,24 +136,41 @@ class TrackRepository(Protocol):
id: uuid.UUID,
title: str,
artist_id: uuid.UUID,
- storage_uri: str,
- file_format: str,
- file_size: int,
+ storage_uri: str | None,
+ file_format: str | None,
+ file_size: int | None,
source: str,
source_id: str,
metadata_status: str,
added_by: uuid.UUID | None,
availability: str = ...,
) -> Track: ...
async def materialize(
self,
track_id: uuid.UUID,
*,
storage_uri: str,
file_format: str,
file_size: int,
bitrate: int | None,
) -> Track:
"""Fill in a remote placeholder's audio fields after a download
(lazy materialization), flipping ``availability`` to ``local``."""
...
async def delete(self, track_id: uuid.UUID) -> None: ...
- # genres must come before ``list`` — the method named ``list`` shadows the
- # builtin in later annotations (same pattern as AlbumRepository below).
+ # genres / library_stats must come before ``list`` — the method named
+ # ``list`` shadows the builtin in later annotations (same pattern as
+ # AlbumRepository below).
async def genres(self) -> list[tuple[str, int]]: ...
async def library_stats(self) -> LibraryStats: ...
async def list(
self,
*,
artist_id: uuid.UUID | None,
album_id: uuid.UUID | None,
q: str | None,
source: str | None = None,
sort_by: str,
order: str,
limit: int,
@@ -147,6 +182,7 @@ class TrackRepository(Protocol):
artist_id: uuid.UUID | None,
album_id: uuid.UUID | None,
q: str | None,
source: str | None = None,
) -> int: ...
async def update(
self,
@@ -171,11 +207,18 @@ class TrackRepository(Protocol):
acoustid_fingerprint: str | None,
musicbrainz_id: str | None,
metadata_status: str,
metadata_error: str | None = None,
) -> Track:
"""Persist auto-enrichment results. Nullable fields are filled only when
a non-``None`` value is supplied (re-enrich never erases prior data);
- ``title``/``artist_id``/``metadata_status`` are always written. Callers
- must not invoke this for ``metadata_status == 'manual'`` tracks."""
+ ``title``/``artist_id``/``metadata_status`` are always written, and the
+ run's outcome (``metadata_error`` + completion time) is always stamped.
+ Callers must not invoke this for ``metadata_status == 'manual'`` tracks."""
...
async def mark_enrichment_failed(self, track_id: uuid.UUID, *, error: str) -> None:
"""Record that an enrichment run crashed unexpectedly: set ``failed`` +
the error reason. A no-op for ``manual`` or missing tracks."""
...
@@ -188,6 +231,21 @@ class AlbumRepository(Protocol):
year: int | None,
musicbrainz_id: str | None,
) -> Album: ...
async def get_or_create_remote(
self,
*,
title: str,
artist_id: uuid.UUID,
year: int | None,
musicbrainz_id: str | None,
source: str,
source_id: str,
) -> Album:
"""Resolve/create an album bound to a remote ``(source, source_id)``
(lazy materialization save-to-library)."""
...
async def set_cover_path(self, album_id: uuid.UUID, cover_path: str) -> None: ...
async def get_by_id(self, album_id: uuid.UUID) -> Album | None: ...
async def get_many(self, ids: list[uuid.UUID]) -> list[Album]: ...
async def count(self, *, artist_id: uuid.UUID | None, q: str | None) -> int: ...
@@ -256,6 +314,54 @@ class HistoryRepository(Protocol):
async def count(self, *, user_id: uuid.UUID) -> int: ...
class DownloadJobRepository(Protocol):
"""Persistence for download jobs (plan §6.1). Drives the §A5 download manager
and the worker's retry/backoff loop."""
async def add(
self,
*,
source: str,
source_id: str | None,
query: str | None,
requested_by: uuid.UUID | None,
) -> DownloadJob: ...
async def get_by_id(self, job_id: uuid.UUID) -> DownloadJob | None: ...
async def get_active_for_source(self, source: str, source_id: str) -> DownloadJob | None:
"""An unfinished (queued/downloading/enriching) job for the same item, if
any — used to dedup before enqueuing so a double-click can't queue twice."""
...
async def list(
self,
*,
requested_by: uuid.UUID | None,
status: str | None,
limit: int,
offset: int,
) -> list[DownloadJob]: ...
async def count(self, *, requested_by: uuid.UUID | None, status: str | None) -> int: ...
async def set_status(
self,
job_id: uuid.UUID,
*,
status: str,
error_message: str | None = None,
track_id: uuid.UUID | None = None,
) -> None: ...
async def set_progress(self, job_id: uuid.UUID, progress: float) -> None: ...
async def increment_retry(self, job_id: uuid.UUID) -> int:
"""Bump ``retry_count`` and return the new value."""
...
async def delete(self, job_id: uuid.UUID) -> None: ...
async def failure_rate(self, source: str, *, since: dt.datetime) -> float:
"""Fraction of jobs for ``source`` created since ``since`` that ended
``failed`` (0.0 when there are none) — drives the §A5 "source unhealthy"
banner."""
...
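The `failure_rate` contract (fraction of recent jobs for a source that ended `failed`, 0.0 when there are none) can be pinned down with an in-memory version. `JobRow` and the free function below are hypothetical; treating `since` as an inclusive bound is an assumption, since the docstring does not say which way the boundary falls.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRow:
    """Hypothetical minimal projection of a download job."""

    source: str
    status: str
    created_at: dt.datetime


def failure_rate(rows: list[JobRow], source: str, *, since: dt.datetime) -> float:
    recent = [r for r in rows if r.source == source and r.created_at >= since]
    if not recent:
        return 0.0  # no jobs in the window: report healthy rather than divide by zero
    failed = sum(1 for r in recent if r.status == "failed")
    return failed / len(recent)


NOW = dt.datetime(2026, 6, 14, 12, 0, tzinfo=dt.timezone.utc)
ROWS = [
    JobRow("youtube", "failed", NOW - dt.timedelta(minutes=5)),
    JobRow("youtube", "done", NOW - dt.timedelta(minutes=10)),
    JobRow("youtube", "failed", NOW - dt.timedelta(days=2)),  # outside the window
    JobRow("local", "failed", NOW - dt.timedelta(minutes=1)),  # different source
]
```

With a one-hour window the two recent `youtube` rows give a rate of 0.5; the older failure and the `local` row are ignored. The `"done"` status string is a placeholder, as the finished-state name is not shown in this diff.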
class SourceBackend(Protocol):
"""A registered source of tracks (mounted folder, YouTube, …).
@@ -274,6 +380,29 @@ class IndexableSource(SourceBackend, Protocol):
def scan(self) -> Iterator[SourceFile]: ...
class SearchableSource(SourceBackend, Protocol):
"""A source that can be searched by free text (e.g. YouTube Music).
Returns ``[]`` (never raises) on no results / the service being down — the
discover screen degrades to "nothing found" rather than erroring."""
async def search(self, query: str, *, limit: int) -> list[SearchResult]: ...
class FetchableSource(SourceBackend, Protocol):
"""A source that can download a previously-discovered item to local disk.
``fetch`` resolves a ``source_id`` (from a :class:`SearchResult`) into a file
and reports progress through ``on_progress``. It runs only in a worker (heavy
I/O) and raises on failure so the download task can retry with backoff."""
async def fetch(
self, source_id: str, *, on_progress: ProgressCallback | None = None
) -> DownloadResult: ...
async def get_metadata(self, source_id: str) -> RawMetadata | None: ...
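The two protocols deliberately differ in failure behaviour: `search` degrades to `[]`, while `fetch` raises so the worker can retry. One way to enforce the search-side contract around an arbitrary backend call is a small wrapper. This is a sketch under stated assumptions: `safe_search`, `SearchFn` and the stub backends are hypothetical, and results are plain strings here rather than `SearchResult` objects.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical backend call shape: (query, limit) -> list of result titles.
SearchFn = Callable[[str, int], Awaitable[list[str]]]


async def safe_search(search: SearchFn, query: str, *, limit: int) -> list[str]:
    """Run a backend search, turning any failure into an empty result list."""
    query = query.strip()
    if not query or limit <= 0:
        return []
    try:
        results = await search(query, limit)
    except Exception:
        return []  # service down: degrade to "nothing found"
    return results[:limit]  # never hand back more than was asked for


async def flaky_backend(query: str, limit: int) -> list[str]:
    raise ConnectionError("upstream unavailable")


async def catalogue_backend(query: str, limit: int) -> list[str]:
    return [f"{query} #{i}" for i in range(10)]  # ignores limit on purpose
```

Swallowing every exception hides genuine bugs as well as outages, so a real implementation would likely log the failure before returning the empty list.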
# -- metadata enrichment (plan §6.2) -----------------------------------------
class AudioTagReader(Protocol):
"""Reads embedded tags from a local audio file. Returns ``None`` only when
@@ -293,7 +422,26 @@ class AudioFingerprinter(Protocol):
class AcoustIdClient(Protocol):
"""AcoustID lookup. ``is_available`` is False without an API key (the whole
fingerprint path is then skipped). ``lookup`` returns the best match or
- ``None`` (no result / service down), never raising."""
+ ``None`` (no result / service down), never raising. ``lookup_all`` returns
the same candidates ranked by confidence (``[]`` on no result / unavailable
/ error), for the metadata editor's match picker."""
def is_available(self) -> bool: ...
async def lookup(self, fingerprint: Fingerprint) -> RecordingMatch | None: ...
async def lookup_all(self, fingerprint: Fingerprint) -> list[RecordingMatch]: ...
class CoverArtExtractor(Protocol):
"""Pulls embedded cover art out of a local audio file (offline, no network).
Returns ``None`` when the file has no picture or can't be parsed — never raises."""
async def extract(self, path: Path) -> CoverArt | None: ...
class CoverArtProvider(Protocol):
"""Fetches cover art from an external service (Cover Art Archive) by
MusicBrainz release-group id. ``is_available`` may gate it off; ``fetch``
returns ``None`` (not found / service down), never raising."""
def is_available(self) -> bool: ...
async def fetch_release_group(self, release_group_mbid: str) -> CoverArt | None: ...
+58 -2
@@ -10,8 +10,14 @@ here — a source yields a file plus a minimal title; enrichment (plan §6.2) fi
the rest later, so this stays a thin discovery layer (CLAUDE.md: no duplicated
business logic)."""
- from dataclasses import dataclass
+ from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
# A source's ``kind`` describes which ports it satisfies, so the UI/admin can
# tell an indexed folder from a searchable fetch-source. A backend may be both.
KIND_INDEXABLE = "indexable" # enumerates files already on disk (local folder)
KIND_FETCH = "fetch" # searches + downloads from an external service (YTM, …)
@dataclass(frozen=True, slots=True)
@@ -20,7 +26,7 @@ class SourceInfo:
name: str
label: str
- kind: str # "indexable" (more kinds — search/download — arrive with youtube)
+ kind: str # KIND_INDEXABLE | KIND_FETCH
available: bool
@@ -37,3 +43,53 @@ class SourceFile:
suggested_title: str
file_format: str
file_size: int
@dataclass(frozen=True, slots=True)
class SearchResult:
"""One hit from a searchable source (plan §5), shown on the discover screen.
``source_id`` is the stable handle the same backend later resolves in
``fetch`` — it must round-trip a download request without re-searching.
``raw`` carries the backend's untouched payload for debugging / future use.
"""
source: str
source_id: str
title: str
artist: str | None
album: str | None
duration_seconds: int | None
thumbnail_url: str | None
raw: dict[str, Any] = field(default_factory=dict)
@dataclass(frozen=True, slots=True)
class RawMetadata:
"""Metadata a fetch-source can offer about an item *before* enrichment.
Best-effort and source-shaped — the canonical metadata still comes from the
enrichment pipeline (plan §6.2). Used to seed a more useful provisional
title than a bare id while a download is queued."""
title: str | None
artist: str | None
album: str | None
year: int | None
extra: dict[str, Any] = field(default_factory=dict)
@dataclass(frozen=True, slots=True)
class DownloadResult:
"""A file a fetch-source produced on local disk (plan §5).
``path`` is a temp file the caller owns: it is stored into managed storage
and then removed (same lifecycle as an upload). ``source_id`` is echoed back
because some backends only learn the canonical id during the download."""
source_id: str
path: Path
file_format: str
file_size: int
bitrate: int | None
suggested_title: str
+11 -1
@@ -2,7 +2,7 @@
import uuid import uuid
from sqlalchemy import ForeignKey, Integer, String from sqlalchemy import ForeignKey, Integer, String, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column from sqlalchemy.orm import Mapped, mapped_column
from app.infrastructure.db.base import Base from app.infrastructure.db.base import Base
@@ -11,6 +11,12 @@ from app.infrastructure.db.models.mixins import TimestampMixin, UUIDPrimaryKeyMi
class AlbumModel(UUIDPrimaryKeyMixin, TimestampMixin, Base): class AlbumModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
__tablename__ = "albums" __tablename__ = "albums"
__table_args__ = (
# Binds a remote (browsable) album to its local row for re-browse/save
# dedup. Multiple NULLs are allowed by Postgres, so locally-created
# albums (source/source_id both NULL) never collide on this.
UniqueConstraint("source", "source_id", name="uq_albums_source_source_id"),
)
title: Mapped[str] = mapped_column(String(1024), index=True, nullable=False) title: Mapped[str] = mapped_column(String(1024), index=True, nullable=False)
artist_id: Mapped[uuid.UUID] = mapped_column( artist_id: Mapped[uuid.UUID] = mapped_column(
@@ -21,3 +27,7 @@ class AlbumModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
year: Mapped[int | None] = mapped_column(Integer, nullable=True) year: Mapped[int | None] = mapped_column(Integer, nullable=True)
cover_path: Mapped[str | None] = mapped_column(String(1024), nullable=True) cover_path: Mapped[str | None] = mapped_column(String(1024), nullable=True)
musicbrainz_id: Mapped[str | None] = mapped_column(String(36), index=True, nullable=True) musicbrainz_id: Mapped[str | None] = mapped_column(String(36), index=True, nullable=True)
# -- remote identity (lazy materialization) --------------------------
source: Mapped[str | None] = mapped_column(String(32), nullable=True)
source_id: Mapped[str | None] = mapped_column(String(512), nullable=True)
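Reviewer note: the comment on `uq_albums_source_source_id` relies on NULLs being treated as distinct inside a unique constraint. The sketch below demonstrates that behaviour with the standard-library `sqlite3` module as a stand-in, on the assumption that it matches Postgres's default (NULLS DISTINCT) handling. The table is a simplified, hypothetical version of `albums`, and `'ytm'` is an assumed source name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE albums (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        source TEXT,
        source_id TEXT,
        CONSTRAINT uq_albums_source_source_id UNIQUE (source, source_id)
    )
    """
)

# Two locally created albums: source and source_id are both NULL.
conn.execute("INSERT INTO albums (title) VALUES ('Local A')")
conn.execute("INSERT INTO albums (title) VALUES ('Local B')")

# One remote-bound album.
conn.execute(
    "INSERT INTO albums (title, source, source_id) VALUES ('Remote', 'ytm', 'abc')"
)

# A second row with the same (source, source_id) pair must be rejected.
try:
    conn.execute(
        "INSERT INTO albums (title, source, source_id) VALUES ('Again', 'ytm', 'abc')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM albums").fetchone()[0]
conn.close()
```

If the constraint were ever declared with `NULLS NOT DISTINCT` (available in newer Postgres versions), the two local rows would collide, so the default should be left as is.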
+11 -1
@@ -1,6 +1,6 @@
"""ORM model for artists.""" """ORM model for artists."""
from sqlalchemy import String from sqlalchemy import String, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column from sqlalchemy.orm import Mapped, mapped_column
from app.infrastructure.db.base import Base from app.infrastructure.db.base import Base
@@ -9,6 +9,16 @@ from app.infrastructure.db.models.mixins import TimestampMixin, UUIDPrimaryKeyMi
class ArtistModel(UUIDPrimaryKeyMixin, TimestampMixin, Base): class ArtistModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
__tablename__ = "artists" __tablename__ = "artists"
__table_args__ = (
# Binds a remote (browsable) artist to its local row for re-browse/save
# dedup. Multiple NULLs are allowed by Postgres, so locally-created
# artists (source/source_id both NULL) never collide on this.
UniqueConstraint("source", "source_id", name="uq_artists_source_source_id"),
)
name: Mapped[str] = mapped_column(String(512), index=True, nullable=False) name: Mapped[str] = mapped_column(String(512), index=True, nullable=False)
musicbrainz_id: Mapped[str | None] = mapped_column(String(36), index=True, nullable=True) musicbrainz_id: Mapped[str | None] = mapped_column(String(36), index=True, nullable=True)
# -- remote identity (lazy materialization) --------------------------
source: Mapped[str | None] = mapped_column(String(32), nullable=True)
source_id: Mapped[str | None] = mapped_column(String(512), nullable=True)
@@ -35,3 +35,9 @@ class DownloadJobModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
progress: Mapped[float] = mapped_column(Float, nullable=False, default=0.0) progress: Mapped[float] = mapped_column(Float, nullable=False, default=0.0)
error_message: Mapped[str | None] = mapped_column(Text, nullable=True) error_message: Mapped[str | None] = mapped_column(Text, nullable=True)
retry_count: Mapped[int] = mapped_column(Integer, nullable=False, default=0) retry_count: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
# Set once the download finishes and the track is imported — lets the UI
# link a completed job to its library track.
track_id: Mapped[uuid.UUID | None] = mapped_column(
ForeignKey("tracks.id", ondelete="SET NULL"),
nullable=True,
)
+9
@@ -64,3 +64,12 @@ class LyricsStatus(enum.StrEnum):
FOUND = "found" FOUND = "found"
NOT_FOUND = "not_found" NOT_FOUND = "not_found"
PENDING = "pending" PENDING = "pending"
class TrackAvailability(enum.StrEnum):
"""Whether a track's audio is on local storage or still a remote placeholder
(plan: lazy materialization). ``remote`` tracks have ``storage_uri = NULL``
until ``TrackRepository.materialize`` fills it in."""
LOCAL = "local"
REMOTE = "remote"
+25 -5
@@ -6,13 +6,14 @@
imports/downloads stay idempotent (plan §4, §6.1). imports/downloads stay idempotent (plan §4, §6.1).
""" """
import datetime as dt
import uuid import uuid
from sqlalchemy import ForeignKey, Integer, String, UniqueConstraint from sqlalchemy import DateTime, ForeignKey, Integer, String, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column from sqlalchemy.orm import Mapped, mapped_column
from app.infrastructure.db.base import Base from app.infrastructure.db.base import Base
from app.infrastructure.db.models.enums import MetadataStatus, StoragePolicy from app.infrastructure.db.models.enums import MetadataStatus, StoragePolicy, TrackAvailability
from app.infrastructure.db.models.mixins import TimestampMixin, UUIDPrimaryKeyMixin from app.infrastructure.db.models.mixins import TimestampMixin, UUIDPrimaryKeyMixin
@@ -40,11 +41,20 @@ class TrackModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
year: Mapped[int | None] = mapped_column(Integer, nullable=True) year: Mapped[int | None] = mapped_column(Integer, nullable=True)
# -- file (original, stored as-is) ----------------------------------- # -- file (original, stored as-is) -----------------------------------
storage_uri: Mapped[str] = mapped_column(String(2048), nullable=False) # NULL on a remote placeholder (not yet materialized) — see ``availability``.
file_format: Mapped[str] = mapped_column(String(32), nullable=False) storage_uri: Mapped[str | None] = mapped_column(String(2048), nullable=True)
file_size: Mapped[int] = mapped_column(Integer, nullable=False) file_format: Mapped[str | None] = mapped_column(String(32), nullable=True)
file_size: Mapped[int | None] = mapped_column(Integer, nullable=True)
bitrate: Mapped[int | None] = mapped_column(Integer, nullable=True) bitrate: Mapped[int | None] = mapped_column(Integer, nullable=True)
# ``remote`` = placeholder with no local audio yet; materialize() flips this
# to ``local`` once the file is downloaded and ``storage_uri`` is filled in.
availability: Mapped[str] = mapped_column(
String(16),
nullable=False,
default=TrackAvailability.LOCAL.value,
)
# -- dedup / external ids -------------------------------------------- # -- dedup / external ids --------------------------------------------
acoustid_fingerprint: Mapped[str | None] = mapped_column(String(64), index=True, nullable=True) acoustid_fingerprint: Mapped[str | None] = mapped_column(String(64), index=True, nullable=True)
musicbrainz_id: Mapped[str | None] = mapped_column(String(36), index=True, nullable=True) musicbrainz_id: Mapped[str | None] = mapped_column(String(36), index=True, nullable=True)
@@ -63,6 +73,16 @@ class TrackModel(UUIDPrimaryKeyMixin, TimestampMixin, Base):
nullable=False, nullable=False,
default=MetadataStatus.PENDING.value, default=MetadataStatus.PENDING.value,
) )
# Human-readable reason the last enrichment run set ``failed`` (no match, or
# an unexpected worker error). ``None`` once a run succeeds. Surfaced in the
# UI so a stuck/failed track is diagnosable, not silent.
metadata_error: Mapped[str | None] = mapped_column(String(2048), nullable=True)
# When the last enrichment run finished (success or failure). ``None`` while
# still ``pending`` — lets the UI distinguish "queued/running" from "done".
enriched_at: Mapped[dt.datetime | None] = mapped_column(
DateTime(timezone=True),
nullable=True,
)
added_by: Mapped[uuid.UUID | None] = mapped_column( added_by: Mapped[uuid.UUID | None] = mapped_column(
ForeignKey("users.id", ondelete="SET NULL"), ForeignKey("users.id", ondelete="SET NULL"),
@@ -2,6 +2,9 @@
from app.infrastructure.db.repositories.album_repository import SqlAlchemyAlbumRepository from app.infrastructure.db.repositories.album_repository import SqlAlchemyAlbumRepository
from app.infrastructure.db.repositories.artist_repository import SqlAlchemyArtistRepository from app.infrastructure.db.repositories.artist_repository import SqlAlchemyArtistRepository
from app.infrastructure.db.repositories.download_job_repository import (
SqlAlchemyDownloadJobRepository,
)
from app.infrastructure.db.repositories.history_repository import SqlAlchemyHistoryRepository from app.infrastructure.db.repositories.history_repository import SqlAlchemyHistoryRepository
from app.infrastructure.db.repositories.like_repository import SqlAlchemyLikeRepository from app.infrastructure.db.repositories.like_repository import SqlAlchemyLikeRepository
from app.infrastructure.db.repositories.playlist_repository import SqlAlchemyPlaylistRepository from app.infrastructure.db.repositories.playlist_repository import SqlAlchemyPlaylistRepository
@@ -14,6 +17,7 @@ from app.infrastructure.db.repositories.user_repository import SqlAlchemyUserRep
__all__ = [ __all__ = [
"SqlAlchemyAlbumRepository", "SqlAlchemyAlbumRepository",
"SqlAlchemyArtistRepository", "SqlAlchemyArtistRepository",
"SqlAlchemyDownloadJobRepository",
"SqlAlchemyHistoryRepository", "SqlAlchemyHistoryRepository",
"SqlAlchemyLikeRepository", "SqlAlchemyLikeRepository",
"SqlAlchemyPlaylistRepository", "SqlAlchemyPlaylistRepository",
@@ -18,6 +18,8 @@ def _to_entity(row: AlbumModel) -> Album:
year=row.year, year=row.year,
cover_path=row.cover_path, cover_path=row.cover_path,
musicbrainz_id=row.musicbrainz_id, musicbrainz_id=row.musicbrainz_id,
source=row.source,
source_id=row.source_id,
created_at=row.created_at, created_at=row.created_at,
updated_at=row.updated_at, updated_at=row.updated_at,
) )
@@ -63,6 +65,64 @@ class SqlAlchemyAlbumRepository:
await self._session.refresh(row) await self._session.refresh(row)
return _to_entity(row) return _to_entity(row)
async def get_or_create_remote(
self,
*,
title: str,
artist_id: uuid.UUID,
year: int | None,
musicbrainz_id: str | None,
source: str,
source_id: str,
) -> Album:
"""Resolve an album by ``(source, source_id)`` first (re-browse/save
dedup), falling back to ``(title, artist_id)`` and gap-filling the
remote ids onto an existing row, else creating a new remote-bound row."""
row = (
await self._session.execute(
select(AlbumModel).where(
AlbumModel.source == source,
AlbumModel.source_id == source_id,
)
)
).scalar_one_or_none()
if row is None:
row = (
await self._session.execute(
select(AlbumModel).where(
AlbumModel.title == title,
AlbumModel.artist_id == artist_id,
)
)
).scalar_one_or_none()
if row is None:
row = AlbumModel(
title=title,
artist_id=artist_id,
year=year,
musicbrainz_id=musicbrainz_id,
source=source,
source_id=source_id,
)
self._session.add(row)
else:
if row.year is None and year is not None:
row.year = year
if row.musicbrainz_id is None and musicbrainz_id is not None:
row.musicbrainz_id = musicbrainz_id
if row.source is None and row.source_id is None:
row.source = source
row.source_id = source_id
await self._session.flush()
await self._session.refresh(row)
return _to_entity(row)
async def set_cover_path(self, album_id: uuid.UUID, cover_path: str) -> None:
row = await self._session.get(AlbumModel, album_id)
if row is not None:
row.cover_path = cover_path
await self._session.flush()
async def get_by_id(self, album_id: uuid.UUID) -> Album | None: async def get_by_id(self, album_id: uuid.UUID) -> Album | None:
row = await self._session.get(AlbumModel, album_id) row = await self._session.get(AlbumModel, album_id)
return _to_entity(row) if row is not None else None return _to_entity(row) if row is not None else None
@@ -15,6 +15,8 @@ def _to_entity(row: ArtistModel) -> Artist:
return Artist( return Artist(
id=row.id, id=row.id,
name=row.name, name=row.name,
source=row.source,
source_id=row.source_id,
created_at=row.created_at, created_at=row.created_at,
updated_at=row.updated_at, updated_at=row.updated_at,
) )
@@ -35,6 +37,32 @@ class SqlAlchemyArtistRepository:
await self._session.refresh(row) await self._session.refresh(row)
return _to_entity(row) return _to_entity(row)
async def get_or_create_remote(self, *, name: str, source: str, source_id: str) -> Artist:
"""Resolve an artist by ``(source, source_id)`` first (re-browse/save
dedup), falling back to ``name`` and gap-filling the remote ids onto an
existing row, else creating a new remote-bound row."""
row = (
await self._session.execute(
select(ArtistModel).where(
ArtistModel.source == source,
ArtistModel.source_id == source_id,
)
)
).scalar_one_or_none()
if row is None:
row = (
await self._session.execute(select(ArtistModel).where(ArtistModel.name == name))
).scalar_one_or_none()
if row is None:
row = ArtistModel(name=name, source=source, source_id=source_id)
self._session.add(row)
elif row.source is None and row.source_id is None:
row.source = source
row.source_id = source_id
await self._session.flush()
await self._session.refresh(row)
return _to_entity(row)
async def get_by_id(self, artist_id: uuid.UUID) -> Artist | None: async def get_by_id(self, artist_id: uuid.UUID) -> Artist | None:
row = await self._session.get(ArtistModel, artist_id) row = await self._session.get(ArtistModel, artist_id)
return _to_entity(row) if row is not None else None return _to_entity(row) if row is not None else None
@@ -0,0 +1,164 @@
"""Download job repository — adapter over ``AsyncSession`` (plan §6.1)."""
import datetime as dt
import uuid
from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession
from app.domain.entities.download import DownloadJob
from app.infrastructure.db.models.download_job import DownloadJobModel
from app.infrastructure.db.models.enums import DownloadStatus
# Jobs that are not yet finished — used to dedup an in-flight download.
_ACTIVE_STATUSES = (
DownloadStatus.QUEUED.value,
DownloadStatus.DOWNLOADING.value,
DownloadStatus.ENRICHING.value,
)
def _to_entity(row: DownloadJobModel) -> DownloadJob:
return DownloadJob(
id=row.id,
source=row.source,
source_id=row.source_id,
query=row.query,
requested_by=row.requested_by,
status=row.status,
progress=row.progress,
error_message=row.error_message,
retry_count=row.retry_count,
track_id=row.track_id,
created_at=row.created_at,
updated_at=row.updated_at,
)
class SqlAlchemyDownloadJobRepository:
def __init__(self, session: AsyncSession) -> None:
self._session = session
async def add(
self,
*,
source: str,
source_id: str | None,
query: str | None,
requested_by: uuid.UUID | None,
) -> DownloadJob:
row = DownloadJobModel(
source=source,
source_id=source_id,
query=query,
requested_by=requested_by,
status=DownloadStatus.QUEUED.value,
progress=0.0,
retry_count=0,
)
self._session.add(row)
await self._session.flush()
await self._session.refresh(row)
return _to_entity(row)
async def get_by_id(self, job_id: uuid.UUID) -> DownloadJob | None:
row = await self._session.get(DownloadJobModel, job_id)
return _to_entity(row) if row is not None else None
async def get_active_for_source(self, source: str, source_id: str) -> DownloadJob | None:
row = (
await self._session.execute(
select(DownloadJobModel)
.where(
DownloadJobModel.source == source,
DownloadJobModel.source_id == source_id,
DownloadJobModel.status.in_(_ACTIVE_STATUSES),
)
.order_by(DownloadJobModel.created_at.desc())
.limit(1)
)
).scalar_one_or_none()
return _to_entity(row) if row is not None else None
async def list(
self,
*,
requested_by: uuid.UUID | None,
status: str | None,
limit: int,
offset: int,
) -> list[DownloadJob]:
stmt = select(DownloadJobModel)
if requested_by is not None:
stmt = stmt.where(DownloadJobModel.requested_by == requested_by)
if status is not None:
stmt = stmt.where(DownloadJobModel.status == status)
stmt = stmt.order_by(DownloadJobModel.created_at.desc()).limit(limit).offset(offset)
rows = (await self._session.execute(stmt)).scalars().all()
return [_to_entity(r) for r in rows]
async def count(self, *, requested_by: uuid.UUID | None, status: str | None) -> int:
stmt = select(func.count()).select_from(DownloadJobModel)
if requested_by is not None:
stmt = stmt.where(DownloadJobModel.requested_by == requested_by)
if status is not None:
stmt = stmt.where(DownloadJobModel.status == status)
return (await self._session.execute(stmt)).scalar_one()
async def set_status(
self,
job_id: uuid.UUID,
*,
status: str,
error_message: str | None = None,
track_id: uuid.UUID | None = None,
) -> None:
row = await self._session.get(DownloadJobModel, job_id)
if row is None:
return
row.status = status
# ``error_message`` is always written: a successful transition clears a
# stale reason from an earlier failed attempt.
row.error_message = error_message
if track_id is not None:
row.track_id = track_id
if status == DownloadStatus.DONE.value:
row.progress = 1.0
await self._session.flush()
async def set_progress(self, job_id: uuid.UUID, progress: float) -> None:
row = await self._session.get(DownloadJobModel, job_id)
if row is None:
return
row.progress = max(0.0, min(1.0, progress))
await self._session.flush()
async def increment_retry(self, job_id: uuid.UUID) -> int:
row = await self._session.get(DownloadJobModel, job_id)
if row is None:
return 0
row.retry_count += 1
await self._session.flush()
return row.retry_count
async def delete(self, job_id: uuid.UUID) -> None:
row = await self._session.get(DownloadJobModel, job_id)
if row is not None:
await self._session.delete(row)
await self._session.flush()
async def failure_rate(self, source: str, *, since: dt.datetime) -> float:
total, failed = (
await self._session.execute(
select(
func.count(),
func.count().filter(DownloadJobModel.status == DownloadStatus.FAILED.value),
)
.select_from(DownloadJobModel)
.where(
DownloadJobModel.source == source,
DownloadJobModel.created_at >= since,
)
)
).one()
return (failed / total) if total else 0.0
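Reviewer note: two small numeric rules in this repository are easy to check in isolation. The sketch below mirrors the clamp in `set_progress` and the ratio in `failure_rate`, assuming the stored value of `DownloadStatus.FAILED` is the string `"failed"` (the enum body is not part of this diff). Both function names are hypothetical helpers, not repository methods.

```python
def clamp_progress(progress: float) -> float:
    # Same expression as set_progress: keep the value inside [0.0, 1.0].
    return max(0.0, min(1.0, progress))


def failure_rate(statuses: list[str], failed_value: str = "failed") -> float:
    # Same shape as the SQL aggregate: failed / total, guarding the empty case
    # so a source with no jobs in the window reports 0.0 instead of dividing
    # by zero.
    total = len(statuses)
    failed = sum(1 for status in statuses if status == failed_value)
    return (failed / total) if total else 0.0


over = clamp_progress(1.7)
under = clamp_progress(-0.2)
inside = clamp_progress(0.4)

empty_rate = failure_rate([])
half_rate = failure_rate(["done", "failed", "failed", "queued"])
```

Because the real query counts every job created since `since`, still-active jobs are part of the denominator, which pulls the rate down while a batch is in flight.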
@@ -38,7 +38,11 @@ def _track_to_entity(row: TrackModel) -> Track:
duration_seconds=row.duration_seconds, duration_seconds=row.duration_seconds,
genre=row.genre, genre=row.genre,
year=row.year, year=row.year,
track_number=row.track_number,
metadata_status=row.metadata_status, metadata_status=row.metadata_status,
metadata_error=row.metadata_error,
enriched_at=row.enriched_at,
availability=row.availability,
created_at=row.created_at, created_at=row.created_at,
updated_at=row.updated_at, updated_at=row.updated_at,
) )
@@ -37,7 +37,11 @@ def _track_to_entity(row: TrackModel) -> Track:
duration_seconds=row.duration_seconds, duration_seconds=row.duration_seconds,
genre=row.genre, genre=row.genre,
year=row.year, year=row.year,
track_number=row.track_number,
metadata_status=row.metadata_status, metadata_status=row.metadata_status,
metadata_error=row.metadata_error,
enriched_at=row.enriched_at,
availability=row.availability,
created_at=row.created_at, created_at=row.created_at,
updated_at=row.updated_at, updated_at=row.updated_at,
) )
@@ -1,13 +1,16 @@
"""Track repository — adapter over ``AsyncSession``.""" """Track repository — adapter over ``AsyncSession``."""
import datetime as dt
import uuid import uuid
from sqlalchemy import func, select from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession from sqlalchemy.ext.asyncio import AsyncSession
from app.domain.entities.storage import FormatBreakdown, LibraryStats
from app.domain.entities.track import Track from app.domain.entities.track import Track
from app.domain.errors import NotFoundError from app.domain.errors import NotFoundError
from app.infrastructure.db.models.artist import ArtistModel from app.infrastructure.db.models.artist import ArtistModel
from app.infrastructure.db.models.enums import TrackAvailability
from app.infrastructure.db.models.track import TrackModel from app.infrastructure.db.models.track import TrackModel
@@ -25,7 +28,11 @@ def _to_entity(row: TrackModel) -> Track:
duration_seconds=row.duration_seconds, duration_seconds=row.duration_seconds,
genre=row.genre, genre=row.genre,
year=row.year, year=row.year,
track_number=row.track_number,
metadata_status=row.metadata_status, metadata_status=row.metadata_status,
metadata_error=row.metadata_error,
enriched_at=row.enriched_at,
availability=row.availability,
created_at=row.created_at, created_at=row.created_at,
updated_at=row.updated_at, updated_at=row.updated_at,
) )
@@ -56,13 +63,14 @@ class SqlAlchemyTrackRepository:
id: uuid.UUID, id: uuid.UUID,
title: str, title: str,
artist_id: uuid.UUID, artist_id: uuid.UUID,
storage_uri: str, storage_uri: str | None,
file_format: str, file_format: str | None,
file_size: int, file_size: int | None,
source: str, source: str,
source_id: str, source_id: str,
metadata_status: str, metadata_status: str,
added_by: uuid.UUID | None, added_by: uuid.UUID | None,
availability: str = TrackAvailability.LOCAL.value,
) -> Track: ) -> Track:
row = TrackModel( row = TrackModel(
id=id, id=id,
@@ -75,12 +83,38 @@ class SqlAlchemyTrackRepository:
source_id=source_id, source_id=source_id,
metadata_status=metadata_status, metadata_status=metadata_status,
added_by=added_by, added_by=added_by,
availability=availability,
) )
self._session.add(row) self._session.add(row)
await self._session.flush() await self._session.flush()
await self._session.refresh(row) await self._session.refresh(row)
return _to_entity(row) return _to_entity(row)
async def materialize(
self,
track_id: uuid.UUID,
*,
storage_uri: str,
file_format: str,
file_size: int,
bitrate: int | None,
) -> Track:
"""Fill in a remote placeholder's audio fields after a download (lazy
materialization). ``track.id`` is unchanged, so likes/playlists/queue
entries that already reference it keep working."""
row = await self._session.get(TrackModel, track_id)
if row is None:
raise NotFoundError(f"Track {track_id} not found.")
row.storage_uri = storage_uri
row.file_format = file_format
row.file_size = file_size
if bitrate is not None:
row.bitrate = bitrate
row.availability = TrackAvailability.LOCAL.value
await self._session.flush()
await self._session.refresh(row)
return _to_entity(row)
async def delete(self, track_id: uuid.UUID) -> None: async def delete(self, track_id: uuid.UUID) -> None:
row = await self._session.get(TrackModel, track_id) row = await self._session.get(TrackModel, track_id)
if row is not None: if row is not None:
@@ -102,12 +136,71 @@ class SqlAlchemyTrackRepository:
).all() ).all()
return [(row.genre, row.cnt) for row in rows] return [(row.genre, row.cnt) for row in rows]
async def library_stats(self) -> LibraryStats:
"""One-shot aggregate over the whole catalogue (no pagination). Defined
before ``list`` for the same shadowing reason as ``genres``."""
totals = (
await self._session.execute(
select(
func.count(TrackModel.id),
func.coalesce(func.sum(TrackModel.file_size), 0),
func.coalesce(func.sum(TrackModel.duration_seconds), 0),
func.coalesce(func.max(TrackModel.file_size), 0),
func.min(TrackModel.created_at),
func.max(TrackModel.created_at),
)
)
).one()
fmt_rows = (
await self._session.execute(
select(
TrackModel.file_format,
func.count(TrackModel.id),
func.coalesce(func.sum(TrackModel.file_size), 0),
)
.where(TrackModel.file_format.is_not(None))
.group_by(TrackModel.file_format)
.order_by(func.sum(TrackModel.file_size).desc())
)
).all()
status_rows = (
await self._session.execute(
select(TrackModel.metadata_status, func.count(TrackModel.id)).group_by(
TrackModel.metadata_status
)
)
).all()
source_rows = (
await self._session.execute(
select(TrackModel.source, func.count(TrackModel.id)).group_by(TrackModel.source)
)
).all()
return LibraryStats(
total_tracks=totals[0],
total_size=totals[1],
total_duration_seconds=totals[2],
largest_track_size=totals[3],
earliest_added=totals[4],
latest_added=totals[5],
by_format=[
FormatBreakdown(file_format=fmt, track_count=cnt, total_size=size)
for fmt, cnt, size in fmt_rows
],
by_metadata_status={status: cnt for status, cnt in status_rows},
by_source={source: cnt for source, cnt in source_rows},
)
async def list( async def list(
self, self,
*, *,
artist_id: uuid.UUID | None, artist_id: uuid.UUID | None,
album_id: uuid.UUID | None, album_id: uuid.UUID | None,
q: str | None, q: str | None,
source: str | None = None,
sort_by: str = "created_at", sort_by: str = "created_at",
order: str = "desc", order: str = "desc",
limit: int = 50, limit: int = 50,
@@ -118,6 +211,8 @@ class SqlAlchemyTrackRepository:
stmt = stmt.where(TrackModel.artist_id == artist_id) stmt = stmt.where(TrackModel.artist_id == artist_id)
if album_id is not None: if album_id is not None:
stmt = stmt.where(TrackModel.album_id == album_id) stmt = stmt.where(TrackModel.album_id == album_id)
if source is not None:
stmt = stmt.where(TrackModel.source == source)
if q: if q:
stmt = stmt.where(TrackModel.title.ilike(f"%{q}%")) stmt = stmt.where(TrackModel.title.ilike(f"%{q}%"))
@@ -142,12 +237,15 @@ class SqlAlchemyTrackRepository:
artist_id: uuid.UUID | None, artist_id: uuid.UUID | None,
album_id: uuid.UUID | None, album_id: uuid.UUID | None,
q: str | None, q: str | None,
source: str | None = None,
) -> int: ) -> int:
stmt = select(func.count()).select_from(TrackModel) stmt = select(func.count()).select_from(TrackModel)
if artist_id is not None: if artist_id is not None:
stmt = stmt.where(TrackModel.artist_id == artist_id) stmt = stmt.where(TrackModel.artist_id == artist_id)
if album_id is not None: if album_id is not None:
stmt = stmt.where(TrackModel.album_id == album_id) stmt = stmt.where(TrackModel.album_id == album_id)
if source is not None:
stmt = stmt.where(TrackModel.source == source)
if q: if q:
stmt = stmt.where(TrackModel.title.ilike(f"%{q}%")) stmt = stmt.where(TrackModel.title.ilike(f"%{q}%"))
return (await self._session.execute(stmt)).scalar_one() return (await self._session.execute(stmt)).scalar_one()
@@ -159,6 +257,9 @@ class SqlAlchemyTrackRepository:
title: str | None, title: str | None,
genre: str | None, genre: str | None,
year: int | None, year: int | None,
artist_id: uuid.UUID | None = None,
album_id: uuid.UUID | None = None,
track_number: int | None = None,
) -> Track: ) -> Track:
row = await self._session.get(TrackModel, track_id) row = await self._session.get(TrackModel, track_id)
if row is None: if row is None:
@@ -169,6 +270,12 @@ class SqlAlchemyTrackRepository:
row.genre = genre row.genre = genre
if year is not None: if year is not None:
row.year = year row.year = year
if artist_id is not None:
row.artist_id = artist_id
if album_id is not None:
row.album_id = album_id
if track_number is not None:
row.track_number = track_number
row.metadata_status = "manual" row.metadata_status = "manual"
await self._session.flush() await self._session.flush()
await self._session.refresh(row) await self._session.refresh(row)
@@ -189,6 +296,7 @@ class SqlAlchemyTrackRepository:
acoustid_fingerprint: str | None, acoustid_fingerprint: str | None,
musicbrainz_id: str | None, musicbrainz_id: str | None,
metadata_status: str, metadata_status: str,
metadata_error: str | None = None,
) -> Track: ) -> Track:
row = await self._session.get(TrackModel, track_id) row = await self._session.get(TrackModel, track_id)
if row is None: if row is None:
@@ -197,6 +305,10 @@ class SqlAlchemyTrackRepository:
row.title = title row.title = title
row.artist_id = artist_id row.artist_id = artist_id
row.metadata_status = metadata_status row.metadata_status = metadata_status
# A finished run always stamps outcome: clear/set the reason and mark the
# completion time so the UI can tell "still pending" from "done/failed".
row.metadata_error = metadata_error
row.enriched_at = dt.datetime.now(dt.UTC)
# Nullable extras: fill gaps only — never erase data a prior run found. # Nullable extras: fill gaps only — never erase data a prior run found.
if album_id is not None: if album_id is not None:
row.album_id = album_id row.album_id = album_id
@@ -217,3 +329,16 @@ class SqlAlchemyTrackRepository:
await self._session.flush() await self._session.flush()
await self._session.refresh(row) await self._session.refresh(row)
return _to_entity(row) return _to_entity(row)
async def mark_enrichment_failed(self, track_id: uuid.UUID, *, error: str) -> None:
"""Record that an enrichment run crashed (unexpected exception). Runs in
its own session so the failure is persisted even though the run's own
transaction rolled back. Never overwrites ``manual`` (a no-op then), and
a missing track is a clean no-op."""
row = await self._session.get(TrackModel, track_id)
if row is None or row.metadata_status == "manual":
return
row.metadata_status = "failed"
row.metadata_error = error
row.enriched_at = dt.datetime.now(dt.UTC)
await self._session.flush()
+46 -14
@@ -46,6 +46,18 @@ class AcoustIdHttpClient:
return bool(self._api_key) return bool(self._api_key)
async def lookup(self, fingerprint: Fingerprint) -> RecordingMatch | None: async def lookup(self, fingerprint: Fingerprint) -> RecordingMatch | None:
payload = await self._lookup_raw(fingerprint)
if payload is None:
return None
return _parse_best_match(payload)
async def lookup_all(self, fingerprint: Fingerprint) -> list[RecordingMatch]:
payload = await self._lookup_raw(fingerprint)
if payload is None:
return []
return _parse_matches(payload)
async def _lookup_raw(self, fingerprint: Fingerprint) -> object | None:
if not self._api_key: if not self._api_key:
return None return None
try: try:
@@ -65,13 +77,11 @@ class AcoustIdHttpClient:
}, },
) )
resp.raise_for_status() resp.raise_for_status()
payload = resp.json() return resp.json() # type: ignore[no-any-return]
except (httpx.HTTPError, ValueError): except httpx.HTTPError, ValueError:
log.warning("acoustid_lookup_failed") log.warning("acoustid_lookup_failed")
return None return None
return _parse_best_match(payload)
@classmethod @classmethod
async def _throttle(cls) -> None: async def _throttle(cls) -> None:
async with cls._throttle_lock: async with cls._throttle_lock:
@@ -82,29 +92,47 @@ class AcoustIdHttpClient:
cls._last_call_monotonic = time.monotonic() cls._last_call_monotonic = time.monotonic()
_MAX_MATCHES = 5
def _parse_best_match(payload: object) -> RecordingMatch | None: def _parse_best_match(payload: object) -> RecordingMatch | None:
matches = _parse_matches(payload)
return matches[0] if matches else None
def _parse_matches(payload: object) -> list[RecordingMatch]:
if not isinstance(payload, dict) or payload.get("status") != "ok": if not isinstance(payload, dict) or payload.get("status") != "ok":
return None return []
results = payload.get("results") results = payload.get("results")
if not isinstance(results, list) or not results: if not isinstance(results, list) or not results:
return None return []
# Results are returned best-score-first; take the top scoring one. # Results are returned best-score-first, but sort defensively and cap the
best = max(results, key=lambda r: r.get("score", 0.0) if isinstance(r, dict) else 0.0) # number of candidates surfaced to the editor.
if not isinstance(best, dict): candidates = [r for r in results if isinstance(r, dict)]
return None candidates.sort(key=lambda r: r.get("score", 0.0), reverse=True)
acoustid = best.get("id") matches: list[RecordingMatch] = []
for result in candidates[:_MAX_MATCHES]:
match = _parse_one(result)
if match is not None:
matches.append(match)
return matches
def _parse_one(result: dict[str, object]) -> RecordingMatch | None:
acoustid = result.get("id")
if not isinstance(acoustid, str): if not isinstance(acoustid, str):
return None return None
score = float(best.get("score", 0.0)) score = float(result.get("score", 0.0)) # type: ignore[arg-type]
recording_mbid: str | None = None recording_mbid: str | None = None
release_group_mbid: str | None = None
title: str | None = None title: str | None = None
artist: str | None = None artist: str | None = None
album: str | None = None album: str | None = None
recordings = best.get("recordings") recordings = result.get("recordings")
if isinstance(recordings, list) and recordings and isinstance(recordings[0], dict): if isinstance(recordings, list) and recordings and isinstance(recordings[0], dict):
rec = recordings[0] rec = recordings[0]
recording_mbid = rec.get("id") if isinstance(rec.get("id"), str) else None recording_mbid = rec.get("id") if isinstance(rec.get("id"), str) else None
@@ -115,13 +143,17 @@ def _parse_best_match(payload: object) -> RecordingMatch | None:
artist = name if isinstance(name, str) else None artist = name if isinstance(name, str) else None
groups = rec.get("releasegroups") groups = rec.get("releasegroups")
if isinstance(groups, list) and groups and isinstance(groups[0], dict): if isinstance(groups, list) and groups and isinstance(groups[0], dict):
gtitle = groups[0].get("title") group = groups[0]
gtitle = group.get("title")
album = gtitle if isinstance(gtitle, str) else None album = gtitle if isinstance(gtitle, str) else None
gid = group.get("id")
release_group_mbid = gid if isinstance(gid, str) else None
return RecordingMatch( return RecordingMatch(
acoustid=acoustid, acoustid=acoustid,
score=score, score=score,
recording_mbid=recording_mbid, recording_mbid=recording_mbid,
release_group_mbid=release_group_mbid,
title=title, title=title,
artist=artist, artist=artist,
album=album, album=album,
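The candidate selection in `_parse_matches` can be looked at in isolation. The following is a minimal sketch, assuming plain dicts shaped like AcoustID result rows; `top_candidates` and the sample rows are hypothetical names for illustration, not part of the commit.

```python
from typing import Any

MAX_MATCHES = 5  # mirrors _MAX_MATCHES in the diff


def top_candidates(results: list[Any], limit: int = MAX_MATCHES) -> list[dict[str, Any]]:
    """Keep dict rows only, order them best score first, and cap the count."""
    candidates = [r for r in results if isinstance(r, dict)]
    # A row without a score sorts as 0.0, so it ends up last.
    candidates.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return candidates[:limit]


# Hypothetical sample rows: one non-dict entry and one row without a score.
rows = [{"id": "a", "score": 0.4}, "junk", {"id": "b", "score": 0.9}, {"id": "c"}]
ordered = [r["id"] for r in top_candidates(rows, limit=2)]
all_ids = [r["id"] for r in top_candidates(rows)]
```

Sorting locally means the cap does not depend on the service returning rows best-score-first.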
@@ -0,0 +1,111 @@
"""MutagenCoverExtractor — pulls embedded cover art from a local audio file.
The offline-first cover source (mirrors the tag pre-pass): a well-tagged file
often already carries front-cover artwork (ID3 ``APIC``, FLAC/OGG picture
blocks, MP4 ``covr``). We read it without any network call. Parsing is blocking,
so it runs in a worker thread. Any failure degrades to ``None`` — never raises.
mutagen ships no type stubs, so its objects are handled as ``Any`` and accessed
defensively (``getattr``) — the format zoo doesn't fit one static shape anyway.
"""
import base64
from pathlib import Path
from typing import Any
import anyio
from mutagen import File as MutagenFile # type: ignore[attr-defined]
from mutagen.flac import Picture
from mutagen.mp4 import MP4Cover
from app.core.logging import get_logger
from app.domain.entities.cover import CoverArt
log = get_logger(__name__)
# MP4 cover format flag → MIME (mutagen exposes an int, not a content type).
_MP4_FORMATS: dict[int, str] = {
MP4Cover.FORMAT_JPEG: "image/jpeg",
MP4Cover.FORMAT_PNG: "image/png",
}
_FRONT_COVER = 3 # APIC/Picture "type" value for the front cover
class MutagenCoverExtractor:
"""Implements :class:`app.domain.ports.CoverArtExtractor`."""
async def extract(self, path: Path) -> CoverArt | None:
try:
return await anyio.to_thread.run_sync(self._extract_sync, path)
except Exception:
log.warning("cover_extract_failed", path=str(path))
return None
def _extract_sync(self, path: Path) -> CoverArt | None:
audio: Any = MutagenFile(str(path))
if audio is None:
return None
# FLAC / OGG-FLAC: typed picture blocks on the file object.
pictures = getattr(audio, "pictures", None)
if pictures:
cover = _from_picture(_front_or_first(pictures))
if cover is not None:
return cover
tags = audio.tags
if tags is None:
return None
# MP3 / anything with ID3 frames: APIC frames keyed as "APIC:...".
apics = [frame for frame in tags.values() if frame.__class__.__name__ == "APIC"]
if apics:
cover = _from_picture(_front_or_first(apics))
if cover is not None:
return cover
get = getattr(tags, "get", None)
if get is None:
return None
# MP4 / M4A: "covr" atom holds a list of MP4Cover (a bytes subclass).
covr = get("covr")
if covr:
mp4_cover = covr[0]
content_type = _MP4_FORMATS.get(getattr(mp4_cover, "imageformat", -1), "image/jpeg")
return CoverArt(data=bytes(mp4_cover), content_type=content_type)
# OGG Vorbis: base64 picture block in METADATA_BLOCK_PICTURE.
block = get("metadata_block_picture")
if block:
cover = _from_picture(_decode_vorbis_picture(block[0]))
if cover is not None:
return cover
return None
def _from_picture(picture: Any) -> CoverArt | None:
"""Build a :class:`CoverArt` from a mutagen picture/APIC frame, or ``None``."""
if picture is None:
return None
data = getattr(picture, "data", None)
if not data:
return None
mime = getattr(picture, "mime", None) or "image/jpeg"
return CoverArt(data=bytes(data), content_type=str(mime))
def _front_or_first(pictures: list[Any]) -> Any:
"""Prefer the front-cover picture (type 3), else the first available."""
for pic in pictures:
if getattr(pic, "type", None) == _FRONT_COVER:
return pic
return pictures[0] if pictures else None
def _decode_vorbis_picture(encoded: str) -> Any:
try:
return Picture(base64.b64decode(encoded)) # type: ignore[no-untyped-call]
except Exception:
return None
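The front-cover preference used by the extractor can be exercised without mutagen. This is a small sketch that uses `SimpleNamespace` objects as stand-ins for picture frames; the stand-ins and the name `front_or_first` (without the leading underscore) are assumptions for illustration only.

```python
from types import SimpleNamespace
from typing import Any

FRONT_COVER = 3  # picture type "front cover" in ID3 APIC frames and FLAC picture blocks


def front_or_first(pictures: list[Any]) -> Any:
    """Return the front-cover picture if present, else the first one, else None."""
    for pic in pictures:
        if getattr(pic, "type", None) == FRONT_COVER:
            return pic
    return pictures[0] if pictures else None


# Stand-in picture objects (hypothetical data).
back = SimpleNamespace(type=4, data=b"back")
front = SimpleNamespace(type=3, data=b"front")
chosen = front_or_first([back, front])
```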
+83
@@ -0,0 +1,83 @@
"""CoverArtArchiveClient — fetches front cover art from the Cover Art Archive.
The network fallback when a file carries no embedded artwork: given a
MusicBrainz **release-group** id (supplied by the AcoustID lookup), request the
front image from ``coverartarchive.org``. The CAA redirects to the Internet
Archive, so redirects are followed. Requesting the 500px thumbnail (``front-500``) keeps payloads small.
Graceful degradation (CLAUDE.md): no release-group id → never called; any
network/HTTP error (incl. 404 "no cover") → returns ``None``, never raises. A
small inter-call delay respects the shared MusicBrainz/CAA infrastructure.
"""
import asyncio
import time
import httpx
from app.core.logging import get_logger
from app.domain.entities.cover import CoverArt
log = get_logger(__name__)
_DEFAULT_BASE_URL = "https://coverartarchive.org"
_TIMEOUT_SECONDS = 15.0
_MIN_INTERVAL_SECONDS = 1.0 # CAA piggybacks on MusicBrainz infra; stay polite
_MAX_BYTES = 10 * 1024 * 1024 # ignore absurdly large images
class CoverArtArchiveClient:
"""Implements :class:`app.domain.ports.CoverArtProvider`."""
_throttle_lock = asyncio.Lock()
_last_call_monotonic = 0.0
def __init__(
self,
*,
user_agent: str,
enabled: bool = True,
base_url: str = _DEFAULT_BASE_URL,
) -> None:
self._user_agent = user_agent
self._enabled = enabled
self._base_url = base_url.rstrip("/")
def is_available(self) -> bool:
return self._enabled
async def fetch_release_group(self, release_group_mbid: str) -> CoverArt | None:
if not self._enabled or not release_group_mbid:
return None
url = f"{self._base_url}/release-group/{release_group_mbid}/front-500"
try:
await self._throttle()
async with httpx.AsyncClient(
timeout=_TIMEOUT_SECONDS,
follow_redirects=True,
headers={"User-Agent": self._user_agent},
) as client:
resp = await client.get(url)
if resp.status_code == 404:
return None # no cover for this release group — normal, not an error
resp.raise_for_status()
except httpx.HTTPError:
log.warning("coverart_fetch_failed", release_group=release_group_mbid)
return None
data = resp.content
if not data or len(data) > _MAX_BYTES:
return None
content_type = resp.headers.get("content-type", "image/jpeg").split(";")[0].strip()
if not content_type.startswith("image/"):
return None
return CoverArt(data=data, content_type=content_type)
@classmethod
async def _throttle(cls) -> None:
async with cls._throttle_lock:
elapsed = time.monotonic() - cls._last_call_monotonic
wait = _MIN_INTERVAL_SECONDS - elapsed
if wait > 0:
await asyncio.sleep(wait)
cls._last_call_monotonic = time.monotonic()
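The minimum-interval throttle can be sketched on its own. The version below is an assumption-laden illustration: `MinIntervalThrottle` is a hypothetical class that keeps its lock and timestamp per instance, whereas the client in the diff keeps them at class level so every instance shares one budget.

```python
import asyncio
import time


class MinIntervalThrottle:
    """Serialize callers and keep at least `min_interval` seconds between calls."""

    def __init__(self, min_interval: float) -> None:
        self._min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last_call = 0.0

    async def wait(self) -> None:
        async with self._lock:
            remaining = self._min_interval - (time.monotonic() - self._last_call)
            if remaining > 0:
                await asyncio.sleep(remaining)
            # Stamp after sleeping so the next caller measures from this call.
            self._last_call = time.monotonic()


async def demo() -> float:
    throttle = MinIntervalThrottle(0.05)
    start = time.monotonic()
    # Three concurrent callers: the first passes at once, the other two queue up.
    await asyncio.gather(throttle.wait(), throttle.wait(), throttle.wait())
    return time.monotonic() - start


elapsed = asyncio.run(demo())
```

Holding the lock while sleeping is what makes concurrent callers queue rather than all waking at the same moment.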
+2 -2
@@ -41,7 +41,7 @@ class FpcalcFingerprinter:
) )
async with asyncio.timeout(_TIMEOUT_SECONDS): async with asyncio.timeout(_TIMEOUT_SECONDS):
stdout, _stderr = await proc.communicate() stdout, _stderr = await proc.communicate()
except (TimeoutError, OSError): except TimeoutError, OSError:
log.warning("fpcalc_failed", path=str(path)) log.warning("fpcalc_failed", path=str(path))
return None return None
@@ -53,7 +53,7 @@ class FpcalcFingerprinter:
data = json.loads(stdout) data = json.loads(stdout)
fingerprint = str(data["fingerprint"]) fingerprint = str(data["fingerprint"])
duration = round(float(data["duration"])) duration = round(float(data["duration"]))
except (json.JSONDecodeError, KeyError, ValueError): except json.JSONDecodeError, KeyError, ValueError:
log.warning("fpcalc_bad_output", path=str(path)) log.warning("fpcalc_bad_output", path=str(path))
return None return None
+27 -2
@@ -2,16 +2,18 @@
Built from settings at the composition root. Only sources that are configured Built from settings at the composition root. Only sources that are configured
are registered (e.g. ``local`` appears only when ``LOCAL_MEDIA_IMPORT_PATH`` is are registered (e.g. ``local`` appears only when ``LOCAL_MEDIA_IMPORT_PATH`` is
set), so enumeration reflects what the instance can actually use. set; ``youtube`` only when ``YOUTUBE_ENABLED``), so enumeration reflects what the
instance can actually use.
""" """
from typing import cast from typing import cast
from app.core.config import Settings from app.core.config import Settings
from app.domain.errors import NotFoundError, ValidationError from app.domain.errors import NotFoundError, ValidationError
from app.domain.ports import IndexableSource, SourceBackend from app.domain.ports import FetchableSource, IndexableSource, SearchableSource, SourceBackend
from app.domain.sources import SourceInfo from app.domain.sources import SourceInfo
from app.infrastructure.sources.local_folder import LocalFolderSource from app.infrastructure.sources.local_folder import LocalFolderSource
from app.infrastructure.sources.youtube import YouTubeMusicSource
class SourceRegistry: class SourceRegistry:
@@ -30,6 +32,22 @@ class SourceRegistry:
raise ValidationError(f"Source {name!r} cannot be indexed.") raise ValidationError(f"Source {name!r} cannot be indexed.")
return cast(IndexableSource, backend) return cast(IndexableSource, backend)
def searchable(self, name: str) -> SearchableSource:
backend = self.get(name)
if not hasattr(backend, "search"):
raise ValidationError(f"Source {name!r} cannot be searched.")
return cast(SearchableSource, backend)
def fetchable(self, name: str) -> FetchableSource:
backend = self.get(name)
if not hasattr(backend, "fetch"):
raise ValidationError(f"Source {name!r} cannot download.")
return cast(FetchableSource, backend)
def searchables(self) -> list[SearchableSource]:
"""Every registered source that supports search (for cross-source search)."""
return [cast(SearchableSource, b) for b in self._by_name.values() if hasattr(b, "search")]
def infos(self) -> list[SourceInfo]: def infos(self) -> list[SourceInfo]:
return [backend.info() for backend in self._by_name.values()] return [backend.info() for backend in self._by_name.values()]
@@ -38,4 +56,11 @@ def build_source_registry(settings: Settings) -> SourceRegistry:
backends: list[SourceBackend] = [] backends: list[SourceBackend] = []
if settings.local_media_import_path is not None: if settings.local_media_import_path is not None:
backends.append(LocalFolderSource(settings.local_media_import_path)) backends.append(LocalFolderSource(settings.local_media_import_path))
if settings.youtube_enabled:
backends.append(
YouTubeMusicSource(
cookies_path=settings.youtube_cookies_path,
tmp_dir=settings.upload_tmp_dir,
)
)
return SourceRegistry(backends) return SourceRegistry(backends)
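The capability lookups added to `SourceRegistry` rely on duck typing: a backend counts as searchable if it has a `search` attribute. A stripped-down sketch of that idea follows. The `Registry`, `LocalOnly`, and `Searchable` classes are hypothetical stand-ins, and unlike the real `get`, the lookup here raises `KeyError` for unknown names.

```python
from typing import Any


class ValidationError(Exception):
    """Stand-in for app.domain.errors.ValidationError."""


class LocalOnly:
    name = "local"

    def scan(self) -> list[str]:
        return []


class Searchable:
    name = "youtube"

    def search(self, query: str) -> list[str]:
        return [f"hit for {query}"]


class Registry:
    def __init__(self, backends: list[Any]) -> None:
        self._by_name = {b.name: b for b in backends}

    def searchable(self, name: str) -> Any:
        backend = self._by_name[name]
        # Capability check by attribute, not by class hierarchy.
        if not hasattr(backend, "search"):
            raise ValidationError(f"Source {name!r} cannot be searched.")
        return backend

    def searchables(self) -> list[Any]:
        return [b for b in self._by_name.values() if hasattr(b, "search")]


registry = Registry([LocalOnly(), Searchable()])
names = [b.name for b in registry.searchables()]
```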
+207
@@ -0,0 +1,207 @@
"""``youtube`` source — YouTube Music search + download (plan §5).
A *fetch* source: it searches YouTube Music (via ``ytmusicapi``, which returns
clean song/artist/album/duration rows) and downloads the chosen item with
``yt-dlp``. The two libraries are synchronous, so every call is bounced to a
worker thread (``anyio.to_thread``); the sync yt-dlp progress hook bridges back
to the async progress callback via ``anyio.from_thread``.
Both libraries are optional dependencies — if either is missing the source is
simply *unavailable* (it never crashes import or the registry; graceful
degradation per CLAUDE.md). The audio stream is stored **as-is** (YouTube serves
lossy Opus/AAC; re-encoding would be lossy→lossy, plan §6.6).
``source_id`` is the YouTube ``videoId`` — stable, so a re-download of the same
id is idempotent and dedups against an existing track.
"""
import functools
import tempfile
from collections.abc import Callable
from pathlib import Path
from typing import Any
import anyio
from app.core.logging import get_logger
from app.domain.ports import ProgressCallback
from app.domain.sources import (
KIND_FETCH,
DownloadResult,
RawMetadata,
SearchResult,
SourceInfo,
)
from app.infrastructure.db.models.enums import TrackSource
log = get_logger(__name__)
# Functions a caller may inject for testing (defaults do the real library work).
SearchFn = Callable[[str, int], list[dict[str, Any]]]
# (video_id, tmp_dir, progress_hook, cookies_path) -> normalized download dict
DownloadFn = Callable[[str, Path, Callable[[dict[str, Any]], None], Path | None], dict[str, Any]]
def _libs_available() -> bool:
try:
import yt_dlp # noqa: F401
import ytmusicapi # noqa: F401
except ImportError:
return False
return True
def _watch_url(video_id: str) -> str:
return f"https://music.youtube.com/watch?v={video_id}"
class YouTubeMusicSource:
"""Implements :class:`app.domain.ports.SearchableSource` and
:class:`~app.domain.ports.FetchableSource`."""
name = TrackSource.YOUTUBE.value
def __init__(
self,
*,
cookies_path: Path | None = None,
tmp_dir: Path | None = None,
search_fn: SearchFn | None = None,
download_fn: DownloadFn | None = None,
) -> None:
self._cookies_path = cookies_path
self._tmp_dir = tmp_dir
self._search_fn = search_fn or _default_search
self._download_fn = download_fn or _default_download
# Only the real library path needs the deps; an injected fn is self-contained.
self._injected = search_fn is not None or download_fn is not None
def info(self) -> SourceInfo:
return SourceInfo(
name=self.name,
label="YouTube Music",
kind=KIND_FETCH,
available=self.is_available(),
)
def is_available(self) -> bool:
return True if self._injected else _libs_available()
async def search(self, query: str, *, limit: int) -> list[SearchResult]:
query = query.strip()
if not query:
return []
try:
rows = await anyio.to_thread.run_sync(functools.partial(self._search_fn, query, limit))
except Exception:
# No results / service down → degrade to empty (plan §5, CLAUDE.md).
log.warning("ytm_search_failed", query=query)
return []
return [r for r in (self._to_result(row) for row in rows) if r is not None]
async def fetch(
self, source_id: str, *, on_progress: ProgressCallback | None = None
) -> DownloadResult:
tmp_dir = self._tmp_dir or Path(tempfile.gettempdir())
def hook(d: dict[str, Any]) -> None:
if on_progress is None or d.get("status") != "downloading":
return
total = d.get("total_bytes") or d.get("total_bytes_estimate")
done = d.get("downloaded_bytes")
if not total or done is None:
return
# Cap below 1.0 — the job only reaches 1.0 once stored + imported.
frac = min(done / total, 0.99)
# Bridge sync hook (worker thread) → async callback (event loop).
anyio.from_thread.run(on_progress, frac)
def _run() -> dict[str, Any]:
return self._download_fn(source_id, tmp_dir, hook, self._cookies_path)
info = await anyio.to_thread.run_sync(_run)
path = Path(info["filepath"])
stat = await anyio.Path(path).stat()
return DownloadResult(
source_id=source_id,
path=path,
file_format=info["file_format"],
file_size=stat.st_size,
bitrate=info.get("bitrate"),
suggested_title=info.get("title") or source_id,
)
async def get_metadata(self, source_id: str) -> RawMetadata | None:
# The search result already carries a usable title/artist, and the
# canonical metadata comes from enrichment (§6.2). A dedicated lookup is
# an optional refinement — skipped for now (returns None gracefully).
return None
def _to_result(self, row: dict[str, Any]) -> SearchResult | None:
video_id = row.get("videoId")
if not video_id:
return None # non-playable row (e.g. a video without audio id)
artists = row.get("artists") or []
artist = ", ".join(a["name"] for a in artists if a.get("name")) or None
album = (row.get("album") or {}).get("name") if isinstance(row.get("album"), dict) else None
thumbnails = row.get("thumbnails") or []
thumbnail = thumbnails[-1].get("url") if thumbnails else None
return SearchResult(
source=self.name,
source_id=str(video_id),
title=row.get("title") or "Unknown",
artist=artist,
album=album,
duration_seconds=row.get("duration_seconds"),
thumbnail_url=thumbnail,
raw=row,
)
def _default_search(query: str, limit: int) -> list[dict[str, Any]]:
"""Real ytmusicapi search (songs only). Runs in a worker thread."""
from ytmusicapi import YTMusic
yt = YTMusic() # unauthenticated: public search needs no login
results: list[dict[str, Any]] = yt.search(query, filter="songs", limit=limit)
return results[:limit]
def _default_download(
video_id: str,
tmp_dir: Path,
progress_hook: Callable[[dict[str, Any]], None],
cookies_path: Path | None,
) -> dict[str, Any]:
"""Real yt-dlp download of the best audio stream. Runs in a worker thread.
Stores the original stream (no transcode — plan §6.3/§6.6). Returns a
normalized dict the adapter maps to :class:`DownloadResult`.
"""
from yt_dlp import YoutubeDL
opts: dict[str, Any] = {
"format": "bestaudio/best",
"outtmpl": str(tmp_dir / "%(id)s.%(ext)s"),
"quiet": True,
"no_warnings": True,
"noprogress": True,
"progress_hooks": [progress_hook],
}
# Use cookies only when the file is actually present: the path can be set
# unconditionally (e.g. a mounted volume that may be empty) and downloads
# still work without it — cookies just unlock age/region-restricted items.
if cookies_path is not None and cookies_path.is_file():
opts["cookiefile"] = str(cookies_path)
with YoutubeDL(opts) as ydl:
info = ydl.extract_info(_watch_url(video_id), download=True)
filepath = Path(ydl.prepare_filename(info))
abr = info.get("abr")
return {
"filepath": filepath,
"file_format": filepath.suffix.lstrip(".").lower() or "m4a",
"bitrate": int(abr) if abr else None,
"title": info.get("title"),
}
+10 -1
@@ -8,7 +8,7 @@ from pathlib import Path
import anyio import anyio
from app.domain.entities.storage import ObjectStat from app.domain.entities.storage import DiskUsage, ObjectStat
from app.domain.errors import StorageError from app.domain.errors import StorageError
_EXT_CONTENT_TYPE: dict[str, str] = { _EXT_CONTENT_TYPE: dict[str, str] = {
@@ -78,6 +78,15 @@ class LocalFileStorage:
async def delete(self, key: str) -> None: async def delete(self, key: str) -> None:
(self._media_path / key).unlink(missing_ok=True) (self._media_path / key).unlink(missing_ok=True)
async def disk_usage(self) -> DiskUsage | None:
# The media root may not exist yet on a fresh instance — walk up to the
# nearest existing ancestor so we still report the underlying volume.
path = self._media_path
while not path.exists() and path != path.parent:
path = path.parent
usage = await anyio.to_thread.run_sync(shutil.disk_usage, str(path))
return DiskUsage(total=usage.total, used=usage.used, free=usage.free)
def as_local_path(self, key: str) -> AbstractAsyncContextManager[Path]: def as_local_path(self, key: str) -> AbstractAsyncContextManager[Path]:
return self._as_local_path_cm(key) return self._as_local_path_cm(key)
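The ancestor walk in `LocalFileStorage.disk_usage` can be tried against a temporary directory. This is a minimal synchronous sketch; `nearest_existing` is a hypothetical helper name, and the real method additionally runs `shutil.disk_usage` in a worker thread.

```python
import shutil
import tempfile
from pathlib import Path


def nearest_existing(path: Path) -> Path:
    """Walk up until a path exists (the filesystem root ends the loop)."""
    while not path.exists() and path != path.parent:
        path = path.parent
    return path


with tempfile.TemporaryDirectory() as root:
    missing = Path(root) / "media" / "library"  # not created on purpose
    probe = nearest_existing(missing)
    found_root = probe == Path(root)
    usage = shutil.disk_usage(probe)  # named tuple with total, used, free
```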
+5 -3
@@ -84,9 +84,7 @@ class S3FileStorage:
async def _stream() -> AsyncGenerator[bytes]: async def _stream() -> AsyncGenerator[bytes]:
async with self._client() as s3: async with self._client() as s3:
try: try:
resp = await s3.get_object( resp = await s3.get_object(Bucket=_bucket, Key=_key, Range=range_header)
Bucket=_bucket, Key=_key, Range=range_header
)
except ClientError as exc: except ClientError as exc:
raise StorageError(str(exc)) from exc raise StorageError(str(exc)) from exc
body = resp["Body"] body = resp["Body"]
@@ -129,6 +127,10 @@ class S3FileStorage:
except ClientError as exc: except ClientError as exc:
raise StorageError(str(exc)) from exc raise StorageError(str(exc)) from exc
async def disk_usage(self) -> None:
# Object stores have no fixed-capacity volume to report.
return None
def as_local_path(self, key: str) -> AbstractAsyncContextManager[Path]: def as_local_path(self, key: str) -> AbstractAsyncContextManager[Path]:
return self._as_local_path_cm(key) return self._as_local_path_cm(key)
+2 -2
@@ -10,7 +10,7 @@ from app.api.health import router as health_router
from app.api.middleware import CorrelationIdMiddleware from app.api.middleware import CorrelationIdMiddleware
from app.api.rest import subsonic_router from app.api.rest import subsonic_router
from app.api.v1 import api_v1_router from app.api.v1 import api_v1_router
from app.core.config import get_settings from app.core.config import app_version, get_settings
from app.core.logging import configure_logging, get_logger from app.core.logging import configure_logging, get_logger
from app.infrastructure.cache import close_redis from app.infrastructure.cache import close_redis
from app.infrastructure.db import dispose_engine from app.infrastructure.db import dispose_engine
@@ -34,7 +34,7 @@ def create_app() -> FastAPI:
app = FastAPI( app = FastAPI(
title="mcma-backend", title="mcma-backend",
version="0.1.0", version=app_version(),
summary="Self-hosted, offline-first music service.", summary="Self-hosted, offline-first music service.",
lifespan=lifespan, lifespan=lifespan,
) )
+8 -2
@@ -1,7 +1,6 @@
"""arq worker settings — the queue runtime. Task functions register here. """arq worker settings — the queue runtime. Task functions register here.
Run with: ``arq app.workers.arq_worker.WorkerSettings``. Run with: ``arq app.workers.arq_worker.WorkerSettings``.
Tasks (download, transcode) are appended to ``functions`` in later steps.
""" """
from typing import Any, ClassVar from typing import Any, ClassVar
@@ -10,8 +9,10 @@ from arq.connections import RedisSettings
from app.core.config import get_settings from app.core.config import get_settings
from app.core.logging import configure_logging, get_logger from app.core.logging import configure_logging, get_logger
from app.workers.tasks.download_task import download_track
from app.workers.tasks.enrich_task import enrich_track from app.workers.tasks.enrich_task import enrich_track
from app.workers.tasks.import_task import scan_local_folder from app.workers.tasks.import_task import scan_local_folder
from app.workers.tasks.materialize_task import materialize_track
log = get_logger("worker") log = get_logger("worker")
@@ -27,7 +28,12 @@ async def shutdown(_ctx: dict[str, Any]) -> None:
class WorkerSettings: class WorkerSettings:
functions: ClassVar[list[Any]] = [scan_local_folder, enrich_track] functions: ClassVar[list[Any]] = [
scan_local_folder,
enrich_track,
download_track,
materialize_track,
]
on_startup = startup on_startup = startup
on_shutdown = shutdown on_shutdown = shutdown
max_jobs = get_settings().max_parallel_downloads max_jobs = get_settings().max_parallel_downloads
+25
@@ -34,6 +34,31 @@ async def enqueue(function: str, **kwargs: Any) -> str:
return str(job.job_id) return str(job.job_id)
async def enqueue_download(job_id: uuid.UUID) -> None:
"""Best-effort enqueue of a download job for the worker.
The job row is already persisted as ``queued``, so this is a follow-up, not a
barrier: if the queue is unreachable we log and move on (graceful
degradation) — the job stays ``queued`` and can be retried later. Deferred a
few seconds so the request's DB transaction commits before the worker reads
the row (same reason as :func:`enqueue_enrich`)."""
try:
await enqueue("download_track", job_id=str(job_id), _defer_by=3)
except DependencyUnavailableError:
log.warning("download_enqueue_failed", job_id=str(job_id))
async def enqueue_materialize(job_id: uuid.UUID) -> None:
"""Best-effort enqueue of a materialize job for the worker (plan: Model C
lazy materialization). Same deferred-commit reasoning as
:func:`enqueue_download` — the job row stays ``queued`` and can be retried
if the queue is unreachable."""
try:
await enqueue("materialize_track", job_id=str(job_id), _defer_by=3)
except DependencyUnavailableError:
log.warning("materialize_enqueue_failed", job_id=str(job_id))
async def enqueue_enrich(track_id: uuid.UUID) -> None: async def enqueue_enrich(track_id: uuid.UUID) -> None:
"""Best-effort enqueue of metadata enrichment for a freshly stored track. """Best-effort enqueue of metadata enrichment for a freshly stored track.
+151
@@ -0,0 +1,151 @@
"""arq task: download one queued job through a fetch source (plan §6.1).
Flow: load job → ``downloading`` → ``backend.fetch`` (progress streamed to the
job row) → ``enriching`` → store file + minimal track → ``done`` → enqueue
enrichment. yt-dlp fails often, so a failed fetch retries with exponential
backoff (``download_max_retries``); only after the last try is the job marked
``failed`` with a reason for the §A5 download manager.
Heavy I/O belongs off the request cycle (CLAUDE.md); the HTTP endpoint only
enqueues. The job row tolerates being deleted mid-flight (cancellation) — status
writes against a missing row are no-ops.
"""
import uuid
from typing import Any
from arq import Retry
from app.application.download_service import DownloadService
from app.core.config import get_settings
from app.core.logging import correlation_id, get_logger
from app.domain.entities.download import DownloadJob
from app.domain.errors import NotFoundError, ValidationError
from app.domain.ports import FetchableSource, ProgressCallback
from app.domain.sources import DownloadResult
from app.infrastructure.db import session_scope
from app.infrastructure.db.repositories import (
SqlAlchemyArtistRepository,
SqlAlchemyDownloadJobRepository,
SqlAlchemyTrackRepository,
)
from app.infrastructure.sources.registry import build_source_registry
from app.infrastructure.storage.provider import get_file_storage
from app.workers.queue import enqueue_enrich
log = get_logger("worker.download")
# Exponential backoff between retries: 30s, 60s, 120s … capped.
_BACKOFF_BASE_SECONDS = 30
_BACKOFF_MAX_SECONDS = 600
# Only write progress when it advances by at least this much (avoid hammering
# the DB on every yt-dlp chunk).
_PROGRESS_STEP = 0.01
async def download_track(_ctx: dict[str, Any], *, job_id: str) -> dict[str, Any]:
correlation_id.set(f"dl:{job_id}")
jid = uuid.UUID(job_id)
settings = get_settings()
job = await _load_job(jid)
if job is None:
log.info("download_job_missing", job_id=job_id) # cancelled before pickup
return {"job_id": job_id, "status": "missing"}
registry = build_source_registry(settings)
try:
backend = registry.fetchable(job.source)
except (NotFoundError, ValidationError) as exc:
await _mark_failed(jid, f"Source unavailable: {exc}")
return {"job_id": job_id, "status": "failed"}
if job.source_id is None:
await _mark_failed(jid, "Job has no source_id to download.")
return {"job_id": job_id, "status": "failed"}
await _set_status(jid, "downloading")
try:
result = await _run_fetch(backend, job.source_id, jid)
except Exception as exc:
return await _handle_failure(jid, exc, settings.download_max_retries, job_id)
try:
track_id = await _import_result(jid, job, result)
except Exception as exc:
log.exception("download_import_failed", job_id=job_id)
await _mark_failed(jid, f"Import failed: {type(exc).__name__}: {exc}")
return {"job_id": job_id, "status": "failed"}
await enqueue_enrich(track_id)
log.info("download_complete", job_id=job_id, track_id=str(track_id))
return {"job_id": job_id, "status": "done", "track_id": str(track_id)}
async def _run_fetch(
backend: FetchableSource, source_id: str, jid: uuid.UUID
) -> DownloadResult:
"""Fetch the file, streaming progress into the job row. A single session is
held for the download so progress writes don't churn connections; each
throttled update is committed so API pollers see it."""
async with session_scope() as session:
repo = SqlAlchemyDownloadJobRepository(session)
last = 0.0
async def on_progress(frac: float) -> None:
nonlocal last
if frac - last < _PROGRESS_STEP:
return
last = frac
await repo.set_progress(jid, frac)
await session.commit()
cb: ProgressCallback = on_progress
return await backend.fetch(source_id, on_progress=cb)
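The write throttling inside `on_progress` is easy to check separately from the database. The sketch below assumes a small hypothetical helper class, `ProgressThrottle`, that applies the same "advance by at least one step" rule as the closure in the diff.

```python
class ProgressThrottle:
    """Accept a progress value only if it advanced by at least `step`."""

    def __init__(self, step: float = 0.01) -> None:
        self._step = step
        self._last = 0.0

    def should_write(self, frac: float) -> bool:
        if frac - self._last < self._step:
            return False
        self._last = frac
        return True


throttle = ProgressThrottle(step=0.01)
# Hypothetical sequence of fractions as a download hook might report them.
written = [f for f in (0.001, 0.005, 0.02, 0.025, 0.5, 0.99) if throttle.should_write(f)]
```

With a step of 0.01, at most about a hundred updates reach the job row per download, however many chunks are reported.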
async def _import_result(jid: uuid.UUID, job: DownloadJob, result: DownloadResult) -> uuid.UUID:
async with session_scope() as session:
repo = SqlAlchemyDownloadJobRepository(session)
await repo.set_status(jid, status="enriching")
service = DownloadService(
jobs=repo,
tracks=SqlAlchemyTrackRepository(session),
artists=SqlAlchemyArtistRepository(session),
storage=get_file_storage(),
)
track_id = await service.store_result(
source=job.source, result=result, requested_by=job.requested_by
)
await repo.set_status(jid, status="done", track_id=track_id)
return track_id
async def _handle_failure(
jid: uuid.UUID, exc: Exception, max_retries: int, job_id: str
) -> dict[str, Any]:
async with session_scope() as session:
tries = await SqlAlchemyDownloadJobRepository(session).increment_retry(jid)
if tries <= max_retries:
backoff = min(_BACKOFF_BASE_SECONDS * 2 ** (tries - 1), _BACKOFF_MAX_SECONDS)
log.warning("download_retry", job_id=job_id, attempt=tries, defer=backoff)
raise Retry(defer=backoff) from exc
log.exception("download_failed", job_id=job_id)
await _mark_failed(jid, f"Download failed after {tries} attempts: {type(exc).__name__}: {exc}")
return {"job_id": job_id, "status": "failed"}
async def _load_job(jid: uuid.UUID) -> DownloadJob | None:
async with session_scope() as session:
return await SqlAlchemyDownloadJobRepository(session).get_by_id(jid)
async def _set_status(jid: uuid.UUID, status: str) -> None:
async with session_scope() as session:
await SqlAlchemyDownloadJobRepository(session).set_status(jid, status=status)
async def _mark_failed(jid: uuid.UUID, error: str) -> None:
async with session_scope() as session:
await SqlAlchemyDownloadJobRepository(session).set_status(
jid, status="failed", error_message=error
)
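The retry delay computed in `_handle_failure` can be tabulated directly. A short sketch, using the two constants from the diff; `backoff_seconds` is a hypothetical helper name.

```python
BACKOFF_BASE_SECONDS = 30
BACKOFF_MAX_SECONDS = 600


def backoff_seconds(tries: int) -> int:
    """Delay before the next attempt: doubles per try, capped at the maximum."""
    return min(BACKOFF_BASE_SECONDS * 2 ** (tries - 1), BACKOFF_MAX_SECONDS)


schedule = [backoff_seconds(n) for n in range(1, 7)]
# schedule == [30, 60, 120, 240, 480, 600]
```

The cap is first hit on the sixth attempt, where the uncapped value would be 960 seconds.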
+34 -14
@@ -19,6 +19,8 @@ from app.infrastructure.db.repositories import (
     SqlAlchemyTrackRepository,
 )
 from app.infrastructure.metadata.acoustid import AcoustIdHttpClient
+from app.infrastructure.metadata.cover_extractor import MutagenCoverExtractor
+from app.infrastructure.metadata.coverart import CoverArtArchiveClient
 from app.infrastructure.metadata.fingerprint import FpcalcFingerprinter
 from app.infrastructure.metadata.tags import MutagenTagReader
 from app.infrastructure.storage.provider import get_file_storage
@@ -28,26 +30,44 @@ log = get_logger("worker.enrich")

 async def enrich_track(_ctx: dict[str, Any], *, track_id: str) -> dict[str, Any]:
     settings = get_settings()
-    api_key = (
-        settings.acoustid_api_key.get_secret_value() if settings.acoustid_api_key else None
-    )
+    api_key = settings.acoustid_api_key.get_secret_value() if settings.acoustid_api_key else None
     acoustid = AcoustIdHttpClient(
         api_key=api_key,
         user_agent=settings.musicbrainz_user_agent,
         api_url=settings.acoustid_api_url,
     )
+    cover_provider = CoverArtArchiveClient(
+        user_agent=settings.musicbrainz_user_agent,
+        enabled=settings.coverart_enabled,
+        base_url=settings.coverart_base_url,
+    )

-    async with session_scope() as session:
-        service = MetadataEnrichmentService(
-            tracks=SqlAlchemyTrackRepository(session),
-            artists=SqlAlchemyArtistRepository(session),
-            albums=SqlAlchemyAlbumRepository(session),
-            storage=get_file_storage(),
-            tag_reader=MutagenTagReader(),
-            fingerprinter=FpcalcFingerprinter(settings.fpcalc_path),
-            acoustid=acoustid,
-        )
-        result = await service.enrich(uuid.UUID(track_id))
+    tid = uuid.UUID(track_id)
+    try:
+        async with session_scope() as session:
+            service = MetadataEnrichmentService(
+                tracks=SqlAlchemyTrackRepository(session),
+                artists=SqlAlchemyArtistRepository(session),
+                albums=SqlAlchemyAlbumRepository(session),
+                storage=get_file_storage(),
+                tag_reader=MutagenTagReader(),
+                fingerprinter=FpcalcFingerprinter(settings.fpcalc_path),
+                acoustid=acoustid,
+                cover_extractor=MutagenCoverExtractor(),
+                cover_provider=cover_provider,
+                acoustid_trust_score=settings.acoustid_trust_score,
+            )
+            result = await service.enrich(tid)
+    except Exception as exc:
+        # The run's own transaction rolled back, leaving the track stuck at
+        # ``pending``. Record the failure in a fresh session so the UI shows a
+        # ``failed`` status with a reason instead of a silent, endless spinner.
+        log.exception("enrich_failed", track_id=track_id)
+        async with session_scope() as session:
+            await SqlAlchemyTrackRepository(session).mark_enrichment_failed(
+                tid, error=f"Enrichment crashed: {type(exc).__name__}: {exc}"
+            )
+        return {"track_id": track_id, "status": "failed", "mbid": None}

     return {
         "track_id": str(result.track_id),
+101
@@ -0,0 +1,101 @@
"""arq task: materialize a remote placeholder track (plan: Model C).
Counterpart to ``download_task`` for tracks that were *saved* from a remote
browse hit without audio (``availability="remote"``, ``storage_uri=NULL``).
The job's ``track_id`` already points at the existing placeholder row — on
success the file is stored and ``TrackRepository.materialize`` fills the row
in place (the track's ``id`` never changes), then enrichment is enqueued as
usual.
Shares its fetch/retry/failure machinery with ``download_task`` — only the
"what happens on success" step differs (fill in an existing row vs. create a
new one).
"""
import contextlib
import uuid
from typing import Any
import anyio
from app.core.config import get_settings
from app.core.logging import correlation_id, get_logger
from app.domain.errors import NotFoundError, ValidationError
from app.domain.sources import DownloadResult
from app.infrastructure.db import session_scope
from app.infrastructure.db.repositories import (
SqlAlchemyDownloadJobRepository,
SqlAlchemyTrackRepository,
)
from app.infrastructure.sources.registry import build_source_registry
from app.infrastructure.storage.provider import get_file_storage
from app.workers.queue import enqueue_enrich
from app.workers.tasks.download_task import _handle_failure, _load_job, _mark_failed, _run_fetch
log = get_logger("worker.materialize")
async def materialize_track(_ctx: dict[str, Any], *, job_id: str) -> dict[str, Any]:
correlation_id.set(f"mat:{job_id}")
jid = uuid.UUID(job_id)
settings = get_settings()
job = await _load_job(jid)
if job is None:
log.info("materialize_job_missing", job_id=job_id) # cancelled before pickup
return {"job_id": job_id, "status": "missing"}
if job.track_id is None or job.source_id is None:
await _mark_failed(jid, "Materialize job missing track_id/source_id.")
return {"job_id": job_id, "status": "failed"}
registry = build_source_registry(settings)
try:
backend = registry.fetchable(job.source)
except (NotFoundError, ValidationError) as exc:
await _mark_failed(jid, f"Source unavailable: {exc}")
return {"job_id": job_id, "status": "failed"}
await _set_status(jid, "downloading")
try:
result = await _run_fetch(backend, job.source_id, jid)
except Exception as exc:
return await _handle_failure(jid, exc, settings.download_max_retries, job_id)
try:
await _materialize_result(jid, job.track_id, result)
except Exception as exc:
log.exception("materialize_finalize_failed", job_id=job_id)
await _mark_failed(jid, f"Materialize failed: {type(exc).__name__}: {exc}")
return {"job_id": job_id, "status": "failed"}
await enqueue_enrich(job.track_id)
log.info("materialize_complete", job_id=job_id, track_id=str(job.track_id))
return {"job_id": job_id, "status": "done", "track_id": str(job.track_id)}
async def _materialize_result(jid: uuid.UUID, track_id: uuid.UUID, result: DownloadResult) -> None:
"""Store the downloaded file and fill in the placeholder track in place."""
key = f"tracks/{str(track_id)[:2]}/{track_id}.{result.file_format}"
storage = get_file_storage()
try:
await storage.save_file(key, result.path)
async with session_scope() as session:
job_repo = SqlAlchemyDownloadJobRepository(session)
await job_repo.set_status(jid, status="enriching")
tracks = SqlAlchemyTrackRepository(session)
await tracks.materialize(
track_id,
storage_uri=key,
file_format=result.file_format,
file_size=result.file_size,
bitrate=result.bitrate,
)
await job_repo.set_status(jid, status="done", track_id=track_id)
finally:
with contextlib.suppress(Exception):
await anyio.Path(result.path).unlink(missing_ok=True)
async def _set_status(jid: uuid.UUID, status: str) -> None:
async with session_scope() as session:
await SqlAlchemyDownloadJobRepository(session).set_status(jid, status=status)
+7
@@ -27,6 +27,9 @@ dependencies = [
     "httpx>=0.28",
     # embedded audio tag reading (enrichment tag pre-pass)
     "mutagen>=1.47",
+    # youtube source: search (ytmusicapi) + download (yt-dlp)
+    "ytmusicapi>=1.8",
+    "yt-dlp>=2024.12.13",
     # S3-compatible object storage
     "aioboto3>=13.0",
     # logging
@@ -95,6 +98,10 @@ ignore_missing_imports = true
 module = ["aioboto3.*", "aiobotocore.*", "botocore.*"]
 ignore_missing_imports = true

+[[tool.mypy.overrides]]
+module = ["ytmusicapi.*", "yt_dlp.*"]
+ignore_missing_imports = true
+
 [tool.pytest.ini_options]
 asyncio_mode = "auto"
 testpaths = ["tests"]
+77
@@ -3,10 +3,28 @@
 The ASGI app is driven in-process via httpx + asgi-lifespan (no network, no
 running server). DB/Redis-backed integration fixtures arrive with the data
 layer (plan §11 step 2).
+
+DB safety
+---------
+Integration fixtures call ``Base.metadata.drop_all`` / ``create_all`` on
+``get_engine()``. That engine is built from ``DATABASE_URL``, which in normal
+runs points at the *developer's* database — ``localhost:5432/mcma`` for host
+``pytest`` and ``db:5432/mcma`` for ``make test-api`` (which execs ``pytest``
+inside the api container). Running the suite there silently wipes real data:
+``drop_all`` removes every ORM table while leaving Alembic's ``alembic_version``
+(it lives outside ``Base.metadata``) — the exact "tables keep disappearing,
+version survives" symptom.
+
+To make that impossible, this module redirects every test to a dedicated
+``<db>_test`` database *before settings/engine load*, and creates it on demand.
+The real dev database is never opened by the test suite.
 """

+import asyncio
 import os
 from collections.abc import AsyncIterator
+from pathlib import Path
+from urllib.parse import urlsplit, urlunsplit

 import pytest
 from asgi_lifespan import LifespanManager
@@ -17,6 +35,65 @@ os.environ.setdefault("ENVIRONMENT", "test")
 os.environ.setdefault("JWT_SECRET", "test-secret")


+def _base_database_url() -> str:
+    """Resolve the DB URL the app *would* use, mirroring pydantic-settings
+    precedence: real env var → ``.env`` file → the app's compiled-in default."""
+    if env := os.environ.get("DATABASE_URL"):
+        return env
+    dotenv = Path(__file__).resolve().parents[1] / ".env"
+    if dotenv.exists():
+        for raw in dotenv.read_text().splitlines():
+            line = raw.strip()
+            if line.startswith("DATABASE_URL=") and not line.startswith("#"):
+                return line.split("=", 1)[1].strip().strip("\"'")
+    return "postgresql+asyncpg://mcma:mcma@localhost:5432/mcma"
+
+
+def _with_database(url: str, name: str) -> str:
+    """Return ``url`` with its database name swapped for ``name``."""
+    return urlunsplit(urlsplit(url)._replace(path=f"/{name}"))
+
+
+_BASE_DB_URL = _base_database_url()
+_BASE_DB_NAME = urlsplit(_BASE_DB_URL).path.lstrip("/")
+# Idempotent: if we're already pointed at a *_test DB, keep it as-is.
+_TEST_DB_NAME = _BASE_DB_NAME if _BASE_DB_NAME.endswith("_test") else f"{_BASE_DB_NAME}_test"
+_TEST_DB_URL = _with_database(_BASE_DB_URL, _TEST_DB_NAME)
+
+# Redirect the whole suite to the test DB before anything reads settings.
+os.environ["DATABASE_URL"] = _TEST_DB_URL
+
+
+async def _create_test_db_if_missing() -> None:
+    """Create ``<db>_test`` if the server is reachable. Best-effort: if Postgres
+    is down the integration fixtures skip on their own reachability probe, so a
+    failure here must stay silent rather than break unit-only runs."""
+    import asyncpg  # type: ignore[import-untyped]  # driver behind postgresql+asyncpg
+
+    # asyncpg wants a plain libpq DSN (no SQLAlchemy "+asyncpg" suffix), against
+    # the always-present ``postgres`` maintenance database.
+    dsn = _with_database(_TEST_DB_URL, "postgres").replace("+asyncpg", "")
+    try:
+        async with asyncio.timeout(5):
+            conn = await asyncpg.connect(dsn)
+    except Exception:
+        return
+    try:
+        exists = await conn.fetchval("SELECT 1 FROM pg_database WHERE datname = $1", _TEST_DB_NAME)
+        if not exists:
+            # CREATE DATABASE can't run inside a transaction; asyncpg's implicit
+            # autocommit on a bare connection handles that.
+            await conn.execute(f'CREATE DATABASE "{_TEST_DB_NAME}"')
+    finally:
+        await conn.close()
+
+
+@pytest.fixture(scope="session", autouse=True)
+def _ensure_test_database() -> None:
+    """Guarantee the dedicated test database exists once per session."""
+    asyncio.run(_create_test_db_if_missing())
+
+
 @pytest.fixture
 async def client() -> AsyncIterator[AsyncClient]:
     from app.main import create_app
+20
@@ -0,0 +1,20 @@
# Test fixtures
## `scarlet_fire_otis_mcdonald.mp3`
"Scarlet Fire" by **Otis McDonald** — a royalty-free / license-free track
(YouTube Audio Library; distributed via Pro-Sound.org). Used as a real-world
audio fixture for the enrichment pipeline.
What makes it a good fixture: its **embedded ID3 tags are junk**
(`title=Sound_13958`, `artist=Music Track`, `album=Музыка`, `genre=Hip Hop & Rap`)
while AcoustID identifies it with very high confidence as *Scarlet Fire /
Otis McDonald*. So it exercises both:
- the offline tag reader (deterministic, always runs), and
- the "trust a high-confidence AcoustID match over junk tags" path
(`acoustid_trust_score`), which only runs when `fpcalc` + an AcoustID API key
+ network are available — see `tests/test_real_file_enrichment.py`.
Because it's license-free, it may also seed a built-in demo track for fresh
instances.
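
The "trust a high-confidence match over junk tags" decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's enrichment code: `Tags`, `Match`, `choose_metadata`, the `0.9` threshold, and the `0.97` score are all hypothetical. Only the junk tag values and the idea of an `acoustid_trust_score` threshold come from this README.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tags:
    """Embedded tag values (hypothetical shape, for illustration only)."""

    title: str
    artist: str


@dataclass(frozen=True)
class Match:
    """A fingerprint lookup result (hypothetical shape)."""

    title: str
    artist: str
    score: float  # confidence in [0.0, 1.0]


def choose_metadata(tags: Tags, match: Match | None, trust_score: float) -> Tags:
    """Prefer the fingerprint match only when its score reaches the trust
    threshold; otherwise keep the embedded tags."""
    if match is not None and match.score >= trust_score:
        return Tags(title=match.title, artist=match.artist)
    return tags


# Junk tags as embedded in the fixture; the match score is an assumed value.
junk = Tags(title="Sound_13958", artist="Music Track")
hit = Match(title="Scarlet Fire", artist="Otis McDonald", score=0.97)

chosen = choose_metadata(junk, hit, trust_score=0.9)      # match wins
fallback = choose_metadata(junk, None, trust_score=0.9)   # offline: tags kept
```

With no match available (no `fpcalc`, API key, or network), the sketch falls back to the embedded tags, which mirrors the "offline tag reader always runs" case.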
Binary file not shown.
+1
@@ -33,6 +33,7 @@ def test_parses_full_recording() -> None:
     assert match.title == "One More Time"
     assert match.artist == "Daft Punk"
     assert match.album == "Discovery"
+    assert match.release_group_mbid == "rg1"
     assert match.score == 0.97
+194
@@ -0,0 +1,194 @@
"""Integration tests for the native cover-art endpoints.
Seeds an album with a stored cover, then exercises the ``/api/v1`` album/track
cover endpoints (token auth, 404 when absent). Requires a reachable Postgres;
skips otherwise.
"""
import asyncio
import os
import uuid
from collections.abc import AsyncIterator
from pathlib import Path
import pytest
from app.core.config import get_settings
from app.infrastructure.db import Base, dispose_engine, get_engine, session_scope
from app.infrastructure.db.repositories import (
SqlAlchemyAlbumRepository,
SqlAlchemyArtistRepository,
SqlAlchemyRefreshTokenRepository,
SqlAlchemyTrackRepository,
SqlAlchemyUserRepository,
)
from app.infrastructure.storage.provider import get_file_storage
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient
pytestmark = pytest.mark.asyncio
# A minimal valid 1x1 PNG.
_PNG_BYTES = bytes.fromhex(
"89504e470d0a1a0a0000000d4948445200000001000000010802000000907753"
"de0000000c4944415408d763f8cfc0f01f0005000155a2b4f60000000049454e44ae426082"
)
_db_reachable_cache: bool | None = None
async def _db_reachable() -> bool:
global _db_reachable_cache
if _db_reachable_cache is not None:
return _db_reachable_cache
from sqlalchemy import text
try:
async with asyncio.timeout(3):
async with get_engine().connect() as conn:
await conn.execute(text("SELECT 1"))
_db_reachable_cache = True
except Exception:
_db_reachable_cache = False
return _db_reachable_cache
async def _seed_album_with_cover(*, with_cover: bool) -> tuple[uuid.UUID, uuid.UUID]:
"""Create an artist + album (+ optional cover file) + track. Returns
``(album_id, track_id)``."""
async with session_scope() as session:
artist = await SqlAlchemyArtistRepository(session).get_or_create("Coverart Artist")
album = await SqlAlchemyAlbumRepository(session).get_or_create(
title="Coverart Album", artist_id=artist.id, year=2020, musicbrainz_id=None
)
track = await SqlAlchemyTrackRepository(session).add(
id=uuid.uuid4(),
title="Coverart Track",
artist_id=artist.id,
storage_uri="tracks/zz/cover-track.mp3",
file_format="mp3",
file_size=10,
source="upload",
source_id="cover-src",
metadata_status="enriched",
added_by=None,
)
# Link the track to the album (add() doesn't take album_id).
await SqlAlchemyTrackRepository(session).apply_enrichment(
track.id,
title="Coverart Track",
artist_id=artist.id,
album_id=album.id,
genre=None,
year=2020,
track_number=1,
duration_seconds=1,
bitrate=None,
acoustid_fingerprint=None,
musicbrainz_id=None,
metadata_status="enriched",
)
if with_cover:
key = f"covers/{album.id}.png"
import tempfile
with tempfile.NamedTemporaryFile(suffix=".png") as tmp:
tmp.write(_PNG_BYTES)
tmp.flush()
await get_file_storage().save_file(key, Path(tmp.name))
await SqlAlchemyAlbumRepository(session).set_cover_path(album.id, key)
return album.id, track.id
@pytest.fixture
async def api(tmp_path: Path) -> AsyncIterator[AsyncClient]:
if not await _db_reachable():
pytest.skip("Postgres not reachable — integration test skipped.")
os.environ["MEDIA_PATH"] = str(tmp_path)
get_settings.cache_clear()
import app.infrastructure.storage.provider as _storage_provider
_storage_provider._storage = None
try:
async with get_engine().begin() as conn:
await conn.run_sync(Base.metadata.drop_all)
await conn.run_sync(Base.metadata.create_all)
from app.application.user_service import UserService
from app.core.security import Argon2PasswordHasher
async with session_scope() as session:
await UserService(
users=SqlAlchemyUserRepository(session),
refresh_tokens=SqlAlchemyRefreshTokenRepository(session),
hasher=Argon2PasswordHasher(),
).create_user(username="testuser", password="testpass1", is_superuser=False)
from app.main import create_app
app = create_app()
async with LifespanManager(app):
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
yield client
async with get_engine().begin() as conn:
await conn.run_sync(Base.metadata.drop_all)
await dispose_engine()
finally:
_storage_provider._storage = None
os.environ.pop("MEDIA_PATH", None)
get_settings.cache_clear()
async def _login(api: AsyncClient) -> str:
resp = await api.post(
"/api/v1/auth/login", json={"username": "testuser", "password": "testpass1"}
)
assert resp.status_code == 200
return str(resp.json()["access_token"])
async def test_album_cover_served(api: AsyncClient) -> None:
token = await _login(api)
album_id, _ = await _seed_album_with_cover(with_cover=True)
resp = await api.get(f"/api/v1/albums/{album_id}/cover?token={token}")
assert resp.status_code == 200, resp.text
assert resp.headers["content-type"] == "image/png"
assert resp.content == _PNG_BYTES
async def test_track_cover_served_from_album(api: AsyncClient) -> None:
token = await _login(api)
_, track_id = await _seed_album_with_cover(with_cover=True)
resp = await api.get(f"/api/v1/tracks/{track_id}/cover?token={token}")
assert resp.status_code == 200, resp.text
assert resp.headers["content-type"] == "image/png"
assert resp.content == _PNG_BYTES
async def test_album_without_cover_is_404(api: AsyncClient) -> None:
token = await _login(api)
album_id, _ = await _seed_album_with_cover(with_cover=False)
resp = await api.get(f"/api/v1/albums/{album_id}/cover?token={token}")
assert resp.status_code == 404
async def test_cover_requires_auth(api: AsyncClient) -> None:
album_id, _ = await _seed_album_with_cover(with_cover=True)
resp = await api.get(f"/api/v1/albums/{album_id}/cover")
assert resp.status_code == 401
async def test_album_appears_with_has_cover_flag(api: AsyncClient) -> None:
token = await _login(api)
album_id, _ = await _seed_album_with_cover(with_cover=True)
resp = await api.get(f"/api/v1/albums/{album_id}", headers={"Authorization": f"Bearer {token}"})
assert resp.status_code == 200
assert resp.json()["has_cover"] is True
+222
@@ -0,0 +1,222 @@
"""Unit tests for DownloadService — DB-free, in-memory fakes."""
import datetime as dt
import uuid
from pathlib import Path
import pytest
from app.application.download_service import DownloadService
from app.domain.entities import Artist, Track
from app.domain.entities.download import DownloadJob
from app.domain.sources import DownloadResult
pytestmark = pytest.mark.asyncio
class FakeArtistRepo:
async def get_or_create(self, name: str) -> Artist:
now = dt.datetime.now(dt.UTC)
return Artist(
id=uuid.uuid4(),
name=name,
source=None,
source_id=None,
created_at=now,
updated_at=now,
)
class FakeTrackRepo:
def __init__(self) -> None:
self.by_source: dict[tuple[str, str], Track] = {}
self.added: list[Track] = []
async def get_by_source(self, source: str, source_id: str) -> Track | None:
return self.by_source.get((source, source_id))
async def add(self, **kw: object) -> Track:
now = dt.datetime.now(dt.UTC)
track = Track(
id=kw["id"], # type: ignore[arg-type]
title=str(kw["title"]),
artist_id=kw["artist_id"], # type: ignore[arg-type]
album_id=None,
storage_uri=str(kw["storage_uri"]),
file_format=str(kw["file_format"]),
file_size=int(kw["file_size"]), # type: ignore[call-overload]
source=str(kw["source"]),
source_id=str(kw["source_id"]),
duration_seconds=None,
genre=None,
year=None,
track_number=None,
metadata_status=str(kw["metadata_status"]),
metadata_error=None,
enriched_at=None,
availability="local",
created_at=now,
updated_at=now,
)
self.by_source[(track.source, track.source_id)] = track
self.added.append(track)
return track
class FakeStorage:
def __init__(self) -> None:
self.saved: dict[str, Path] = {}
self.deleted: list[str] = []
async def save_file(self, key: str, src_path: Path) -> int:
self.saved[key] = src_path
return 1
async def delete(self, key: str) -> None:
self.deleted.append(key)
class FakeJobRepo:
def __init__(self) -> None:
self.jobs: dict[uuid.UUID, DownloadJob] = {}
self.active: dict[tuple[str, str], DownloadJob] = {}
def _make(self, **kw: object) -> DownloadJob:
now = dt.datetime.now(dt.UTC)
return DownloadJob(
id=uuid.uuid4(),
source=str(kw["source"]),
source_id=kw.get("source_id"), # type: ignore[arg-type]
query=kw.get("query"), # type: ignore[arg-type]
requested_by=kw.get("requested_by"), # type: ignore[arg-type]
status="queued",
progress=0.0,
error_message=None,
retry_count=0,
track_id=None,
created_at=now,
updated_at=now,
)
async def add(self, **kw: object) -> DownloadJob:
job = self._make(**kw)
self.jobs[job.id] = job
return job
async def get_by_id(self, job_id: uuid.UUID) -> DownloadJob | None:
return self.jobs.get(job_id)
async def get_active_for_source(self, source: str, source_id: str) -> DownloadJob | None:
return self.active.get((source, source_id))
async def set_status(self, job_id: uuid.UUID, **kw: object) -> None: ...
async def delete(self, job_id: uuid.UUID) -> None:
self.jobs.pop(job_id, None)
def _service(
*, jobs: FakeJobRepo, tracks: FakeTrackRepo, storage: FakeStorage, enqueued: list[uuid.UUID]
) -> DownloadService:
async def enqueue_download(job_id: uuid.UUID) -> None:
enqueued.append(job_id)
return DownloadService(
jobs=jobs, # type: ignore[arg-type]
tracks=tracks, # type: ignore[arg-type]
artists=FakeArtistRepo(), # type: ignore[arg-type]
storage=storage, # type: ignore[arg-type]
enqueue_download=enqueue_download,
)
def _track(source: str, source_id: str) -> Track:
now = dt.datetime.now(dt.UTC)
return Track(
id=uuid.uuid4(),
title="t",
artist_id=uuid.uuid4(),
album_id=None,
storage_uri="k",
file_format="mp3",
file_size=1,
source=source,
source_id=source_id,
duration_seconds=None,
genre=None,
year=None,
track_number=None,
metadata_status="pending",
metadata_error=None,
enriched_at=None,
availability="local",
created_at=now,
updated_at=now,
)
async def test_request_dedups_against_library() -> None:
jobs, tracks, storage, enq = FakeJobRepo(), FakeTrackRepo(), FakeStorage(), []
tracks.by_source[("youtube", "abc")] = _track("youtube", "abc")
svc = _service(jobs=jobs, tracks=tracks, storage=storage, enqueued=enq)
result = await svc.request(source="youtube", source_id="abc", query=None, requested_by=None)
assert result.already_in_library is True
assert result.track_id is not None
assert result.job is None
assert enq == [] # nothing enqueued
async def test_request_returns_existing_active_job() -> None:
jobs, tracks, storage, enq = FakeJobRepo(), FakeTrackRepo(), FakeStorage(), []
existing = await jobs.add(source="youtube", source_id="abc", query=None, requested_by=None)
jobs.active[("youtube", "abc")] = existing
svc = _service(jobs=jobs, tracks=tracks, storage=storage, enqueued=enq)
result = await svc.request(source="youtube", source_id="abc", query=None, requested_by=None)
assert result.already_in_library is False
assert result.job is not None
assert result.job.id == existing.id
assert enq == [] # not re-enqueued
async def test_request_creates_and_enqueues_new_job() -> None:
jobs, tracks, storage, enq = FakeJobRepo(), FakeTrackRepo(), FakeStorage(), []
svc = _service(jobs=jobs, tracks=tracks, storage=storage, enqueued=enq)
result = await svc.request(
source="youtube", source_id="abc", query="bohemian", requested_by=None
)
assert result.already_in_library is False
assert result.job is not None
assert enq == [result.job.id]
async def test_store_result_imports_and_cleans_temp(tmp_path: Path) -> None:
jobs, tracks, storage, enq = FakeJobRepo(), FakeTrackRepo(), FakeStorage(), []
svc = _service(jobs=jobs, tracks=tracks, storage=storage, enqueued=enq)
audio = tmp_path / "abc.m4a"
audio.write_bytes(b"audio" * 20)
result = DownloadResult(
source_id="abc",
path=audio,
file_format="m4a",
file_size=100,
bitrate=160,
suggested_title="Bohemian Rhapsody",
)
track_id = await svc.store_result(source="youtube", result=result, requested_by=None)
assert len(tracks.added) == 1
stored = tracks.added[0]
assert stored.id == track_id
assert stored.source == "youtube"
assert stored.source_id == "abc"
assert stored.metadata_status == "pending"
assert stored.title == "Bohemian Rhapsody"
assert len(storage.saved) == 1
assert not audio.exists() # temp file removed
+360
@@ -0,0 +1,360 @@
"""Integration tests for downloads + external search.
Requires a reachable Postgres; skips otherwise. The download worker task is
invoked directly (no Redis needed) against a fake fetch source, so the full
DB + storage import path is covered without touching the network.
"""
import asyncio
import os
from collections.abc import AsyncIterator
from pathlib import Path
from typing import Any
import pytest
from app.core.config import get_settings
from app.domain.sources import KIND_FETCH, DownloadResult, SearchResult, SourceInfo
from app.infrastructure.db import Base, dispose_engine, get_engine, session_scope
from app.infrastructure.db.repositories import (
SqlAlchemyRefreshTokenRepository,
SqlAlchemyUserRepository,
)
from app.infrastructure.sources.registry import SourceRegistry
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient
pytestmark = pytest.mark.asyncio
_db_reachable_cache: bool | None = None
async def _db_reachable() -> bool:
global _db_reachable_cache
if _db_reachable_cache is not None:
return _db_reachable_cache
from sqlalchemy import text
try:
async with asyncio.timeout(3):
async with get_engine().connect() as conn:
await conn.execute(text("SELECT 1"))
_db_reachable_cache = True
except Exception:
_db_reachable_cache = False
return _db_reachable_cache
class FakeFetchSource:
"""A searchable + fetchable source that writes a local file (no network)."""
name = "youtube"
def __init__(self, tmp_dir: Path) -> None:
self._tmp_dir = tmp_dir
def info(self) -> SourceInfo:
return SourceInfo(name=self.name, label="YouTube Music", kind=KIND_FETCH, available=True)
def is_available(self) -> bool:
return True
async def search(self, query: str, *, limit: int) -> list[SearchResult]:
return [
SearchResult(
source=self.name,
source_id="vid-1",
title=f"{query} song",
artist="Some Artist",
album="Some Album",
duration_seconds=200,
thumbnail_url="http://img/large.jpg",
)
]
async def fetch(self, source_id: str, *, on_progress: Any = None) -> DownloadResult:
path = self._tmp_dir / f"{source_id}.webm"
path.write_bytes(b"downloaded audio bytes" * 8)
if on_progress is not None:
await on_progress(0.5)
return DownloadResult(
source_id=source_id,
path=path,
file_format="webm",
file_size=path.stat().st_size,
bitrate=160,
suggested_title=f"Title for {source_id}",
)
async def get_metadata(self, source_id: str) -> None:
return None
@pytest.fixture
async def api(tmp_path: Path) -> AsyncIterator[AsyncClient]:
if not await _db_reachable():
pytest.skip("Postgres not reachable — integration test skipped.")
media = tmp_path / "media"
media.mkdir()
os.environ["MEDIA_PATH"] = str(media)
get_settings.cache_clear()
import app.infrastructure.storage.provider as _storage_provider
_storage_provider._storage = None
try:
async with get_engine().begin() as conn:
await conn.run_sync(Base.metadata.drop_all)
await conn.run_sync(Base.metadata.create_all)
from app.application.user_service import UserService
from app.core.security import Argon2PasswordHasher
async with session_scope() as session:
await UserService(
users=SqlAlchemyUserRepository(session),
refresh_tokens=SqlAlchemyRefreshTokenRepository(session),
hasher=Argon2PasswordHasher(),
).create_user(username="admin", password="adminpass1", is_superuser=True)
from app.api.deps import get_source_registry
from app.main import create_app
app = create_app()
# Inject a fake fetch source so search/download never hit the network.
fake_registry = SourceRegistry([FakeFetchSource(tmp_path / "dl")]) # type: ignore[list-item]
(tmp_path / "dl").mkdir()
app.dependency_overrides[get_source_registry] = lambda: fake_registry
async with LifespanManager(app):
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
yield client
async with get_engine().begin() as conn:
await conn.run_sync(Base.metadata.drop_all)
await dispose_engine()
finally:
_storage_provider._storage = None
os.environ.pop("MEDIA_PATH", None)
get_settings.cache_clear()
async def _login(api: AsyncClient) -> str:
resp = await api.post(
"/api/v1/auth/login", json={"username": "admin", "password": "adminpass1"}
)
assert resp.status_code == 200
return str(resp.json()["access_token"])
async def test_search_aggregates_fetch_sources(api: AsyncClient) -> None:
token = await _login(api)
headers = {"Authorization": f"Bearer {token}"}
resp = await api.get("/api/v1/search", params={"q": "queen"}, headers=headers)
assert resp.status_code == 200
body = resp.json()
assert body["searched_sources"] == ["youtube"]
assert len(body["results"]) == 1
hit = body["results"][0]
assert hit["source"] == "youtube"
assert hit["source_id"] == "vid-1"
assert hit["title"] == "queen song"
async def test_search_reports_library_status(api: AsyncClient) -> None:
"""Remote browse (plan: Model C) — a fresh hit isn't in the library; after
saving it as a placeholder, the same search reports it as such."""
token = await _login(api)
headers = {"Authorization": f"Bearer {token}"}
resp = await api.get("/api/v1/search", params={"q": "queen"}, headers=headers)
hit = resp.json()["results"][0]
assert hit["in_library"] is False
assert hit["track_id"] is None
assert hit["availability"] is None
save = await api.post(
"/api/v1/tracks/remote",
json={
"source": hit["source"],
"source_id": hit["source_id"],
"title": hit["title"],
"artist": hit["artist"],
},
headers=headers,
)
assert save.status_code == 201
saved = save.json()
assert saved["availability"] == "remote"
assert saved["file_format"] is None
resp2 = await api.get("/api/v1/search", params={"q": "queen"}, headers=headers)
hit2 = resp2.json()["results"][0]
assert hit2["in_library"] is True
assert hit2["track_id"] == saved["id"]
assert hit2["availability"] == "remote"
async def test_save_remote_is_idempotent(api: AsyncClient) -> None:
token = await _login(api)
headers = {"Authorization": f"Bearer {token}"}
payload = {"source": "youtube", "source_id": "vid-idem", "title": "A", "artist": "Artist"}
first = await api.post("/api/v1/tracks/remote", json=payload, headers=headers)
second = await api.post("/api/v1/tracks/remote", json=payload, headers=headers)
assert first.status_code == 201
assert second.status_code == 201
assert first.json()["id"] == second.json()["id"]
async def test_materialize_flow(
api: AsyncClient, tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """Save a placeholder, materialize it on demand, and confirm it streams
    afterwards (plan: Model C lazy materialization)."""
    token = await _login(api)
    headers = {"Authorization": f"Bearer {token}"}
    save = await api.post(
        "/api/v1/tracks/remote",
        json={
            "source": "youtube",
            "source_id": "vid-mat-1",
            "title": "Materialize Me",
            "artist": "Artist",
        },
        headers=headers,
    )
    track_id = save.json()["id"]
    assert save.json()["availability"] == "remote"

    # Streaming a placeholder before materialization fails (no audio yet).
    stream_before = await api.get(f"/api/v1/stream/{track_id}", headers=headers)
    assert stream_before.status_code == 404

    materialize = await api.post(f"/api/v1/tracks/{track_id}/materialize", headers=headers)
    assert materialize.status_code == 200
    body = materialize.json()
    assert body["job"] is not None
    job_id = body["job"]["id"]
    assert body["job"]["track_id"] == track_id

    # A second materialize request reuses the same in-flight job.
    again = await api.post(f"/api/v1/tracks/{track_id}/materialize", headers=headers)
    assert again.json()["job"]["id"] == job_id

    # Run the worker task directly (bypasses Redis) with the fake fetch source.
    import app.workers.tasks.materialize_task as mat_task

    worker_dir = tmp_path / "worker-mat"
    worker_dir.mkdir()
    fake = SourceRegistry([FakeFetchSource(worker_dir)])  # type: ignore[list-item]
    monkeypatch.setattr(mat_task, "build_source_registry", lambda _settings: fake)
    result = await mat_task.materialize_track({}, job_id=job_id)
    assert result["status"] == "done"
    assert result["track_id"] == track_id

    got = await api.get(f"/api/v1/tracks/{track_id}", headers=headers)
    assert got.json()["availability"] == "local"
    assert got.json()["file_format"] == "webm"

    # Streaming now works.
    stream_after = await api.get(f"/api/v1/stream/{track_id}", headers=headers)
    assert stream_after.status_code == 200

    # Materializing an already-local track is a no-op.
    noop = await api.post(f"/api/v1/tracks/{track_id}/materialize", headers=headers)
    assert noop.json()["job"] is None


async def test_source_scoped_search(api: AsyncClient) -> None:
    token = await _login(api)
    headers = {"Authorization": f"Bearer {token}"}
    resp = await api.get("/api/v1/sources/youtube/search", params={"q": "abba"}, headers=headers)
    assert resp.status_code == 200
    assert resp.json()["results"][0]["title"] == "abba song"


async def test_download_create_list_and_complete(
    api: AsyncClient, tmp_path: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    token = await _login(api)
    headers = {"Authorization": f"Bearer {token}"}

    # Request a download — Redis is absent, so enqueue degrades but the job persists.
    create = await api.post(
        "/api/v1/downloads",
        json={"source": "youtube", "source_id": "vid-1", "query": "queen"},
        headers=headers,
    )
    assert create.status_code == 202
    body = create.json()
    assert body["already_in_library"] is False
    job_id = body["job"]["id"]
    assert body["job"]["status"] == "queued"

    # It shows up in the listing.
    listing = await api.get("/api/v1/downloads", headers=headers)
    assert listing.status_code == 200
    assert any(j["id"] == job_id for j in listing.json()["items"])

    # A duplicate request returns the same in-flight job, not a new one.
    dup = await api.post(
        "/api/v1/downloads",
        json={"source": "youtube", "source_id": "vid-1"},
        headers=headers,
    )
    assert dup.json()["job"]["id"] == job_id

    # Run the worker task directly (bypasses Redis) with the fake fetch source.
    import app.workers.tasks.download_task as dl_task

    worker_dl = tmp_path / "worker-dl"
    worker_dl.mkdir()
    fake = SourceRegistry([FakeFetchSource(worker_dl)])  # type: ignore[list-item]
    monkeypatch.setattr(dl_task, "build_source_registry", lambda _settings: fake)
    result = await dl_task.download_track({}, job_id=job_id)
    assert result["status"] == "done"
    track_id = result["track_id"]

    # The job is now done and linked to the imported track.
    got = await api.get(f"/api/v1/downloads/{job_id}", headers=headers)
    assert got.json()["status"] == "done"
    assert got.json()["track_id"] == track_id

    # The imported track streams back.
    stream = await api.get(f"/api/v1/stream/{track_id}", headers=headers)
    assert stream.status_code == 200
    assert len(stream.content) > 0

    # A new request for the same item now dedups against the library.
    again = await api.post(
        "/api/v1/downloads",
        json={"source": "youtube", "source_id": "vid-1"},
        headers=headers,
    )
    assert again.json()["already_in_library"] is True
    assert again.json()["track_id"] == track_id


async def test_cancel_download(api: AsyncClient) -> None:
    token = await _login(api)
    headers = {"Authorization": f"Bearer {token}"}
    create = await api.post(
        "/api/v1/downloads",
        json={"source": "youtube", "source_id": "vid-cancel"},
        headers=headers,
    )
    job_id = create.json()["job"]["id"]

    cancel = await api.delete(f"/api/v1/downloads/{job_id}", headers=headers)
    assert cancel.status_code == 204

    got = await api.get(f"/api/v1/downloads/{job_id}", headers=headers)
    assert got.status_code == 404
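
The materialize test above exercises three behaviours: the first request creates a job, a second request while that job is in flight returns the same job, and a request for a track that is already local returns no job. A minimal in-memory sketch of that rule follows. `Job` and `InMemoryMaterializer` are hypothetical names invented for illustration; this is not the project's `RemoteLibraryService`, whose internals are not shown here.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; the real DownloadJob entity has more fields."""

    track_id: uuid.UUID
    status: str = "queued"
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class InMemoryMaterializer:
    """Illustrative stand-in for the logic behind POST /tracks/{id}/materialize."""

    def __init__(self) -> None:
        self.availability: dict = {}  # track id -> "remote" | "local"
        self.jobs: dict = {}  # track id -> latest Job

    def request(self, track_id: uuid.UUID) -> Optional[Job]:
        # Already-local tracks need no work: report "no job".
        if self.availability[track_id] == "local":
            return None
        # Reuse a job that is still in flight instead of enqueueing a duplicate.
        job = self.jobs.get(track_id)
        if job is not None and job.status in {"queued", "running"}:
            return job
        job = Job(track_id=track_id)
        self.jobs[track_id] = job
        return job

    def complete(self, track_id: uuid.UUID) -> None:
        # What the worker would do on success: finish the job, flip availability.
        self.jobs[track_id].status = "done"
        self.availability[track_id] = "local"
```

Keying the in-flight lookup by track id keeps the endpoint idempotent from the client's point of view, which is what lets a player call it on every play attempt without piling up jobs.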
+16 -1
@@ -20,7 +20,14 @@ class FakeArtistRepo:
     async def get_or_create(self, name: str) -> Artist:
         if name not in self._by_name:
             now = dt.datetime.now(dt.UTC)
-            self._by_name[name] = Artist(id=uuid.uuid4(), name=name, created_at=now, updated_at=now)
+            self._by_name[name] = Artist(
+                id=uuid.uuid4(),
+                name=name,
+                source=None,
+                source_id=None,
+                created_at=now,
+                updated_at=now,
+            )
         return self._by_name[name]
@@ -51,7 +58,11 @@ class FakeTrackRepo:
             duration_seconds=None,
             genre=None,
             year=None,
+            track_number=None,
             metadata_status=str(kw["metadata_status"]),
+            metadata_error=None,
+            enriched_at=None,
+            availability="local",
             created_at=now,
             updated_at=now,
         )
@@ -132,7 +143,11 @@ async def test_dedup_skips_already_imported() -> None:
         duration_seconds=None,
         genre=None,
         year=None,
+        track_number=None,
         metadata_status="pending",
+        metadata_error=None,
+        enriched_at=None,
+        availability="local",
         created_at=now,
         updated_at=now,
     )
+156
@@ -0,0 +1,156 @@
"""Integration tests for the lazy-materialization foundation:
``TrackRepository.materialize`` and ``Album``/``ArtistRepository.get_or_create_remote``.

Requires a reachable Postgres; skips otherwise (same pattern as
``test_upload_stream_api.py``).
"""

import asyncio
import uuid
from collections.abc import AsyncIterator

import pytest
from app.infrastructure.db import Base, dispose_engine, get_engine, session_scope
from app.infrastructure.db.models.enums import TrackAvailability
from app.infrastructure.db.repositories import (
    SqlAlchemyAlbumRepository,
    SqlAlchemyArtistRepository,
    SqlAlchemyTrackRepository,
)

pytestmark = pytest.mark.asyncio

_db_reachable_cache: bool | None = None


async def _db_reachable() -> bool:
    global _db_reachable_cache
    if _db_reachable_cache is not None:
        return _db_reachable_cache
    from sqlalchemy import text

    try:
        async with asyncio.timeout(3):
            async with get_engine().connect() as conn:
                await conn.execute(text("SELECT 1"))
        _db_reachable_cache = True
    except Exception:
        _db_reachable_cache = False
    return _db_reachable_cache


@pytest.fixture
async def db() -> AsyncIterator[None]:
    if not await _db_reachable():
        pytest.skip("Postgres not reachable — integration test skipped.")
    async with get_engine().begin() as conn:
        await conn.run_sync(Base.metadata.drop_all)
        await conn.run_sync(Base.metadata.create_all)
    yield None
    async with get_engine().begin() as conn:
        await conn.run_sync(Base.metadata.drop_all)
    await dispose_engine()


async def test_placeholder_track_materializes_in_place(db: None) -> None:
    """A remote placeholder (no storage) gets its audio fields filled in by
    ``materialize`` without changing ``track.id`` — the stable content id that
    likes/playlists/queue may already reference."""
    async with session_scope() as session:
        artists = SqlAlchemyArtistRepository(session)
        tracks = SqlAlchemyTrackRepository(session)
        artist = await artists.get_or_create("Some Artist")
        track_id = uuid.uuid4()
        placeholder = await tracks.add(
            id=track_id,
            title="Remote Track",
            artist_id=artist.id,
            storage_uri=None,
            file_format=None,
            file_size=None,
            source="youtube",
            source_id="abc123",
            metadata_status="pending",
            added_by=None,
            availability=TrackAvailability.REMOTE.value,
        )
        assert placeholder.availability == "remote"
        assert placeholder.storage_uri is None

        materialized = await tracks.materialize(
            track_id,
            storage_uri="tracks/ab/abc123.m4a",
            file_format="m4a",
            file_size=12345,
            bitrate=160,
        )
        assert materialized.id == track_id
        assert materialized.availability == "local"
        assert materialized.storage_uri == "tracks/ab/abc123.m4a"
        assert materialized.file_format == "m4a"
        assert materialized.file_size == 12345


async def test_artist_get_or_create_remote_dedups_by_remote_id(db: None) -> None:
    async with session_scope() as session:
        artists = SqlAlchemyArtistRepository(session)
        first = await artists.get_or_create_remote(
            name="Daft Punk", source="youtube", source_id="UCabc"
        )
        again = await artists.get_or_create_remote(
            name="Daft Punk (different display name)", source="youtube", source_id="UCabc"
        )
        assert first.id == again.id
        assert again.source == "youtube"
        assert again.source_id == "UCabc"


async def test_artist_get_or_create_remote_binds_existing_local_artist(db: None) -> None:
    async with session_scope() as session:
        artists = SqlAlchemyArtistRepository(session)
        local = await artists.get_or_create("Pink Floyd")
        remote = await artists.get_or_create_remote(
            name="Pink Floyd", source="youtube", source_id="UCxyz"
        )
        assert remote.id == local.id
        assert remote.source == "youtube"
        assert remote.source_id == "UCxyz"


async def test_album_get_or_create_remote_dedups_by_remote_id(db: None) -> None:
    async with session_scope() as session:
        artists = SqlAlchemyArtistRepository(session)
        albums = SqlAlchemyAlbumRepository(session)
        artist = await artists.get_or_create("Daft Punk")
        first = await albums.get_or_create_remote(
            title="Discovery",
            artist_id=artist.id,
            year=2001,
            musicbrainz_id=None,
            source="youtube",
            source_id="MPREb_abc",
        )
        again = await albums.get_or_create_remote(
            title="Discovery",
            artist_id=artist.id,
            year=None,
            musicbrainz_id=None,
            source="youtube",
            source_id="MPREb_abc",
        )
        assert first.id == again.id
        assert again.source == "youtube"
        assert again.source_id == "MPREb_abc"
        assert again.year == 2001
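
The artist tests above imply a lookup order for `get_or_create_remote`: match on the remote identity `(source, source_id)` first, otherwise bind an existing local artist with the same name, otherwise create a new row. The sketch below shows that order with an in-memory list. `ArtistRow` and `InMemoryArtistRepo` are hypothetical, and the real SQLAlchemy repository may resolve conflicts differently (for example, what happens when a same-named artist is already bound to another remote id is an assumption here).

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArtistRow:
    """Hypothetical row; mirrors only the fields the tests assert on."""

    name: str
    source: Optional[str] = None
    source_id: Optional[str] = None
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class InMemoryArtistRepo:
    def __init__(self) -> None:
        self.rows: list = []

    def get_or_create(self, name: str) -> ArtistRow:
        for row in self.rows:
            if row.name == name:
                return row
        row = ArtistRow(name=name)
        self.rows.append(row)
        return row

    def get_or_create_remote(self, *, name: str, source: str, source_id: str) -> ArtistRow:
        # 1. The remote identity wins, even if the display name has changed.
        for row in self.rows:
            if row.source == source and row.source_id == source_id:
                return row
        # 2. Bind a not-yet-bound local artist that has the same name.
        for row in self.rows:
            if row.name == name and row.source is None:
                row.source, row.source_id = source, source_id
                return row
        # 3. Nothing matched: create a new remote-bound artist.
        row = ArtistRow(name=name, source=source, source_id=source_id)
        self.rows.append(row)
        return row
```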
+7 -2
@@ -48,12 +48,17 @@ def test_info_reports_kind_and_availability(tmp_path: Path) -> None:
 def test_registry_registers_local_when_path_set(tmp_path: Path) -> None:
-    registry = build_source_registry(_settings(local_media_import_path=tmp_path))
+    # Disable youtube to isolate the local-source registration under test.
+    registry = build_source_registry(
+        _settings(local_media_import_path=tmp_path, youtube_enabled=False)
+    )
     names = {info.name for info in registry.infos()}
     assert names == {"local"}
     assert registry.indexable("local").is_available() is True


 def test_registry_empty_when_path_unset() -> None:
-    registry = build_source_registry(_settings(local_media_import_path=None))
+    registry = build_source_registry(
+        _settings(local_media_import_path=None, youtube_enabled=False)
+    )
     assert registry.infos() == []
+198
@@ -0,0 +1,198 @@
"""Integration tests for the metadata-editor endpoints (§A7, §1H).

Requires a reachable Postgres; skips otherwise.
"""

import asyncio
import os
from collections.abc import AsyncIterator
from pathlib import Path

import pytest
from app.core.config import get_settings
from app.infrastructure.db import Base, dispose_engine, get_engine, session_scope
from app.infrastructure.db.repositories import (
    SqlAlchemyRefreshTokenRepository,
    SqlAlchemyUserRepository,
)
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient

pytestmark = pytest.mark.asyncio

_db_reachable_cache: bool | None = None


async def _db_reachable() -> bool:
    global _db_reachable_cache
    if _db_reachable_cache is not None:
        return _db_reachable_cache
    from sqlalchemy import text

    try:
        async with asyncio.timeout(3):
            async with get_engine().connect() as conn:
                await conn.execute(text("SELECT 1"))
        _db_reachable_cache = True
    except Exception:
        _db_reachable_cache = False
    return _db_reachable_cache


@pytest.fixture
async def api(tmp_path: Path) -> AsyncIterator[AsyncClient]:
    if not await _db_reachable():
        pytest.skip("Postgres not reachable — integration test skipped.")
    os.environ["MEDIA_PATH"] = str(tmp_path)
    get_settings.cache_clear()
    import app.infrastructure.storage.provider as _storage_provider

    _storage_provider._storage = None
    try:
        async with get_engine().begin() as conn:
            await conn.run_sync(Base.metadata.drop_all)
            await conn.run_sync(Base.metadata.create_all)
        from app.application.user_service import UserService
        from app.core.security import Argon2PasswordHasher

        async with session_scope() as session:
            await UserService(
                users=SqlAlchemyUserRepository(session),
                refresh_tokens=SqlAlchemyRefreshTokenRepository(session),
                hasher=Argon2PasswordHasher(),
            ).create_user(username="testuser", password="testpass1", is_superuser=False)
        from app.main import create_app

        app = create_app()
        async with LifespanManager(app):
            transport = ASGITransport(app=app)
            async with AsyncClient(transport=transport, base_url="http://test") as client:
                yield client
        async with get_engine().begin() as conn:
            await conn.run_sync(Base.metadata.drop_all)
        await dispose_engine()
    finally:
        _storage_provider._storage = None
        os.environ.pop("MEDIA_PATH", None)
        get_settings.cache_clear()


async def _login(api: AsyncClient) -> str:
    resp = await api.post(
        "/api/v1/auth/login", json={"username": "testuser", "password": "testpass1"}
    )
    assert resp.status_code == 200
    return str(resp.json()["access_token"])


async def _upload(api: AsyncClient, token: str, *, name: str = "song.mp3") -> str:
    audio = b"fake audio bytes for metadata test" * 10
    resp = await api.post(
        "/api/v1/upload",
        files={"file": (name, audio, "audio/mpeg")},
        headers={"Authorization": f"Bearer {token}"},
    )
    assert resp.status_code == 200, resp.text
    return str(resp.json()["track_id"])


async def test_track_out_includes_genre_year_track_number(api: AsyncClient) -> None:
    token = await _login(api)
    track_id = await _upload(api, token)
    resp = await api.get(f"/api/v1/tracks/{track_id}", headers={"Authorization": f"Bearer {token}"})
    assert resp.status_code == 200, resp.text
    body = resp.json()
    assert "genre" in body
    assert "year" in body
    assert "track_number" in body


async def test_metadata_matches_degrades_without_acoustid(api: AsyncClient) -> None:
    # No ACOUSTID_API_KEY / fpcalc configured in the test environment — the
    # endpoint must degrade to an empty list, not error.
    token = await _login(api)
    track_id = await _upload(api, token)
    resp = await api.get(
        f"/api/v1/tracks/{track_id}/metadata/matches",
        headers={"Authorization": f"Bearer {token}"},
    )
    assert resp.status_code == 200, resp.text
    assert resp.json() == {"items": []}


async def test_metadata_matches_not_found(api: AsyncClient) -> None:
    token = await _login(api)
    resp = await api.get(
        "/api/v1/tracks/00000000-0000-0000-0000-000000000000/metadata/matches",
        headers={"Authorization": f"Bearer {token}"},
    )
    assert resp.status_code == 404


async def test_apply_metadata_updates_fields_and_sets_manual(api: AsyncClient) -> None:
    token = await _login(api)
    track_id = await _upload(api, token)
    headers = {"Authorization": f"Bearer {token}"}
    resp = await api.put(
        f"/api/v1/tracks/{track_id}/metadata",
        json={
            "title": "New Title",
            "artist_name": "New Artist",
            "album_title": "New Album",
            "year": 1999,
            "genre": "Rock",
            "track_number": 3,
        },
        headers=headers,
    )
    assert resp.status_code == 200, resp.text
    body = resp.json()
    assert body["title"] == "New Title"
    assert body["artist_name"] == "New Artist"
    assert body["album_title"] == "New Album"
    assert body["year"] == 1999
    assert body["genre"] == "Rock"
    assert body["track_number"] == 3
    assert body["metadata_status"] == "manual"

    # Re-fetch to confirm persistence.
    again = await api.get(f"/api/v1/tracks/{track_id}", headers=headers)
    assert again.status_code == 200
    assert again.json()["title"] == "New Title"
    assert again.json()["metadata_status"] == "manual"


async def test_apply_metadata_partial_update(api: AsyncClient) -> None:
    token = await _login(api)
    track_id = await _upload(api, token)
    headers = {"Authorization": f"Bearer {token}"}
    resp = await api.put(
        f"/api/v1/tracks/{track_id}/metadata",
        json={"genre": "Jazz"},
        headers=headers,
    )
    assert resp.status_code == 200, resp.text
    body = resp.json()
    assert body["genre"] == "Jazz"
    assert body["metadata_status"] == "manual"


async def test_apply_metadata_not_found(api: AsyncClient) -> None:
    token = await _login(api)
    resp = await api.put(
        "/api/v1/tracks/00000000-0000-0000-0000-000000000000/metadata",
        json={"title": "x"},
        headers={"Authorization": f"Bearer {token}"},
    )
    assert resp.status_code == 404
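
The two `PUT /tracks/{id}/metadata` tests above suggest partial-update semantics: only the fields present in the request body change, and any manual edit sets `metadata_status` to `"manual"`. A rough sketch of that rule on plain dicts follows. `apply_manual_edit` and `EDITABLE_FIELDS` are hypothetical names, and rejecting unknown keys is an assumption made for illustration; the actual endpoint's validation is not shown in this diff.

```python
from typing import Any

# Field names taken from the request body used in the tests above.
EDITABLE_FIELDS = {"title", "artist_name", "album_title", "year", "genre", "track_number"}


def apply_manual_edit(track: dict, patch: dict) -> dict:
    """Return a copy of ``track`` with only the patched fields changed."""
    unknown = set(patch) - EDITABLE_FIELDS
    if unknown:
        # Assumption: unknown keys are an error rather than silently ignored.
        raise ValueError(f"unknown metadata fields: {sorted(unknown)}")
    updated: dict = {**track, **patch}
    # A manual edit always marks the track as manually curated.
    updated["metadata_status"] = "manual"
    return updated
```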
+238 -5
@@ -15,6 +15,7 @@ import pytest
 from app.application.metadata_service import MetadataEnrichmentService
 from app.domain.entities import Artist, Track
 from app.domain.entities.album import Album
+from app.domain.entities.cover import CoverArt
 from app.domain.entities.metadata import AudioTags, Fingerprint, RecordingMatch

 pytestmark = pytest.mark.asyncio
@@ -37,7 +38,11 @@ def _track(*, metadata_status: str = "pending", title: str = "raw-stem") -> Trac
         duration_seconds=None,
         genre=None,
         year=None,
+        track_number=None,
         metadata_status=metadata_status,
+        metadata_error=None,
+        enriched_at=None,
+        availability="local",
         created_at=now,
         updated_at=now,
     )
@@ -63,12 +68,21 @@ class FakeArtistRepo:
     async def get_or_create(self, name: str) -> Artist:
         self.created.append(name)
         now = dt.datetime.now(dt.UTC)
-        return Artist(id=uuid.uuid4(), name=name, created_at=now, updated_at=now)
+        return Artist(
+            id=uuid.uuid4(),
+            name=name,
+            source=None,
+            source_id=None,
+            created_at=now,
+            updated_at=now,
+        )


 class FakeAlbumRepo:
-    def __init__(self) -> None:
+    def __init__(self, *, cover_path: str | None = None) -> None:
         self.created: list[tuple[str, uuid.UUID]] = []
+        self.covers: dict[uuid.UUID, str] = {}
+        self._existing_cover = cover_path

     async def get_or_create(
         self, *, title: str, artist_id: uuid.UUID, year: int | None, musicbrainz_id: str | None
@@ -80,18 +94,54 @@ class FakeAlbumRepo:
             title=title,
             artist_id=artist_id,
             year=year,
-            cover_path=None,
+            cover_path=self._existing_cover,
             musicbrainz_id=musicbrainz_id,
+            source=None,
+            source_id=None,
             created_at=now,
             updated_at=now,
         )

+    async def set_cover_path(self, album_id: uuid.UUID, cover_path: str) -> None:
+        self.covers[album_id] = cover_path


 class FakeStorage:
+    def __init__(self) -> None:
+        self.saved: list[str] = []

     @asynccontextmanager
     async def as_local_path(self, key: str) -> AsyncIterator[Path]:
         yield Path("/tmp") / key

+    async def save_file(self, key: str, src_path: Path) -> int:
+        self.saved.append(key)
+        return 1


+class FakeCoverExtractor:
+    def __init__(self, cover: CoverArt | None) -> None:
+        self._cover = cover
+        self.calls = 0

+    async def extract(self, path: Path) -> CoverArt | None:
+        self.calls += 1
+        return self._cover


+class FakeCoverProvider:
+    def __init__(self, cover: CoverArt | None, *, available: bool = True) -> None:
+        self._cover = cover
+        self._available = available
+        self.calls = 0

+    def is_available(self) -> bool:
+        return self._available

+    async def fetch_release_group(self, release_group_mbid: str) -> CoverArt | None:
+        self.calls += 1
+        return self._cover


 class FakeTagReader:
     def __init__(self, tags: AudioTags | None) -> None:
@@ -126,6 +176,10 @@ class FakeAcoustId:
         self.calls += 1
         return self._match

+    async def lookup_all(self, fingerprint: Fingerprint) -> list[RecordingMatch]:
+        self.calls += 1
+        return [self._match] if self._match is not None else []


 def _service(
     *,
@@ -214,6 +268,33 @@ async def test_nothing_found_marks_failed() -> None:
     assert applied is not None
     assert applied["artist_id"] == track.artist_id  # fallback kept
     assert applied["metadata_status"] == "failed"
+    # A failed run records a human-readable reason; here both id steps were
+    # available, so it's the generic "no match" message.
+    assert applied["metadata_error"] == "No metadata match found in tags or AcoustID."


+async def test_failed_reason_names_unavailable_fingerprinter() -> None:
+    track = _track()
+    service, tracks, _, _, _ = _service(track=track, tags=None, fp=None, fp_available=False)

+    result = await service.enrich(track.id)

+    assert result.status == "failed"
+    applied = tracks.applied
+    assert applied is not None
+    assert "fingerprinting" in str(applied["metadata_error"])


+async def test_successful_enrich_clears_error() -> None:
+    track = _track()
+    service, tracks, _, _, _ = _service(track=track, tags=AudioTags(artist="Pink Floyd"))

+    result = await service.enrich(track.id)

+    assert result.status == "enriched"
+    applied = tracks.applied
+    assert applied is not None
+    assert applied["metadata_error"] is None


 async def test_acoustid_path_fills_when_tags_absent() -> None:
@@ -244,13 +325,14 @@ async def test_acoustid_path_fills_when_tags_absent() -> None:
     assert "Daft Punk" in artists.created


-async def test_tags_win_over_acoustid_for_overlapping_fields() -> None:
+async def test_tags_win_over_low_confidence_acoustid() -> None:
     track = _track()
     fp = Fingerprint(fingerprint="AQAA", duration_seconds=200)
     tags = AudioTags(title="Tagged Title", artist="Tagged Artist")
+    # Below the 0.85 trust threshold → keep tag-first.
     match = RecordingMatch(
         acoustid="aid",
-        score=0.9,
+        score=0.5,
         recording_mbid="mbid",
         title="AcoustID Title",
         artist="AcoustID Artist",
@@ -269,6 +351,36 @@ async def test_tags_win_over_acoustid_for_overlapping_fields() -> None:
     assert applied["musicbrainz_id"] == "mbid"


+async def test_high_confidence_acoustid_wins_over_junk_tags() -> None:
+    track = _track()
+    fp = Fingerprint(fingerprint="AQAA", duration_seconds=200)
+    # The real-world bug: junk embedded tags on a downloaded file vs a near-
+    # certain acoustic identification. The match must win for the identity.
+    tags = AudioTags(title="Sound_13958", artist="Music Track", album="Музыка")
+    match = RecordingMatch(
+        acoustid="aid",
+        score=0.98,
+        recording_mbid="mbid",
+        release_group_mbid="rg",
+        title="Scarlet Fire",
+        artist="Otis McDonald",
+        album="Scarlet Fire",
+    )
+    service, tracks, artists, albums, _acoustid = _service(
+        track=track, tags=tags, fp=fp, match=match
+    )

+    await service.enrich(track.id)

+    applied = tracks.applied
+    assert applied is not None
+    assert applied["title"] == "Scarlet Fire"
+    assert "Otis McDonald" in artists.created
+    assert "Music Track" not in artists.created
+    assert albums.created and albums.created[0][0] == "Scarlet Fire"
+    assert applied["metadata_status"] == "enriched"


 async def test_fingerprint_skipped_when_acoustid_unavailable() -> None:
     track = _track()
     fp = Fingerprint(fingerprint="AQAA", duration_seconds=200)
@@ -281,3 +393,124 @@ async def test_fingerprint_skipped_when_acoustid_unavailable() -> None:
     # tags still enrich, but no AcoustID call is attempted
     assert acoustid.calls == 0
     assert result.status == "enriched"


+# -- cover-art resolution -----------------------------------------------------

+_PNG = CoverArt(data=b"\x89PNG\r\n", content_type="image/png")
+_JPG = CoverArt(data=b"\xff\xd8\xff", content_type="image/jpeg")


+def _cover_service(
+    *,
+    track: Track,
+    tags: AudioTags | None = None,
+    match: RecordingMatch | None = None,
+    fp: Fingerprint | None = None,
+    extractor: FakeCoverExtractor | None = None,
+    provider: FakeCoverProvider | None = None,
+    existing_cover: str | None = None,
+) -> tuple[MetadataEnrichmentService, FakeAlbumRepo, FakeStorage]:
+    albums = FakeAlbumRepo(cover_path=existing_cover)
+    storage = FakeStorage()
+    service = MetadataEnrichmentService(
+        tracks=FakeTrackRepo(track),  # type: ignore[arg-type]
+        artists=FakeArtistRepo(),  # type: ignore[arg-type]
+        albums=albums,  # type: ignore[arg-type]
+        storage=storage,  # type: ignore[arg-type]
+        tag_reader=FakeTagReader(tags),  # type: ignore[arg-type]
+        fingerprinter=FakeFingerprinter(fp),  # type: ignore[arg-type]
+        acoustid=FakeAcoustId(match),  # type: ignore[arg-type]
+        cover_extractor=extractor,  # type: ignore[arg-type]
+        cover_provider=provider,  # type: ignore[arg-type]
+    )
+    return service, albums, storage


+async def test_cover_extracted_from_embedded_art() -> None:
+    track = _track()
+    extractor = FakeCoverExtractor(_PNG)
+    provider = FakeCoverProvider(_JPG)
+    service, albums, storage = _cover_service(
+        track=track,
+        tags=AudioTags(album="The Wall", artist="PF"),
+        extractor=extractor,
+        provider=provider,
+    )

+    await service.enrich(track.id)

+    assert extractor.calls == 1
+    assert provider.calls == 0  # embedded art wins → no network fetch
+    assert len(albums.covers) == 1
+    key = next(iter(albums.covers.values()))
+    assert key.startswith("covers/") and key.endswith(".png")
+    assert storage.saved == [key]


+async def test_cover_falls_back_to_archive() -> None:
+    track = _track()
+    extractor = FakeCoverExtractor(None)  # no embedded art
+    provider = FakeCoverProvider(_JPG)
+    match = RecordingMatch(acoustid="ac", score=1.0, release_group_mbid="rg-123", album="The Wall")
+    fp = Fingerprint(fingerprint="AQAA", duration_seconds=200)
+    service, albums, storage = _cover_service(
+        track=track,
+        tags=AudioTags(album="The Wall", artist="PF"),
+        match=match,
+        fp=fp,
+        extractor=extractor,
+        provider=provider,
+    )

+    await service.enrich(track.id)

+    assert extractor.calls == 1
+    assert provider.calls == 1
+    key = next(iter(albums.covers.values()))
+    assert key.endswith(".jpg")
+    assert storage.saved == [key]


+async def test_cover_not_fetched_without_release_group() -> None:
+    track = _track()
+    provider = FakeCoverProvider(_JPG)
+    service, albums, _ = _cover_service(
+        track=track,
+        tags=AudioTags(album="The Wall", artist="PF"),
+        extractor=FakeCoverExtractor(None),
+        provider=provider,
+    )

+    await service.enrich(track.id)

+    assert provider.calls == 0  # no release-group mbid → nothing to look up
+    assert albums.covers == {}


+async def test_existing_cover_is_not_overwritten() -> None:
+    track = _track()
+    extractor = FakeCoverExtractor(_PNG)
+    service, albums, storage = _cover_service(
+        track=track,
+        tags=AudioTags(album="The Wall", artist="PF"),
+        extractor=extractor,
+        existing_cover="covers/old.jpg",
+    )

+    await service.enrich(track.id)

+    assert extractor.calls == 0  # album already has a cover → skip entirely
+    assert albums.covers == {}
+    assert storage.saved == []


+async def test_cover_skipped_when_no_album() -> None:
+    track = _track()
+    extractor = FakeCoverExtractor(_PNG)
+    # no album tag and no match → no album resolved → no cover work
+    service, _albums, storage = _cover_service(track=track, extractor=extractor)

+    await service.enrich(track.id)

+    assert extractor.calls == 0
+    assert storage.saved == []
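
Read together, the cover-art tests above describe a resolution order: do nothing without an album or when the album already has a cover, prefer embedded art, and only fall back to a remote lookup when a release-group MBID is known. The sketch below captures that order as a pure function. `resolve_cover`, its callable parameters, and the `covers/{album_id}.{ext}` key layout are assumptions for illustration; the tests only confirm the `covers/` prefix and the extension, and the real `MetadataEnrichmentService` is async and also checks provider availability.

```python
from typing import Callable, Optional, Tuple

# (raw image bytes, content type); a simplified stand-in for the CoverArt entity.
Cover = Tuple[bytes, str]

_EXTENSIONS = {"image/png": "png", "image/jpeg": "jpg"}


def resolve_cover(
    *,
    album_id: Optional[str],
    existing_cover: Optional[str],
    extract: Callable[[], Optional[Cover]],
    release_group_mbid: Optional[str],
    fetch_remote: Callable[[str], Optional[Cover]],
) -> Optional[Tuple[str, Cover]]:
    """Return (storage key, cover) to save, or None when there is nothing to do."""
    # No album resolved, or the album already has art: skip all cover work.
    if album_id is None or existing_cover is not None:
        return None
    # Embedded art wins, so the network is only touched as a fallback.
    cover = extract()
    if cover is None and release_group_mbid is not None:
        cover = fetch_remote(release_group_mbid)
    if cover is None:
        return None
    ext = _EXTENSIONS.get(cover[1], "bin")
    return f"covers/{album_id}.{ext}", cover
```

Passing the extractor and provider as callables keeps the "not even called" guarantees from the tests visible: the early return happens before either one runs.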
+217
@@ -0,0 +1,217 @@
"""Enrichment tests against a real audio file (``tests/fixtures/``).
The fixture "Scarlet Fire" by Otis McDonald carries *junk* embedded tags
(``Sound_13958`` / ``Music Track`` / ``Музыка``) yet is identified by AcoustID
with ~0.99 confidence. That makes it the real-world reproduction of the
"uploaded a track, got the wrong name/artist" bug: tag reading must be exact,
and a high-confidence AcoustID match must override the junk tags.
Two layers:
- The tag-reader test is offline and deterministic — it always runs.
- The AcoustID/identity tests need the ``fpcalc`` binary, an AcoustID API key,
and network. They *skip* (never fail) when those aren't present, honouring the
project rule that the suite never hard-requires network. They do run inside the
api/worker container (``make test-api``), which ships fpcalc + the key.
"""
import uuid
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from dataclasses import dataclass, field
from datetime import UTC, datetime
from pathlib import Path
import pytest
from app.application.metadata_service import MetadataEnrichmentService
from app.core.config import get_settings
from app.domain.entities.album import Album
from app.domain.entities.track import Artist, Track
from app.infrastructure.metadata.acoustid import AcoustIdHttpClient
from app.infrastructure.metadata.fingerprint import FpcalcFingerprinter
from app.infrastructure.metadata.tags import MutagenTagReader
pytestmark = pytest.mark.asyncio
FIXTURE = Path(__file__).parent / "fixtures" / "scarlet_fire_otis_mcdonald.mp3"
_settings = get_settings()
_fpcalc = FpcalcFingerprinter(_settings.fpcalc_path)
# Gate for the network/identity tests — present in the container, absent in CI.
requires_acoustid = pytest.mark.skipif(
not (_fpcalc.is_available() and _settings.acoustid_api_key is not None),
reason="needs the fpcalc binary + ACOUSTID_API_KEY (+ network)",
)
def _acoustid_client() -> AcoustIdHttpClient:
key = _settings.acoustid_api_key
return AcoustIdHttpClient(
api_key=key.get_secret_value() if key else None,
user_agent=_settings.musicbrainz_user_agent,
api_url=_settings.acoustid_api_url,
)
# --- offline: tag reading on a real file -----------------------------------
async def test_real_file_embedded_tags_are_read() -> None:
"""The reader extracts the file's actual (junk) embedded tags verbatim —
proving real-file tag parsing works end to end, no network involved."""
assert FIXTURE.exists(), "fixture mp3 missing"
tags = await MutagenTagReader().read(FIXTURE)
assert tags is not None
assert tags.title == "Sound_13958"
assert tags.artist == "Music Track"
assert tags.album == "Музыка"
assert tags.genre == "Hip Hop & Rap"
assert tags.year == 2018
assert tags.duration_seconds == 143
assert tags.bitrate == 128
# --- networked: AcoustID identifies the real recording ---------------------
@requires_acoustid
async def test_real_file_identified_by_acoustid() -> None:
"""fpcalc → AcoustID identifies the real audio as Scarlet Fire / Otis
McDonald with high confidence (despite the junk tags)."""
fingerprint = await _fpcalc.calculate(FIXTURE)
if fingerprint is None:
pytest.skip("fpcalc produced no fingerprint")
match = await _acoustid_client().lookup(fingerprint)
if match is None:
pytest.skip("AcoustID returned no match (network/rate limit?)")
assert match.score >= _settings.acoustid_trust_score
assert match.title == "Scarlet Fire"
assert match.artist == "Otis McDonald"
assert match.recording_mbid is not None
@requires_acoustid
async def test_real_file_enrichment_overrides_junk_tags() -> None:
"""Full pipeline on the real file with the real tag-reader, fingerprinter
and AcoustID client: the high-confidence match wins over the junk embedded
tags, so the track is stored as Scarlet Fire / Otis McDonald."""
track = _pending_track()
tracks = _FakeTrackRepo(track)
artists = _FakeArtistRepo()
albums = _FakeAlbumRepo()
service = MetadataEnrichmentService(
tracks=tracks, # type: ignore[arg-type]
artists=artists, # type: ignore[arg-type]
albums=albums, # type: ignore[arg-type]
storage=_FixtureStorage(), # type: ignore[arg-type]
tag_reader=MutagenTagReader(),
fingerprinter=_fpcalc,
acoustid=_acoustid_client(),
acoustid_trust_score=_settings.acoustid_trust_score,
)
result = await service.enrich(track.id)
if result.status == "failed":
pytest.skip("AcoustID unavailable at run time (network/rate limit?)")
assert result.status == "enriched"
applied = tracks.applied
assert applied is not None
assert applied["title"] == "Scarlet Fire"
assert "Otis McDonald" in artists.created
assert "Music Track" not in artists.created
assert albums.created and albums.created[0][0] == "Scarlet Fire"
# --- minimal in-memory adapters --------------------------------------------
def _pending_track() -> Track:
now = datetime.now(UTC)
return Track(
id=uuid.uuid4(),
title="scarlet_fire_otis_mcdonald", # the upload-time filename stem
artist_id=uuid.uuid4(),
album_id=None,
storage_uri="tracks/sf/scarlet.mp3",
file_format="mp3",
file_size=FIXTURE.stat().st_size,
source="upload",
source_id="sha-real",
duration_seconds=None,
genre=None,
year=None,
track_number=None,
metadata_status="pending",
metadata_error=None,
enriched_at=None,
availability="local",
created_at=now,
updated_at=now,
)
class _FixtureStorage:
@asynccontextmanager
async def as_local_path(self, _key: str) -> AsyncIterator[Path]:
yield FIXTURE
class _FakeTrackRepo:
def __init__(self, track: Track) -> None:
self._track = track
self.applied: dict[str, object] | None = None
async def get_by_id(self, _track_id: uuid.UUID) -> Track:
return self._track
async def apply_enrichment(self, _track_id: uuid.UUID, **kw: object) -> Track:
self.applied = kw
return self._track
@dataclass
class _FakeArtistRepo:
created: list[str] = field(default_factory=list)
async def get_or_create(self, name: str) -> Artist:
self.created.append(name)
now = datetime.now(UTC)
return Artist(
id=uuid.uuid4(),
name=name,
source=None,
source_id=None,
created_at=now,
updated_at=now,
)
@dataclass
class _FakeAlbumRepo:
created: list[tuple[str, uuid.UUID]] = field(default_factory=list)
async def get_or_create(
self, *, title: str, artist_id: uuid.UUID, year: int | None, musicbrainz_id: str | None
) -> Album:
self.created.append((title, artist_id))
now = datetime.now(UTC)
return Album(
id=uuid.uuid4(),
title=title,
artist_id=artist_id,
year=year,
cover_path=None,
musicbrainz_id=musicbrainz_id,
source=None,
source_id=None,
created_at=now,
updated_at=now,
)
async def set_cover_path(self, _album_id: uuid.UUID, _cover_path: str) -> None:
return None
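Review note: the docstring of test_real_file_enrichment_overrides_junk_tags says a high-confidence AcoustID match wins over junk embedded tags. A minimal sketch of that decision follows. The names Tags, Match and resolve_metadata, the 0.9 threshold, and the `>=` comparison are all assumptions made for illustration; the real MetadataEnrichmentService may decide differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tags:
    """Hypothetical stand-in for the embedded tags read from the file."""
    title: str | None
    artist: str | None


@dataclass(frozen=True)
class Match:
    """Hypothetical stand-in for a fingerprint lookup result."""
    title: str
    artist: str
    score: float


def resolve_metadata(tags: Tags, match: Match | None, trust_score: float) -> Tags:
    """Prefer the fingerprint match only when its score clears the threshold."""
    if match is not None and match.score >= trust_score:
        return Tags(title=match.title, artist=match.artist)
    # No match, or a match we do not trust: keep whatever the file carried.
    return tags


junk = Tags(title="scarlet_fire_otis_mcdonald", artist="Music Track")
strong = Match(title="Scarlet Fire", artist="Otis McDonald", score=0.97)
weak = Match(title="Something Else", artist="Someone Else", score=0.40)

resolved = resolve_metadata(junk, strong, trust_score=0.9)
kept = resolve_metadata(junk, weak, trust_score=0.9)
```

With these inputs the strong match replaces the junk tags, while the weak match leaves them untouched.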
+255
@@ -0,0 +1,255 @@
"""Unit tests for RemoteLibraryService — DB-free, in-memory fakes (plan: Model C
remote browse + lazy materialization)."""
import datetime as dt
import uuid
import pytest
from app.application.remote_library_service import RemoteLibraryService
from app.domain.entities import Artist, DownloadJob, Track
from app.domain.errors import NotFoundError, ValidationError
pytestmark = pytest.mark.asyncio
class FakeArtistRepo:
def __init__(self) -> None:
self._by_name: dict[str, Artist] = {}
async def get_or_create(self, name: str) -> Artist:
if name not in self._by_name:
now = dt.datetime.now(dt.UTC)
self._by_name[name] = Artist(
id=uuid.uuid4(),
name=name,
source=None,
source_id=None,
created_at=now,
updated_at=now,
)
return self._by_name[name]
class FakeTrackRepo:
def __init__(self) -> None:
self.by_id: dict[uuid.UUID, Track] = {}
self.by_source: dict[tuple[str, str], Track] = {}
async def get_by_id(self, track_id: uuid.UUID) -> Track | None:
return self.by_id.get(track_id)
async def get_by_source(self, source: str, source_id: str) -> Track | None:
return self.by_source.get((source, source_id))
async def add(self, **kw: object) -> Track:
now = dt.datetime.now(dt.UTC)
track = Track(
id=kw["id"], # type: ignore[arg-type]
title=str(kw["title"]),
artist_id=kw["artist_id"], # type: ignore[arg-type]
album_id=None,
storage_uri=kw["storage_uri"], # type: ignore[arg-type]
file_format=kw["file_format"], # type: ignore[arg-type]
file_size=kw["file_size"], # type: ignore[arg-type]
source=str(kw["source"]),
source_id=str(kw["source_id"]),
duration_seconds=None,
genre=None,
year=None,
track_number=None,
metadata_status=str(kw["metadata_status"]),
metadata_error=None,
enriched_at=None,
availability=str(kw["availability"]),
created_at=now,
updated_at=now,
)
self.by_id[track.id] = track
self.by_source[(track.source, track.source_id)] = track
return track
def _local_track(source: str = "youtube", source_id: str = "local-1") -> Track:
now = dt.datetime.now(dt.UTC)
return Track(
id=uuid.uuid4(),
title="Already Here",
artist_id=uuid.uuid4(),
album_id=None,
storage_uri="tracks/aa/aa.m4a",
file_format="m4a",
file_size=123,
source=source,
source_id=source_id,
duration_seconds=None,
genre=None,
year=None,
track_number=None,
metadata_status="pending",
metadata_error=None,
enriched_at=None,
availability="local",
created_at=now,
updated_at=now,
)
class FakeJobRepo:
def __init__(self) -> None:
self.jobs: dict[uuid.UUID, DownloadJob] = {}
self.active: dict[tuple[str, str], DownloadJob] = {}
async def add(self, **kw: object) -> DownloadJob:
now = dt.datetime.now(dt.UTC)
job = DownloadJob(
id=uuid.uuid4(),
source=str(kw["source"]),
source_id=kw.get("source_id"), # type: ignore[arg-type]
query=kw.get("query"), # type: ignore[arg-type]
requested_by=kw.get("requested_by"), # type: ignore[arg-type]
status="queued",
progress=0.0,
error_message=None,
retry_count=0,
track_id=None,
created_at=now,
updated_at=now,
)
self.jobs[job.id] = job
return job
async def get_by_id(self, job_id: uuid.UUID) -> DownloadJob | None:
return self.jobs.get(job_id)
async def get_active_for_source(self, source: str, source_id: str) -> DownloadJob | None:
return self.active.get((source, source_id))
async def set_status(self, job_id: uuid.UUID, **kw: object) -> None:
job = self.jobs[job_id]
track_id = kw.get("track_id")
if track_id is not None:
self.jobs[job_id] = DownloadJob(
id=job.id,
source=job.source,
source_id=job.source_id,
query=job.query,
requested_by=job.requested_by,
status=str(kw.get("status", job.status)),
progress=job.progress,
error_message=job.error_message,
retry_count=job.retry_count,
track_id=track_id, # type: ignore[arg-type]
created_at=job.created_at,
updated_at=job.updated_at,
)
def _service(
*,
tracks: FakeTrackRepo,
artists: FakeArtistRepo,
jobs: FakeJobRepo,
enqueued: list[uuid.UUID],
) -> RemoteLibraryService:
async def enqueue_materialize(job_id: uuid.UUID) -> None:
enqueued.append(job_id)
return RemoteLibraryService(
tracks=tracks, # type: ignore[arg-type]
artists=artists, # type: ignore[arg-type]
jobs=jobs, # type: ignore[arg-type]
enqueue_materialize=enqueue_materialize,
)
async def test_save_remote_creates_placeholder() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
track = await service.save_remote(
source="youtube", source_id="abc", title="Bohemian Rhapsody", artist="Queen", added_by=None
)
assert track.availability == "remote"
assert track.storage_uri is None
assert track.file_format is None
assert track.source == "youtube"
assert track.source_id == "abc"
async def test_save_remote_is_idempotent() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
first = await service.save_remote(
source="youtube", source_id="abc", title="A", artist="Queen", added_by=None
)
second = await service.save_remote(
source="youtube", source_id="abc", title="B", artist="Other", added_by=None
)
assert first.id == second.id
assert second.title == "A" # untouched by the second call
async def test_materialize_already_local_is_noop() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
track = _local_track()
tracks.by_id[track.id] = track
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
outcome = await service.request_materialize(track.id, requested_by=None)
assert outcome.job is None
assert outcome.track.id == track.id
assert enq == []
async def test_materialize_remote_creates_and_enqueues_job() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
track = await service.save_remote(
source="youtube", source_id="abc", title="A", artist="Queen", added_by=None
)
outcome = await service.request_materialize(track.id, requested_by=None)
assert outcome.job is not None
assert outcome.job.source == "youtube"
assert outcome.job.source_id == "abc"
assert outcome.job.track_id == track.id
assert enq == [outcome.job.id]
async def test_materialize_reuses_active_job() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
track = await service.save_remote(
source="youtube", source_id="abc", title="A", artist="Queen", added_by=None
)
existing = await jobs.add(source="youtube", source_id="abc", query=None, requested_by=None)
jobs.active[("youtube", "abc")] = existing
outcome = await service.request_materialize(track.id, requested_by=None)
assert outcome.job is not None
assert outcome.job.id == existing.id
assert enq == [] # not re-enqueued
async def test_materialize_missing_track_raises() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
with pytest.raises(NotFoundError):
await service.request_materialize(uuid.uuid4(), requested_by=None)
async def test_save_remote_requires_source_id() -> None:
tracks, artists, jobs, enq = FakeTrackRepo(), FakeArtistRepo(), FakeJobRepo(), []
service = _service(tracks=tracks, artists=artists, jobs=jobs, enqueued=enq)
with pytest.raises(ValidationError):
await service.save_remote(
source="youtube", source_id=" ", title="A", artist=None, added_by=None
)
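Review note: the tests above pin down three behaviours: saving the same (source, source_id) twice returns the first placeholder untouched, materializing a track that is already local is a no-op, and an in-flight job is reused instead of enqueued again. The sketch below reproduces that contract with plain dicts. PlaceholderTrack and InMemoryLibrary are hypothetical names, ValueError stands in for the project's ValidationError, and nothing here is taken from the actual RemoteLibraryService.

```python
from __future__ import annotations

import asyncio
import uuid
from dataclasses import dataclass, field


@dataclass
class PlaceholderTrack:
    """Hypothetical, trimmed-down track: no audio until materialized."""
    id: uuid.UUID
    source: str
    source_id: str
    title: str
    availability: str = "remote"
    storage_uri: str | None = None


@dataclass
class InMemoryLibrary:
    by_source: dict[tuple[str, str], PlaceholderTrack] = field(default_factory=dict)
    active_jobs: dict[tuple[str, str], uuid.UUID] = field(default_factory=dict)
    enqueued: list[uuid.UUID] = field(default_factory=list)

    async def save_remote(self, *, source: str, source_id: str, title: str) -> PlaceholderTrack:
        if not source_id.strip():
            raise ValueError("source_id must not be blank")
        key = (source, source_id)
        existing = self.by_source.get(key)
        if existing is not None:
            return existing  # idempotent: the second call changes nothing
        track = PlaceholderTrack(id=uuid.uuid4(), source=source, source_id=source_id, title=title)
        self.by_source[key] = track
        return track

    async def request_materialize(self, track: PlaceholderTrack) -> uuid.UUID | None:
        if track.availability == "local":
            return None  # audio already present, nothing to fetch
        key = (track.source, track.source_id)
        job_id = self.active_jobs.get(key)
        if job_id is None:
            job_id = uuid.uuid4()
            self.active_jobs[key] = job_id
            self.enqueued.append(job_id)  # enqueue only for a brand-new job
        return job_id


async def demo() -> tuple[bool, str, bool, int]:
    lib = InMemoryLibrary()
    first = await lib.save_remote(source="youtube", source_id="abc", title="A")
    second = await lib.save_remote(source="youtube", source_id="abc", title="B")
    job_a = await lib.request_materialize(first)
    job_b = await lib.request_materialize(first)
    return first is second, second.title, job_a == job_b, len(lib.enqueued)


result = asyncio.run(demo())
```

Keying both the placeholder and the active job on (source, source_id) is what makes repeated calls cheap in this sketch; the real repositories expose the same lookups as get_by_source and get_active_for_source.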
+138
@@ -0,0 +1,138 @@
"""Integration tests for the storage statistics endpoint (§A6).
Requires a reachable Postgres; skips otherwise.
"""
import asyncio
import os
from collections.abc import AsyncIterator
from pathlib import Path
import pytest
from app.core.config import get_settings
from app.infrastructure.db import Base, dispose_engine, get_engine, session_scope
from app.infrastructure.db.repositories import (
SqlAlchemyRefreshTokenRepository,
SqlAlchemyUserRepository,
)
from asgi_lifespan import LifespanManager
from httpx import ASGITransport, AsyncClient
pytestmark = pytest.mark.asyncio
_db_reachable_cache: bool | None = None
async def _db_reachable() -> bool:
global _db_reachable_cache
if _db_reachable_cache is not None:
return _db_reachable_cache
from sqlalchemy import text
try:
async with asyncio.timeout(3):
async with get_engine().connect() as conn:
await conn.execute(text("SELECT 1"))
_db_reachable_cache = True
except Exception:
_db_reachable_cache = False
return _db_reachable_cache
@pytest.fixture
async def api(tmp_path: Path) -> AsyncIterator[AsyncClient]:
if not await _db_reachable():
pytest.skip("Postgres not reachable — integration test skipped.")
os.environ["MEDIA_PATH"] = str(tmp_path)
get_settings.cache_clear()
import app.infrastructure.storage.provider as _storage_provider
_storage_provider._storage = None
try:
async with get_engine().begin() as conn:
await conn.run_sync(Base.metadata.drop_all)
await conn.run_sync(Base.metadata.create_all)
from app.application.user_service import UserService
from app.core.security import Argon2PasswordHasher
async with session_scope() as session:
await UserService(
users=SqlAlchemyUserRepository(session),
refresh_tokens=SqlAlchemyRefreshTokenRepository(session),
hasher=Argon2PasswordHasher(),
).create_user(username="testuser", password="testpass1", is_superuser=False)
from app.main import create_app
app = create_app()
async with LifespanManager(app):
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
yield client
async with get_engine().begin() as conn:
await conn.run_sync(Base.metadata.drop_all)
await dispose_engine()
finally:
_storage_provider._storage = None
os.environ.pop("MEDIA_PATH", None)
get_settings.cache_clear()
async def _login(api: AsyncClient) -> str:
resp = await api.post(
"/api/v1/auth/login", json={"username": "testuser", "password": "testpass1"}
)
assert resp.status_code == 200
return str(resp.json()["access_token"])
async def _upload(api: AsyncClient, token: str, *, name: str) -> None:
# Vary the bytes per file so dedup (by content) keeps them distinct.
audio = (f"fake audio bytes for storage stats test {name}".encode()) * 10
resp = await api.post(
"/api/v1/upload",
files={"file": (name, audio, "audio/mpeg")},
headers={"Authorization": f"Bearer {token}"},
)
assert resp.status_code == 200, resp.text
async def test_storage_stats_requires_auth(api: AsyncClient) -> None:
resp = await api.get("/api/v1/storage")
assert resp.status_code == 401
async def test_storage_stats_empty_library(api: AsyncClient) -> None:
token = await _login(api)
resp = await api.get("/api/v1/storage", headers={"Authorization": f"Bearer {token}"})
assert resp.status_code == 200, resp.text
body = resp.json()
assert body["total_tracks"] == 0
assert body["total_size"] == 0
assert body["by_format"] == []
# Local backend reports a real disk in the test environment.
assert body["disk"] is not None
assert body["disk"]["total"] > 0
async def test_storage_stats_counts_uploads(api: AsyncClient) -> None:
token = await _login(api)
await _upload(api, token, name="one.mp3")
await _upload(api, token, name="two.mp3")
resp = await api.get("/api/v1/storage", headers={"Authorization": f"Bearer {token}"})
assert resp.status_code == 200, resp.text
body = resp.json()
assert body["total_tracks"] == 2
assert body["total_size"] > 0
assert body["total_artists"] >= 1
fmt = {f["file_format"]: f for f in body["by_format"]}
assert "mp3" in fmt
assert fmt["mp3"]["track_count"] == 2
assert sum(body["by_source"].values()) == 2
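Review note: the assertions above imply a response with total_tracks, total_size, a by_format list keyed on file_format with a track_count, and a by_source mapping of source name to count. The sketch below shows one way such a payload could be aggregated from track rows. The summarize function, the per-format total_size field, and the choice to count placeholder rows (file_format of None) in total_tracks but not in by_format are assumptions; the real endpoint may do this in SQL and may treat remote placeholders differently.

```python
from collections import Counter, defaultdict
from typing import Any


def summarize(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Aggregate hypothetical track rows into a storage-stats payload."""
    counts: Counter[str] = Counter()
    sizes: defaultdict[str, int] = defaultdict(int)
    by_source: Counter[str] = Counter()
    total_size = 0
    for row in rows:
        size = row.get("file_size") or 0  # placeholders carry no bytes
        total_size += size
        by_source[row["source"]] += 1
        fmt = row.get("file_format")
        if fmt is not None:
            counts[fmt] += 1
            sizes[fmt] += size
    return {
        "total_tracks": len(rows),
        "total_size": total_size,
        "by_format": [
            {"file_format": fmt, "track_count": counts[fmt], "total_size": sizes[fmt]}
            for fmt in sorted(counts)
        ],
        "by_source": dict(by_source),
    }


stats = summarize(
    [
        {"file_format": "mp3", "file_size": 400, "source": "upload"},
        {"file_format": "mp3", "file_size": 600, "source": "upload"},
        {"file_format": None, "file_size": None, "source": "youtube"},
    ]
)
```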
+23 -3
@@ -175,9 +175,7 @@ async def test_stream_range(api: AsyncClient) -> None:
async def test_stream_not_found(api: AsyncClient) -> None:
token = await _login(api)
-resp = await api.get(
-f"/api/v1/stream/00000000-0000-0000-0000-000000000000?token={token}"
-)
+resp = await api.get(f"/api/v1/stream/00000000-0000-0000-0000-000000000000?token={token}")
assert resp.status_code == 404
@@ -192,3 +190,25 @@ async def test_upload_requires_auth(api: AsyncClient) -> None:
files={"file": ("x.mp3", b"data", "audio/mpeg")},
)
assert resp.status_code == 401
async def test_list_tracks_filters_by_source(api: AsyncClient) -> None:
# Uploaded tracks carry source="upload"; `?source=` narrows the list (this
# powers the webui's "Recently uploaded" view, which survives a refresh).
token = await _login(api)
headers = {"Authorization": f"Bearer {token}"}
await api.post(
"/api/v1/upload",
files={"file": ("uploaded.mp3", b"upload bytes" * 80, "audio/mpeg")},
headers=headers,
)
hit = await api.get("/api/v1/tracks", params={"source": "upload"}, headers=headers)
assert hit.status_code == 200, hit.text
items = hit.json()["items"]
assert len(items) == 1
assert items[0]["source"] == "upload"
miss = await api.get("/api/v1/tracks", params={"source": "youtube"}, headers=headers)
assert miss.status_code == 200
assert miss.json()["items"] == []
+135
@@ -0,0 +1,135 @@
"""Unit tests for YouTubeMusicSource + registry (no network, injected libs)."""
from pathlib import Path
from typing import Any
import pytest
from app.core.config import Settings
from app.domain.sources import KIND_FETCH
from app.infrastructure.sources.registry import build_source_registry
from app.infrastructure.sources.youtube import YouTubeMusicSource
pytestmark = pytest.mark.asyncio
def _song_row(**overrides: Any) -> dict[str, Any]:
row: dict[str, Any] = {
"videoId": "abc123",
"title": "Bohemian Rhapsody",
"artists": [{"name": "Queen", "id": "a1"}],
"album": {"name": "A Night at the Opera", "id": "al1"},
"duration_seconds": 354,
"thumbnails": [
{"url": "http://img/small.jpg", "width": 60, "height": 60},
{"url": "http://img/large.jpg", "width": 240, "height": 240},
],
}
row.update(overrides)
return row
def _settings(**overrides: object) -> Settings:
return Settings(**overrides) # type: ignore[arg-type]
async def test_search_maps_ytmusic_rows() -> None:
source = YouTubeMusicSource(search_fn=lambda q, limit: [_song_row()])
[result] = await source.search("queen", limit=10)
assert result.source == "youtube"
assert result.source_id == "abc123"
assert result.title == "Bohemian Rhapsody"
assert result.artist == "Queen"
assert result.album == "A Night at the Opera"
assert result.duration_seconds == 354
assert result.thumbnail_url == "http://img/large.jpg" # last (largest)
async def test_search_joins_multiple_artists_and_tolerates_missing_fields() -> None:
row = _song_row(
artists=[{"name": "Queen"}, {"name": "David Bowie"}],
album=None,
thumbnails=[],
duration_seconds=None,
)
source = YouTubeMusicSource(search_fn=lambda q, limit: [row])
[result] = await source.search("under pressure", limit=10)
assert result.artist == "Queen, David Bowie"
assert result.album is None
assert result.thumbnail_url is None
assert result.duration_seconds is None
async def test_search_drops_rows_without_video_id() -> None:
rows = [_song_row(), _song_row(videoId=None), _song_row(videoId="xyz")]
source = YouTubeMusicSource(search_fn=lambda q, limit: rows)
results = await source.search("q", limit=10)
assert [r.source_id for r in results] == ["abc123", "xyz"]
async def test_search_empty_query_short_circuits() -> None:
called = False
def _search(q: str, limit: int) -> list[dict[str, Any]]:
nonlocal called
called = True
return []
source = YouTubeMusicSource(search_fn=_search)
assert await source.search(" ", limit=10) == []
assert called is False
async def test_search_degrades_to_empty_on_error() -> None:
def _boom(q: str, limit: int) -> list[dict[str, Any]]:
raise RuntimeError("service down")
source = YouTubeMusicSource(search_fn=_boom)
assert await source.search("q", limit=10) == []
async def test_fetch_maps_download_result(tmp_path: Path) -> None:
audio = tmp_path / "abc123.m4a"
audio.write_bytes(b"opus-bytes" * 10)
def _download(video_id: str, tmp_dir: Path, hook: Any, cookies: Path | None) -> dict[str, Any]:
return {
"filepath": audio,
"file_format": "m4a",
"bitrate": 160,
"title": "Bohemian Rhapsody",
}
source = YouTubeMusicSource(download_fn=_download)
result = await source.fetch("abc123")
assert result.source_id == "abc123"
assert result.path == audio
assert result.file_format == "m4a"
assert result.file_size == len(b"opus-bytes" * 10)
assert result.bitrate == 160
assert result.suggested_title == "Bohemian Rhapsody"
async def test_info_and_availability_with_injected_fn() -> None:
source = YouTubeMusicSource(search_fn=lambda q, limit: [])
info = source.info()
assert info.name == "youtube"
assert info.kind == KIND_FETCH
assert info.available is True # injected fn → treated as available
async def test_registry_registers_youtube_when_enabled() -> None:
registry = build_source_registry(_settings(youtube_enabled=True))
names = {info.name for info in registry.infos()}
assert "youtube" in names
# youtube is searchable + fetchable, not indexable
assert registry.searchable("youtube").name == "youtube"
assert registry.fetchable("youtube").name == "youtube"
async def test_registry_omits_youtube_when_disabled() -> None:
registry = build_source_registry(_settings(youtube_enabled=False))
names = {info.name for info in registry.infos()}
assert "youtube" not in names
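Review note: the search tests above describe a row-mapping contract: rows without a videoId are dropped, multiple artists are joined with ", ", a missing album or empty thumbnail list becomes None, and the last thumbnail is taken as the largest. A minimal sketch of a mapper that satisfies those assertions is below. SearchHit and map_rows are hypothetical names, and the row shape is only what the _song_row fixture shows; this is not the YouTubeMusicSource implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SearchHit:
    """Hypothetical result type mirroring the fields the tests assert on."""
    source: str
    source_id: str
    title: str
    artist: str | None
    album: str | None
    duration_seconds: int | None
    thumbnail_url: str | None


def map_rows(rows: list[dict[str, Any]]) -> list[SearchHit]:
    hits: list[SearchHit] = []
    for row in rows:
        video_id = row.get("videoId")
        if not video_id:
            continue  # nothing to fetch later without an id
        names = [a["name"] for a in row.get("artists") or [] if a.get("name")]
        album = row.get("album") or {}
        thumbs = row.get("thumbnails") or []
        hits.append(
            SearchHit(
                source="youtube",
                source_id=video_id,
                title=row.get("title") or "",
                artist=", ".join(names) or None,
                album=album.get("name"),
                duration_seconds=row.get("duration_seconds"),
                # Assumes thumbnails are ordered smallest to largest.
                thumbnail_url=thumbs[-1]["url"] if thumbs else None,
            )
        )
    return hits


hits = map_rows(
    [
        {
            "videoId": "abc123",
            "title": "Under Pressure",
            "artists": [{"name": "Queen"}, {"name": "David Bowie"}],
            "album": None,
            "thumbnails": [],
            "duration_seconds": None,
        },
        {"videoId": None, "title": "No id"},
        {
            "videoId": "xyz",
            "title": "Bohemian Rhapsody",
            "artists": [{"name": "Queen"}],
            "album": {"name": "A Night at the Opera"},
            "thumbnails": [{"url": "http://img/small.jpg"}, {"url": "http://img/large.jpg"}],
            "duration_seconds": 354,
        },
    ]
)
```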
Generated
+836 -762
File diff suppressed because it is too large