feat: implement pngx-controller with Gitea CI/CD deployment

- Full FastAPI sync engine: master→replica document sync via paperless REST API
- Web UI: dashboard, replicas, logs, settings (Jinja2 + HTMX + Pico CSS)
- APScheduler background sync, SSE live log stream, Prometheus metrics
- Fernet encryption for API tokens at rest
- pngx.env credential file: written on save, pre-fills forms on load
- Dockerfile with layer-cached uv build, Python healthcheck
- docker-compose with host networking for Tailscale access
- Gitea Actions workflow: version bump, secret injection, docker compose deploy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 17:59:25 +01:00
parent 942482daab
commit b99dbf694d
40 changed files with 4184 additions and 0 deletions

AGENTS.md Normal file

@@ -0,0 +1,72 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
**pngx-controller** is a Paperless-ngx Central Sync Controller: a single-container FastAPI service that reads from a master paperless-ngx instance and syncs documents/metadata to one or more replicas using only the public paperless REST API. Master always wins; replicas are read-only by convention.
The full specification is in `pngx-controller-prd.md`.
## Tech Stack
| Layer | Choice |
|---|---|
| Backend | Python / FastAPI |
| Scheduler | APScheduler (runs inside FastAPI event loop) |
| Frontend | Jinja2 templates + HTMX + Pico CSS (no JS build step) |
| Database | SQLite via SQLModel |
| Auth | Authentik forward auth via `X-authentik-*` headers (no app-level auth code) |
| Transport | Tailscale IPs (bypasses Traefik/public internet) |
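Because Authentik handles authentication at the proxy, the app never verifies credentials itself; it only reads the injected identity headers. A minimal sketch (the helper name and the `anonymous` fallback are illustrative, not part of the codebase):

```python
def user_from_headers(headers: dict) -> str:
    """Return the identity that Authentik's forward auth injects.

    There is no app-level auth code: the controller trusts the
    X-authentik-* headers because it is only reachable behind the
    proxy, over Tailscale.
    """
    # HTTP header names are case-insensitive; normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get("x-authentik-username", "anonymous")
```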
## Architecture
Single process, single container. APScheduler runs the sync job inside the FastAPI event loop. An `asyncio.Lock` prevents concurrent sync runs.
```
FastAPI app
├── Web UI (Jinja2 + HTMX) — /, /replicas, /logs, /settings
├── REST API — /api/*
└── APScheduler — sync job every N minutes (default: 15)
SQLite (bind-mounted at /data/db.sqlite3)
├── replicas — configured instances (URL + encrypted API token)
├── sync_map — master_doc_id ↔ replica_doc_id mapping per replica
├── sync_runs — audit log of sync cycles
├── logs — per-document sync events
└── settings — master_url, master_token, sync_interval_seconds, log_retention_days
```
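The skip-if-running behaviour of the `asyncio.Lock` can be sketched like this (function and variable names are illustrative):

```python
import asyncio

sync_lock = asyncio.Lock()  # module-level: one lock per process

async def run_sync_cycle() -> bool:
    """Run one sync cycle, or return False if one is already in flight."""
    if sync_lock.locked():
        # Skip rather than queue a second run behind the current one.
        return False
    async with sync_lock:
        # ... per-replica sync work would go here ...
        return True
```

Skipping (instead of awaiting the lock) matters because APScheduler fires on a fixed interval: a slow cycle should not pile up a backlog of queued runs behind it.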
API tokens are encrypted at rest using **Fernet** with `SECRET_KEY` from the environment.
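A minimal sketch of the encrypt-at-rest round trip with `cryptography`'s Fernet (helper names are illustrative):

```python
from cryptography.fernet import Fernet

def encrypt_token(secret_key: bytes, token: str) -> bytes:
    """Encrypt a replica API token before writing it to SQLite."""
    return Fernet(secret_key).encrypt(token.encode())

def decrypt_token(secret_key: bytes, blob: bytes) -> str:
    """Decrypt a stored token for use in outgoing API requests."""
    return Fernet(secret_key).decrypt(blob).decode()
```

In production the key comes from the `SECRET_KEY` environment variable; a fresh key can be generated with `Fernet.generate_key()`.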
## Sync Engine Logic
1. Acquire `asyncio.Lock` (skip if already running)
2. For each enabled replica:
- `ensure_schema_parity`: sync tags, correspondents, document types, custom fields **by name** (IDs differ per instance; name→id mapping is built and used locally)
- Fetch master docs modified since last sync (`?modified__gte=...`)
- For each doc: download file + metadata from master, then create (`POST /api/documents/post_document/`) or update (`PUT /api/documents/{replica_id}/`) on replica
- Skip file re-upload if SHA256 checksum matches `sync_map.file_checksum`
3. Advance `last_sync_ts`, close `sync_run` record, release lock
Schema parity must be established **before** document sync so custom fields exist on the replica.
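The name→id translation at the heart of `ensure_schema_parity` can be sketched as follows (function name and item shapes are illustrative; real items would come from the paperless list endpoints such as `/api/tags/`):

```python
def remap_ids(master_ids: list, master_items: list, replica_items: list) -> list:
    """Translate master-side ids to replica-side ids via the shared name.

    IDs differ per paperless instance, so tags, correspondents, document
    types, and custom fields are matched by name. Assumes schema parity
    is already established, i.e. every referenced name exists on the
    replica.
    """
    id_to_name = {item["id"]: item["name"] for item in master_items}
    name_to_id = {item["name"]: item["id"] for item in replica_items}
    return [name_to_id[id_to_name[mid]] for mid in master_ids]
```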
## Key Design Constraints
- **No consume directory** — sync via the REST API only; the consume path triggers a fresh OCR pass, causes ID collisions, and breaks on live instances
- **No changes to paperless containers** — controller is fully external
- **SPOF accepted** — if controller is down, paperless instances run normally; sync resumes on recovery
- Live log tail uses **SSE** at `/api/logs/stream`; dashboard sync progress uses HTMX polling `/api/sync/running`
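The SSE wire format for `/api/logs/stream` is just `data:`-prefixed lines terminated by a blank line; a sketch of the per-event serialisation (helper name is illustrative):

```python
import json

def sse_event(log_row: dict) -> str:
    """Serialise one log row as a Server-Sent Events message."""
    return f"data: {json.dumps(log_row)}\n\n"
```

In FastAPI such events are typically yielded from an async generator and served with `media_type="text/event-stream"`, which is what lets the browser's `EventSource` consume the live tail.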
## Environment Variables
| Variable | Purpose |
|---|---|
| `SECRET_KEY` | Fernet key for encrypting API tokens at rest |
| `DATABASE_URL` | SQLite path, e.g. `sqlite:////data/db.sqlite3` |
## Implementation Phases (from PRD)
- **Phase 1 (MVP):** SQLModel schema, settings/replica CRUD, sync engine, APScheduler, basic dashboard, log table
- **Phase 2:** SSE log stream, sync progress indicator, manual trigger with live feedback
- **Phase 3:** Full resync per replica, deletion propagation (tombstone table), file checksum skip, alert webhooks