diff --git a/TWO_INSTANCE_SYNC_PRD.md b/TWO_INSTANCE_SYNC_PRD.md
new file mode 100644
index 0000000..9940cf3
--- /dev/null
+++ b/TWO_INSTANCE_SYNC_PRD.md
@@ -0,0 +1,277 @@
+# PRD: Two-Instance Outline Sync
+
+## Overview
+
+Extend outline-sync to support bidirectional comparison and synchronization between two live Outline instances: a **remote** instance (accessed via Tailscale) and a **local** instance (accessed directly). The goal is to bring both instances to an identical content state after resolving all differences interactively.
+
+---
+
+## Current State
+
+The current tool performs a **one-way pull** from the remote Outline instance into a local Git-backed vault (Obsidian-compatible markdown). This vault acts as a local representation of the remote instance's content.
+
+---
+
+## Goal
+
+Enable a four-step workflow:
+
+1. **Pull** from remote instance into local vault (existing)
+2. **Fetch** from local Outline instance and detect differences vs. vault
+3. **Review** differences interactively (left = remote, right = local)
+4. **Resolve** all differences and push chosen versions to both instances
+
+End state: both Outline instances contain identical content.
+
+---
+
+## Instances
+
+| Side | Label | URL | Token |
+|---|---|---|---|
+| Left (Remote) | `remote` | `http://outline:3000` (Tailscale) | From `settings.json` |
+| Right (Local) | `local` | `http://outline-web:3000` (Docker container) | `ol_api_PBU1EX6aRlUVkXzD995oGpMOmYOXpdXEWeEQll` |
+
+---
+
+## Document Matching Strategy
+
+Since the two Outline instances assign independent document IDs, matching must be **by content path** — the canonical path is `Collection / Parent Chain / Document Title` derived from the navigation tree.
+
+**Match key**: `collection_name / parent_title / ... / document_title` (case-insensitive, normalized)
+
+**Match outcomes per document:**
+
+| Outcome | Condition |
+|---|---|
+| `in_sync` | Match found, content identical (`updatedAt` may differ, since each instance sets its own) |
+| `remote_only` | Path exists on remote, not found on local |
+| `local_only` | Path exists on local, not found on remote |
+| `conflict` | Path exists on both, content differs |
+
+---
+
+## Step 2 — Fetch & Diff
+
+### Process
+
+1. Pull from remote into vault (existing Step 1).
+2. Instantiate a second `OutlineSync` client for the local instance.
+3. Fetch all collections + document trees from local instance.
+4. For each document on either side, compute the match key.
+5. Build a **diff report**: a structured list of `DiffEntry` objects.
+
+### DiffEntry Model
+
+```python
+@dataclass
+class DiffEntry:
+    match_key: str                 # Canonical path used for matching
+    status: Literal["in_sync", "remote_only", "local_only", "conflict"]
+    remote_doc: DocMeta | None     # None when status is "local_only"
+    local_doc: DocMeta | None      # None when status is "remote_only"
+
+@dataclass
+class DocMeta:
+    instance: str          # "remote" | "local"
+    doc_id: str
+    title: str
+    collection_id: str
+    collection_name: str
+    parent_id: str | None
+    updated_at: str        # ISO timestamp
+    content: str           # Full markdown body
+```
+
+### CLI Output (Step 2)
+
+```
+Fetching remote...  ✓  (51 documents in 6 collections)
+Fetching local...   ✓  (47 documents in 5 collections)
+
+Diff summary:
+  In sync:        35 documents
+  Remote only:     8 documents
+  Local only:      4 documents
+  Conflicts:       8 documents
+─────────────────────────────
+  Total to resolve: 20
+```
+
+---
+
+## Step 3 — Interactive Conflict Resolution
+
+### Interface Options
+
+Two interfaces will be supported:
+
+#### A. Terminal (CLI) — Primary
+
+An interactive terminal UI (using `rich` or plain TTY prompts) presents each non-`in_sync` document one at a time.
+
+**Display per item:**
+
+```
+[8/20] CONFLICT — Projekte / outline-sync / Architecture
+
+  LEFT (remote)   updated 2026-03-17T09:12:00Z
+  RIGHT (local)   updated 2026-03-15T14:30:00Z
+
+  --- remote
+  +++ local
+  @@ -3,7 +3,9 @@
+   ## Overview
+  -The sync tool uses Git as a local cache.
+  +The sync tool uses Git as a local cache and Tailscale for connectivity.
+  +
+  +Added section on networking.
+
+  Choose action:
+  [r] Keep remote   [l] Keep local   [s] Skip   [d] Full diff   [q] Quit
+```
+
+**For `remote_only`:**
+```
+[3/20] REMOTE ONLY — Bewerbungen / Company XYZ / Cover Letter
+
+  Document exists on remote, not on local.
+  Choose: [r] Copy to local   [s] Skip   [q] Quit
+```
+
+**For `local_only`:**
+```
+[5/20] LOCAL ONLY — Projekte / New Project / Notes
+
+  Document exists on local, not on remote.
+  Choose: [l] Copy to remote   [s] Skip   [q] Quit
+```
+
+#### B. Web UI — Optional (Phase 2)
+
+A web page with a split-panel diff view (similar to a code review UI) with Accept Left / Accept Right / Edit buttons. Built on the existing FastAPI + HTMX stack.
+
+---
+
+## Step 4 — Resolution & Write-Back
+
+After the user has resolved all items, the tool executes the chosen actions:
+
+| User Choice | Action |
+|---|---|
+| Keep remote | Push remote content to local Outline via API (`documents.create` or `documents.update`) |
+| Keep local | Push local content to remote Outline via API |
+| Copy to local | Create document on local instance with remote content |
+| Copy to remote | Create document on remote instance with local content |
+| Skip | No action for this document |
+
+### Write-Back Rules
+
+- **Collection must exist** on target before creating a document. Auto-create collection if missing.
+- **Parent document must exist** on target before creating a child. Create parents recursively (depth-first).
+- **Title** and **markdown body** are copied as-is. Frontmatter (`outline_id` etc.) is not transferred — target instance assigns its own IDs.
+- **Updated timestamps** are NOT forcibly set — the target instance sets `updatedAt` on create/update.
+
+### Final Report
+
+```
+Applying resolutions...
+
+  ✓ Copied to local:    8 documents
+  ✓ Copied to remote:   4 documents
+  ✓ Pushed to local:    3 documents  (conflict: chose remote)
+  ✓ Pushed to remote:   5 documents  (conflict: chose local)
+  - Skipped:            0 documents
+
+Both instances now have 55 documents in 6 collections.
+```
+
+---
+
+## Configuration
+
+Add a `local` block to `settings.json`:
+
+```json
+{
+  "source": {
+    "url": "http://outline:3000",
+    "token": "ol_api_..."
+  },
+  "local": {
+    "url": "http://outline-web:3000",
+    "token": "ol_api_PBU1EX6aRlUVkXzD995oGpMOmYOXpdXEWeEQll"
+  }
+}
+```
+
+CLI flags override settings:
+
+```bash
+./sync.sh compare --local-url http://localhost:3000 --local-token ol_api_...
+./sync.sh merge --auto-resolve newer   # Auto-pick newer timestamp, no prompts
+```
+
+---
+
+## New CLI Commands
+
+| Command | Description |
+|---|---|
+| `compare` | Fetch both instances and print diff report (no writes) |
+| `merge` | Interactive resolution workflow (Steps 2–4) |
+| `merge --auto-resolve newer` | Non-interactive: always pick newer `updatedAt` |
+| `merge --auto-resolve remote` | Non-interactive: always keep remote version |
+| `merge --auto-resolve local` | Non-interactive: always keep local version |
+
+---
+
+## Non-Goals
+
+- **Attachment/image sync** — text content only (consistent with existing NG3)
+- **Real-time continuous sync** — this is a manual, on-demand operation
+- **Three-way merge** — no character-level content merging; user picks one side
+- **Deletion propagation** — documents deleted on one side are not auto-deleted on the other; they appear as `remote_only` or `local_only`
+- **Comment/reaction sync** — document body only
+- **Preserving Outline metadata** (pinned, starred, views) — content only
+
+---
+
+## Implementation Phases
+
+### Phase 1 — Compare (Step 2 only)
+- Add `local` config block
+- Implement `LocalOutlineClient` (reuse `OutlineSync` API methods, different credentials)
+- Build `DiffReport`: fetch both instances, match by path, classify entries
+- `compare` command: print diff summary to terminal
+- No writes to either instance
+
+### Phase 2 — Interactive CLI Merge (Steps 3–4)
+- `merge` command with TTY prompts
+- `--auto-resolve` flag for non-interactive use
+- Write-back: `documents.create` / `documents.update` on target
+- Recursive parent creation
+- Final report
+
+### Phase 3 — Web UI Diff View (Optional)
+- Split-panel view in FastAPI + HTMX
+- Accept Left / Accept Right buttons per document
+- Progress indicator during write-back
+
+---
+
+## Success Criteria
+
+- `compare` correctly identifies all differences between two live Outline instances
+- `merge` allows user to resolve every difference and both instances end up with identical document trees
+- No data is lost — skipped documents remain on their original instance
+- The tool is re-runnable: running `compare` after a successful `merge` reports 0 differences
+
+---
+
+## Decisions
+
+1. **Rename detection**: Treated as `remote_only` + `local_only` (delete + create). No fuzzy matching.
+2. **Collection renames**: Same — treated as entirely separate collections with all-new documents.
+3. **Local instance**: Docker container `outline-web`, accessible at `http://outline-web:3000`.
+4. **Initial scope**: Phase 1 only (`compare` command, read-only). Phases 2–3 follow after matching logic is validated.
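Note on `merge --auto-resolve newer` from the PRD above: the mode picks whichever side has the later `updatedAt`. A minimal sketch of that choice, assuming ISO 8601 timestamps like those in the sample output. The names `parse_ts` and `pick_newer`, the UTC fallback for offset-less values, and the rule that ties go to remote are all hypothetical and not specified by the PRD.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    # datetime.fromisoformat() only accepts a literal 'Z' from Python 3.11 on,
    # so normalize it to an explicit offset first.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Assumption: timestamps without an offset are UTC.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def pick_newer(remote_updated_at: str, local_updated_at: str) -> str:
    """Return "remote" or "local"; ties go to remote (hypothetical rule)."""
    if parse_ts(local_updated_at) > parse_ts(remote_updated_at):
        return "local"
    return "remote"


# Timestamps taken from the conflict example in the PRD.
print(pick_newer("2026-03-17T09:12:00Z", "2026-03-15T14:30:00Z"))  # remote
```

Comparing parsed datetimes rather than raw strings avoids mis-ordering when the two instances format fractional seconds or offsets differently.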
diff --git a/outline_sync.py b/outline_sync.py
index e1e9870..ba10a2f 100644
--- a/outline_sync.py
+++ b/outline_sync.py
@@ -178,15 +178,18 @@ class OutlineSync:
             logger.error("Request failed on %s: %s", endpoint, exc)
             return None
 
-    def health_check(self) -> bool:
+    def health_check(self, label: str = "") -> bool:
         parsed = urlparse(self.base_url)
         host = parsed.hostname or "outline"
         port = parsed.port or (443 if parsed.scheme == "https" else 80)
 
-        print(f"Checking API connectivity to {self.base_url} ...")
+        prefix = f"[{label}] " if label else ""
+        print(f"{prefix}Checking API connectivity to {self.base_url} ...")
+
+        indent = " "
 
         # 1. DNS resolution
-        print(f" DNS resolve {host!r} ... ", end="", flush=True)
+        print(f"{indent}DNS resolve {host!r} ... ", end="", flush=True)
         try:
             ip = socket.gethostbyname(host)
             print(f"✓ ({ip})")
@@ -195,7 +198,7 @@ class OutlineSync:
             return False
 
         # 2. TCP reachability
-        print(f" TCP connect {ip}:{port} ... ", end="", flush=True)
+        print(f"{indent}TCP connect {ip}:{port} ... ", end="", flush=True)
         try:
             with socket.create_connection((host, port), timeout=5):
                 print("✓")
@@ -204,7 +207,7 @@ class OutlineSync:
             return False
 
         # 3. API authentication
-        print(f" API auth ... ", end="", flush=True)
+        print(f"{indent}API auth ... ", end="", flush=True)
         result = self._api("/api/auth.info")
         if result and "data" in result:
             user = result["data"].get("user", {})
@@ -666,6 +669,130 @@ class OutlineSync:
         print(f"Done. {summary}.")
         return errors == 0
 
+    # ── Compare ───────────────────────────────────────────────────────────────
+
+    def _collect_docs_keyed(
+        self,
+        nodes: List[Dict],
+        coll_name: str,
+        result: Dict[str, Dict],
+        parent_path: str = "",
+    ) -> None:
+        """Recursively populate result with {match_key: doc_meta} from a nav tree."""
+        for node in nodes:
+            title = node.get("title", "Untitled")
+            safe_title = sanitize_name(title).lower()
+            safe_coll = sanitize_name(coll_name).lower()
+            path = f"{parent_path}/{safe_title}" if parent_path else safe_title
+            match_key = f"{safe_coll}/{path}"
+            result[match_key] = {
+                "id": node["id"],
+                "title": title,
+                "updatedAt": node.get("updatedAt", ""),
+            }
+            for child in node.get("children", []):
+                self._collect_docs_keyed([child], coll_name, result, path)
+
+    def _fetch_all_docs_keyed(self) -> Dict[str, Dict]:
+        """Return {match_key: doc_meta} for every document across all collections."""
+        out: Dict[str, Dict] = {}
+        for coll in self.get_collections():
+            tree = self.get_nav_tree(coll["id"])
+            self._collect_docs_keyed(tree, coll["name"], out)
+        return out
+
+    def cmd_compare(self, local_url: str, local_token: str) -> bool:
+        """
+        Fetch both the remote and local Outline instances and print a diff report.
+        No writes to either instance.
+
+        Match key: collection_name/parent_chain/document_title (lowercased, sanitized)
+        Status per document:
+          in_sync      — same path, content identical
+          remote_only  — path exists on remote, not on local
+          local_only   — path exists on local, not on remote
+          conflict     — same path, content differs
+        """
+        print("Fetching remote instance...")
+        if not self.health_check(label="remote"):
+            print("✗ Cannot reach remote Outline API — aborting.")
+            return False
+        remote_docs = self._fetch_all_docs_keyed()
+        print(f" → {len(remote_docs)} documents\n")
+
+        print("Fetching local instance (outline-web)...")
+        local_client = OutlineSync(local_url, local_token, self.vault_dir)
+        if not local_client.health_check(label="local"):
+            print("✗ Cannot reach local Outline API — aborting.")
+            return False
+        local_docs = local_client._fetch_all_docs_keyed()
+        print(f" → {len(local_docs)} documents\n")
+
+        all_keys = sorted(set(remote_docs) | set(local_docs))
+
+        results: Dict[str, List] = {
+            "in_sync": [],
+            "remote_only": [],
+            "local_only": [],
+            "conflict": [],
+        }
+
+        for key in all_keys:
+            r = remote_docs.get(key)
+            l = local_docs.get(key)
+
+            if r and l:
+                if r["updatedAt"] and r["updatedAt"] == l["updatedAt"]:
+                    results["in_sync"].append(key)
+                else:
+                    # Timestamps differ or are missing: compare actual content
+                    r_full = self.get_document_info(r["id"])
+                    l_full = local_client.get_document_info(l["id"])
+                    r_text = (r_full or {}).get("text", "")
+                    l_text = (l_full or {}).get("text", "")
+                    if r_text.strip() == l_text.strip():
+                        results["in_sync"].append(key)
+                    else:
+                        results["conflict"].append((key, r, l))
+            elif r:
+                results["remote_only"].append(key)
+            else:
+                results["local_only"].append(key)
+
+        in_sync = len(results["in_sync"])
+        remote_only = len(results["remote_only"])
+        local_only = len(results["local_only"])
+        conflicts = len(results["conflict"])
+        total = remote_only + local_only + conflicts
+
+        print("Diff summary:")
+        print(f" In sync: {in_sync:4d} documents")
+        print(f" Remote only: {remote_only:4d} documents")
+        print(f" Local only: {local_only:4d} documents")
+        print(f" Conflicts: {conflicts:4d} documents")
+        print(" " + "─" * 30)
+        print(f" Total to resolve: {total}")
+
+        if results["remote_only"]:
+            print("\nRemote only:")
+            for key in results["remote_only"]:
+                print(f" + {key}")
+
+        if results["local_only"]:
+            print("\nLocal only:")
+            for key in results["local_only"]:
+                print(f" + {key}")
+
+        if results["conflict"]:
+            print("\nConflicts:")
+            for key, r, l in results["conflict"]:
+                r_ts = r["updatedAt"][:19] if r["updatedAt"] else "?"
+                l_ts = l["updatedAt"][:19] if l["updatedAt"] else "?"
+                print(f" ~ {key}")
+                print(f" remote: {r_ts} local: {l_ts}")
+
+        return True
+
     # ── Commands ──────────────────────────────────────────────────────────────
 
     def cmd_init(self) -> bool:
@@ -741,18 +868,23 @@ def load_settings(path: str) -> Dict:
 
 def parse_args() -> argparse.Namespace:
     p = argparse.ArgumentParser(
-        description="Outline ↔ Obsidian sync (Phase 1: init)",
+        description="Outline ↔ Obsidian sync",
         formatter_class=argparse.RawDescriptionHelpFormatter,
         epilog=(
             "Commands:\n"
-            "  init   Export Outline to vault and write git config files\n"
+            "  init     Export Outline to vault and write git config files\n"
+            "  pull     Fetch latest changes from remote Outline into vault\n"
+            "  push     Push local vault changes to remote Outline\n"
+            "  compare  Compare remote and local Outline instances (read-only)\n"
         ),
     )
-    p.add_argument("command", choices=["init", "pull", "push"], help="Sync command")
-    p.add_argument("--vault",    required=True, help="Path to vault directory")
-    p.add_argument("--settings", default="settings.json", help="Path to settings file")
-    p.add_argument("--url",      help="Outline API URL (overrides settings.source.url)")
-    p.add_argument("--token",    help="API token (overrides settings.source.token)")
+    p.add_argument("command", choices=["init", "pull", "push", "compare"], help="Sync command")
+    p.add_argument("--vault",       required=True, help="Path to vault directory")
+    p.add_argument("--settings",    default="settings.json", help="Path to settings file")
+    p.add_argument("--url",         help="Remote Outline API URL (overrides settings.source.url)")
+    p.add_argument("--token",       help="Remote API token (overrides settings.source.token)")
+    p.add_argument("--local-url",   help="Local Outline API URL (overrides settings.local.url)")
+    p.add_argument("--local-token", help="Local API token (overrides settings.local.token)")
     p.add_argument(
         "-v", "--verbose",
         action="count",
@@ -772,6 +904,7 @@ def main() -> None:
 
     settings = load_settings(args.settings)
     source = settings.get("source", {})
+    local = settings.get("local", {})
 
     url = args.url or source.get("url")
     token = args.token or source.get("token")
@@ -795,6 +928,16 @@ def main() -> None:
         ok = sync.cmd_pull()
     elif args.command == "push":
         ok = sync.cmd_push()
+    elif args.command == "compare":
+        local_url = args.local_url or local.get("url")
+        local_token = args.local_token or local.get("token")
+        if not local_url or not local_token:
+            logger.error(
+                "Missing local API URL or token — set local.url and local.token "
+                "in settings.json, or pass --local-url / --local-token."
+            )
+            sys.exit(1)
+        ok = sync.cmd_compare(local_url, local_token)
     else:
         ok = False
     sys.exit(0 if ok else 1)
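Note on the match keys built by `_collect_docs_keyed`: the standalone sketch below reproduces the key derivation on a small hand-written nav tree so the matching behavior can be checked without a live Outline instance. It is illustrative only. The `sanitize_name` here is a stand-in (assumed to replace path-unsafe characters) for the module's real helper, which is not shown in this diff, and `collect_keys` plus the sample trees and IDs are hypothetical.

```python
import re
from typing import Dict, List


def sanitize_name(name: str) -> str:
    """Stand-in for the module's sanitize_name (assumption, not the real helper)."""
    return re.sub(r'[\\/:*?"<>|]', "_", name).strip()


def collect_keys(nodes: List[Dict], coll_name: str, parent_path: str = "") -> Dict[str, str]:
    """Map each match key to its document id for one collection's nav tree."""
    keys: Dict[str, str] = {}
    safe_coll = sanitize_name(coll_name).lower()
    for node in nodes:
        safe_title = sanitize_name(node.get("title", "Untitled")).lower()
        path = f"{parent_path}/{safe_title}" if parent_path else safe_title
        keys[f"{safe_coll}/{path}"] = node["id"]
        # Children extend the parent's path, giving collection/parent/child keys.
        keys.update(collect_keys(node.get("children", []), coll_name, path))
    return keys


remote_tree = [
    {"id": "a1", "title": "outline-sync", "children": [
        {"id": "a2", "title": "Architecture", "children": []},
    ]},
]
local_tree = [{"id": "b7", "title": "Outline-Sync", "children": []}]

remote = collect_keys(remote_tree, "Projekte")
local = collect_keys(local_tree, "projekte")

print(sorted(remote))
# ['projekte/outline-sync', 'projekte/outline-sync/architecture']
print(sorted(set(remote) - set(local)))
# ['projekte/outline-sync/architecture']
```

Because keys are lowercased, `Outline-Sync` on the local side matches `outline-sync` on the remote side even though the document IDs differ. Two siblings with the same title would map to the same key and the later one would overwrite the earlier, which is worth keeping in mind when validating the matching logic.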