feat: add compare command for two-instance diff (Phase 1)

Adds read-only `compare` command that fetches both the remote Outline
instance (via Tailscale) and a local instance (outline-web container),
matches documents by canonical path key, and reports in_sync / remote_only /
local_only / conflict status. Also adds PRD for the full two-instance sync
workflow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:35:47 +01:00
parent 6352fbb03f
commit 14af83a52a
2 changed files with 432 additions and 12 deletions

TWO_INSTANCE_SYNC_PRD.md Normal file

@@ -0,0 +1,277 @@
# PRD: Two-Instance Outline Sync
## Overview
Extend outline-sync to support bidirectional comparison and synchronization between two live Outline instances: a **remote** instance (accessed via Tailscale) and a **local** instance (accessed directly). The goal is to bring both instances to identical content state after resolving all differences interactively.
---
## Current State
The current tool performs a **one-way pull** from the remote Outline instance into a local Git-backed vault (Obsidian-compatible markdown). This vault acts as a local representation of the remote instance's content.
---
## Goal
Enable a four-step workflow:
1. **Pull** from remote instance into local vault (existing)
2. **Fetch** from local Outline instance and detect differences vs. vault
3. **Review** differences interactively (left = remote, right = local)
4. **Resolve** all differences and push chosen versions to both instances
End state: both Outline instances contain identical content.
---
## Instances
| Side | Label | URL | Token |
|---|---|---|---|
| Left (Remote) | `remote` | `http://outline:3000` (Tailscale) | From `settings.json` |
| Right (Local) | `local` | `http://outline-web:3000` (Docker container) | `ol_api_PBU1EX6aRlUVkXzD995oGpMOmYOXpdXEWeEQll` |
---
## Document Matching Strategy
Since the two Outline instances assign independent document IDs, matching must be **by content path** — the canonical path is `Collection / Parent Chain / Document Title` derived from the navigation tree.
**Match key**: `collection_name / parent_title / ... / document_title` (case-insensitive, normalized)
**Match outcomes per document:**
| Outcome | Condition |
|---|---|
| `in_sync` | Path exists on both, content identical (matching `updatedAt` is treated as identical without fetching bodies) |
| `remote_only` | Path exists on remote, not found on local |
| `local_only` | Path exists on local, not found on remote |
| `conflict` | Path exists on both, content differs |
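
The match key described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: `normalize_segment` is a hypothetical stand-in for whatever normalization the tool applies (trimming, whitespace collapsing, and case folding are assumptions here).

```python
import re
from typing import Sequence


def normalize_segment(name: str) -> str:
    """Hypothetical normalization: trim, collapse whitespace, case-fold."""
    return re.sub(r"\s+", " ", name.strip()).casefold()


def match_key(collection: str, parent_titles: Sequence[str], title: str) -> str:
    """Join collection, parent chain, and title into one canonical path."""
    segments = [collection, *parent_titles, title]
    return "/".join(normalize_segment(s) for s in segments)


if __name__ == "__main__":
    # Same logical document, different casing and stray whitespace.
    print(match_key("Projekte", ["outline-sync"], "  Architecture "))
    print(match_key("projekte", ["Outline-Sync"], "architecture"))
```

Both calls produce the same key, so the two documents would be matched. A renamed document produces a different key on each side, so it shows up as one `remote_only` and one `local_only` entry.
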
---
## Step 2 — Fetch & Diff
### Process
1. Pull from remote into vault (existing Step 1).
2. Instantiate a second `OutlineSync` client for the local instance.
3. Fetch all collections + document trees from local instance.
4. For each document on either side, compute the match key.
5. Build a **diff report**: a structured list of `DiffEntry` objects.
### DiffEntry Model
```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal


@dataclass
class DocMeta:
    instance: str            # "remote" | "local"
    doc_id: str
    title: str
    collection_id: str
    collection_name: str
    parent_id: str | None
    updated_at: str          # ISO timestamp
    content: str             # Full markdown body


@dataclass
class DiffEntry:
    match_key: str           # Canonical path used for matching
    status: Literal["in_sync", "remote_only", "local_only", "conflict"]
    remote_doc: DocMeta | None   # None when status is "local_only"
    local_doc: DocMeta | None    # None when status is "remote_only"
```
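The classification step can be sketched roughly as below. It assumes both sides have already been reduced to `{match_key: markdown_body}` dictionaries; `classify` and `build_report` are hypothetical names, and comparing stripped bodies is an assumption about what "content identical" means.

```python
from typing import Dict, List, Literal, Optional, Tuple

Status = Literal["in_sync", "remote_only", "local_only", "conflict"]


def classify(remote_text: Optional[str], local_text: Optional[str]) -> Status:
    """Classify one match key from the bodies found on each side."""
    if remote_text is None and local_text is None:
        raise ValueError("a match key must exist on at least one side")
    if local_text is None:
        return "remote_only"
    if remote_text is None:
        return "local_only"
    # Assumption: leading/trailing whitespace differences are not conflicts.
    if remote_text.strip() == local_text.strip():
        return "in_sync"
    return "conflict"


def build_report(
    remote: Dict[str, str], local: Dict[str, str]
) -> List[Tuple[str, Status]]:
    """Return (match_key, status) pairs over the union of both key sets."""
    keys = sorted(set(remote) | set(local))
    return [(key, classify(remote.get(key), local.get(key))) for key in keys]
```

Iterating over the union of keys ensures that documents present on only one side are reported rather than silently dropped.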
### CLI Output (Step 2)
```
Fetching remote... ✓ (51 documents in 6 collections)
Fetching local... ✓ (47 documents in 5 collections)
Diff summary:
In sync: 35 documents
Remote only: 8 documents
Local only: 4 documents
Conflicts: 8 documents
─────────────────────────────
Total to resolve: 20
```
---
## Step 3 — Interactive Conflict Resolution
### Interface Options
Two interfaces will be supported:
#### A. Terminal (CLI) — Primary
An interactive terminal UI (using `rich` or plain TTY prompts) presents each non-`in_sync` document one at a time.
**Display per item:**
```
[8/20] CONFLICT — Projekte / outline-sync / Architecture
LEFT (remote) updated 2026-03-17T09:12:00Z
RIGHT (local) updated 2026-03-15T14:30:00Z
--- remote
+++ local
@@ -3,7 +3,9 @@
## Overview
-The sync tool uses Git as a local cache.
+The sync tool uses Git as a local cache and Tailscale for connectivity.
+
+Added section on networking.
Choose action:
[r] Keep remote [l] Keep local [s] Skip [d] Full diff [q] Quit
```
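A diff excerpt like the one in the conflict display can be produced with the standard library's `difflib`. The sketch below is one possible approach, not necessarily what the tool will use; `render_diff` is a hypothetical helper name.

```python
import difflib


def render_diff(remote_text: str, local_text: str, context: int = 3) -> str:
    """Return a unified diff with remote on the left and local on the right."""
    lines = difflib.unified_diff(
        remote_text.splitlines(),
        local_text.splitlines(),
        fromfile="remote",
        tofile="local",
        n=context,      # number of unchanged context lines around each change
        lineterm="",    # inputs have no trailing newlines, so emit none
    )
    return "\n".join(lines)


if __name__ == "__main__":
    remote = "## Overview\nThe sync tool uses Git as a local cache.\n"
    local = (
        "## Overview\n"
        "The sync tool uses Git as a local cache and Tailscale for connectivity.\n"
    )
    print(render_diff(remote, local))
```

Identical inputs yield an empty string, which a caller could use as a cheap secondary check before showing a prompt.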
**For `remote_only`:**
```
[3/20] REMOTE ONLY — Bewerbungen / Company XYZ / Cover Letter
Document exists on remote, not on local.
Choose: [r] Copy to local [s] Skip [q] Quit
```
**For `local_only`:**
```
[5/20] LOCAL ONLY — Projekte / New Project / Notes
Document exists on local, not on remote.
Choose: [l] Copy to remote [s] Skip [q] Quit
```
#### B. Web UI — Optional (Phase 2)
A web page with a split-panel diff view (similar to a code review UI) with Accept Left / Accept Right / Edit buttons. Built on the existing FastAPI + HTMX stack.
---
## Step 4 — Resolution & Write-Back
After the user has resolved all items, the tool executes the chosen actions:
| User Choice | Action |
|---|---|
| Keep remote | Push remote content to local Outline via API (`documents.create` or `documents.update`) |
| Keep local | Push local content to remote Outline via API |
| Copy to local | Create document on local instance with remote content |
| Copy to remote | Create document on remote instance with local content |
| Skip | No action for this document |
### Write-Back Rules
- **Collection must exist** on target before creating a document. Auto-create collection if missing.
- **Parent document must exist** on target before creating a child. Create parents recursively (depth-first).
- **Title** and **markdown body** are copied as-is. Frontmatter (`outline_id` etc.) is not transferred — target instance assigns its own IDs.
- **Updated timestamps** are NOT forcibly set — the target instance sets `updatedAt` on create/update.
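
The parent-before-child rule above can be sketched as a small recursive helper. Everything here is hypothetical: `InMemoryTarget` stands in for the target instance so the example runs without a server, and a real client would call the Outline API in `find` and `create` instead.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[str, ...]  # document titles from the collection root down


class InMemoryTarget:
    """Stand-in for the target instance (assumption, not a real client)."""

    def __init__(self) -> None:
        self.docs: Dict[Path, str] = {}   # path -> document id
        self.created: List[Path] = []     # creation order, for inspection

    def find(self, path: Path) -> Optional[str]:
        return self.docs.get(path)

    def create(self, path: Path, parent_id: Optional[str]) -> str:
        doc_id = f"doc-{len(self.docs) + 1}"
        self.docs[path] = doc_id
        self.created.append(path)
        return doc_id


def ensure_path(target: InMemoryTarget, path: Path) -> str:
    """Return the id for path, creating any missing ancestors first."""
    if not path:
        raise ValueError("path must contain at least one title")
    existing = target.find(path)
    if existing is not None:
        return existing
    # Resolve (or create) the parent before creating the child.
    parent_id = ensure_path(target, path[:-1]) if len(path) > 1 else None
    return target.create(path, parent_id)
```

Because existing documents are returned as-is, calling `ensure_path` twice for the same path creates nothing the second time.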
### Final Report
```
Applying resolutions...
✓ Copied to local: 8 documents
✓ Copied to remote: 4 documents
✓ Pushed to local: 3 documents (conflict: chose remote)
✓ Pushed to remote: 5 documents (conflict: chose local)
- Skipped: 0 documents
Both instances now have 55 documents in 6 collections.
```
---
## Configuration
Add a `local` block to `settings.json`:
```json
{
"source": {
"url": "http://outline:3000",
"token": "ol_api_..."
},
"local": {
"url": "http://outline-web:3000",
"token": "ol_api_PBU1EX6aRlUVkXzD995oGpMOmYOXpdXEWeEQll"
}
}
```
CLI flags override settings:
```bash
./sync.sh compare --local-url http://localhost:3000 --local-token ol_api_...
./sync.sh merge --auto-resolve newer # Auto-pick newer timestamp, no prompts
```
---
## New CLI Commands
| Command | Description |
|---|---|
| `compare` | Fetch both instances and print diff report (no writes) |
| `merge` | Interactive resolution workflow (Steps 2–4) |
| `merge --auto-resolve newer` | Non-interactive: always pick newer `updatedAt` |
| `merge --auto-resolve remote` | Non-interactive: always keep remote version |
| `merge --auto-resolve local` | Non-interactive: always keep local version |
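
The three `--auto-resolve` strategies can be sketched as a single function that picks a side for one conflict. This is an illustration under assumptions: `auto_resolve` is a hypothetical name, timestamps are assumed to be ISO 8601 strings, and ties under `newer` go to remote because the PRD does not specify tie-breaking.

```python
from datetime import datetime
from typing import Literal

Side = Literal["remote", "local"]


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is read as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def auto_resolve(strategy: str, remote_updated: str, local_updated: str) -> Side:
    """Pick which side wins a conflict without prompting."""
    if strategy == "remote":
        return "remote"
    if strategy == "local":
        return "local"
    if strategy == "newer":
        # Assumption: on equal timestamps, remote wins.
        if parse_ts(local_updated) > parse_ts(remote_updated):
            return "local"
        return "remote"
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    print(auto_resolve("newer", "2026-03-17T09:12:00Z", "2026-03-15T14:30:00Z"))
```

Parsing the timestamps rather than comparing strings avoids mistakes when the two instances format offsets or fractional seconds differently.
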
---
## Non-Goals
- **Attachment/image sync** — text content only (consistent with existing NG3)
- **Real-time continuous sync** — this is a manual, on-demand operation
- **Three-way merge** — no character-level content merging; user picks one side
- **Deletion propagation** — documents deleted on one side are not auto-deleted on the other; they appear as `remote_only` or `local_only`
- **Comment/reaction sync** — document body only
- **Preserving Outline metadata** (pinned, starred, views) — content only
---
## Implementation Phases
### Phase 1 — Compare (Step 2 only)
- Add `local` config block
- Implement `LocalOutlineClient` (reuse `OutlineSync` API methods, different credentials)
- Build `DiffReport`: fetch both instances, match by path, classify entries
- `compare` command: print diff summary to terminal
- No writes to either instance
### Phase 2 — Interactive CLI Merge (Steps 3–4)
- `merge` command with TTY prompts
- `--auto-resolve` flag for non-interactive use
- Write-back: `documents.create` / `documents.update` on target
- Recursive parent creation
- Final report
### Phase 3 — Web UI Diff View (Optional)
- Split-panel view in FastAPI + HTMX
- Accept Left / Accept Right buttons per document
- Progress indicator during write-back
---
## Success Criteria
- `compare` correctly identifies all differences between two live Outline instances
- `merge` allows user to resolve every difference and both instances end up with identical document trees
- No data is lost — skipped documents remain on their original instance
- The tool is re-runnable: running `compare` after a successful `merge` reports 0 differences
---
## Decisions
1. **Rename detection**: Treated as `remote_only` + `local_only` (delete + create). No fuzzy matching.
2. **Collection renames**: Same — treated as entirely separate collections with all-new documents.
3. **Local instance**: Docker container `outline-web`, accessible at `http://outline-web:3000`.
4. **Initial scope**: Phase 1 only (`compare` command, read-only). Phases 2–3 follow after matching logic is validated.


@@ -178,15 +178,18 @@ class OutlineSync:
             logger.error("Request failed on %s: %s", endpoint, exc)
             return None
-    def health_check(self) -> bool:
+    def health_check(self, label: str = "") -> bool:
         parsed = urlparse(self.base_url)
         host = parsed.hostname or "outline"
         port = parsed.port or (443 if parsed.scheme == "https" else 80)
-        print(f"Checking API connectivity to {self.base_url} ...")
+        prefix = f"[{label}] " if label else ""
+        print(f"{prefix}Checking API connectivity to {self.base_url} ...")
+        indent = " "
         # 1. DNS resolution
-        print(f" DNS resolve {host!r} ... ", end="", flush=True)
+        print(f"{indent}DNS resolve {host!r} ... ", end="", flush=True)
         try:
             ip = socket.gethostbyname(host)
             print(f"✓ ({ip})")
@@ -195,7 +198,7 @@ class OutlineSync:
             return False
         # 2. TCP reachability
-        print(f" TCP connect {ip}:{port} ... ", end="", flush=True)
+        print(f"{indent}TCP connect {ip}:{port} ... ", end="", flush=True)
         try:
             with socket.create_connection((host, port), timeout=5):
                 print("✓")
@@ -204,7 +207,7 @@ class OutlineSync:
             return False
         # 3. API authentication
-        print(f" API auth ... ", end="", flush=True)
+        print(f"{indent}API auth ... ", end="", flush=True)
         result = self._api("/api/auth.info")
         if result and "data" in result:
             user = result["data"].get("user", {})
@@ -666,6 +669,130 @@ class OutlineSync:
         print(f"Done. {summary}.")
         return errors == 0
+    # ── Compare ───────────────────────────────────────────────────────────────
+    def _collect_docs_keyed(
+        self,
+        nodes: List[Dict],
+        coll_name: str,
+        result: Dict[str, Dict],
+        parent_path: str = "",
+    ) -> None:
+        """Recursively populate result with {match_key: doc_meta} from a nav tree."""
+        for node in nodes:
+            title = node.get("title", "Untitled")
+            safe_title = sanitize_name(title).lower()
+            safe_coll = sanitize_name(coll_name).lower()
+            path = f"{parent_path}/{safe_title}" if parent_path else safe_title
+            match_key = f"{safe_coll}/{path}"
+            result[match_key] = {
+                "id": node["id"],
+                "title": title,
+                "updatedAt": node.get("updatedAt", ""),
+            }
+            for child in node.get("children", []):
+                self._collect_docs_keyed([child], coll_name, result, path)
+    def _fetch_all_docs_keyed(self) -> Dict[str, Dict]:
+        """Return {match_key: doc_meta} for every document across all collections."""
+        out: Dict[str, Dict] = {}
+        for coll in self.get_collections():
+            tree = self.get_nav_tree(coll["id"])
+            self._collect_docs_keyed(tree, coll["name"], out)
+        return out
+    def cmd_compare(self, local_url: str, local_token: str) -> bool:
+        """
+        Fetch both the remote and local Outline instances and print a diff report.
+        No writes to either instance.
+        Match key: collection_name/parent_chain/document_title (lowercased, sanitized)
+        Status per document:
+            in_sync — same path, content identical
+            remote_only — path exists on remote, not on local
+            local_only — path exists on local, not on remote
+            conflict — same path, content differs
+        """
+        print("Fetching remote instance...")
+        if not self.health_check(label="remote"):
+            print("✗ Cannot reach remote Outline API — aborting.")
+            return False
+        remote_docs = self._fetch_all_docs_keyed()
+        print(f"{len(remote_docs)} documents\n")
+        print("Fetching local instance (outline-web)...")
+        local_client = OutlineSync(local_url, local_token, self.vault_dir)
+        if not local_client.health_check(label="local"):
+            print("✗ Cannot reach local Outline API — aborting.")
+            return False
+        local_docs = local_client._fetch_all_docs_keyed()
+        print(f"{len(local_docs)} documents\n")
+        all_keys = sorted(set(remote_docs) | set(local_docs))
+        results: Dict[str, List] = {
+            "in_sync": [],
+            "remote_only": [],
+            "local_only": [],
+            "conflict": [],
+        }
+        for key in all_keys:
+            r = remote_docs.get(key)
+            l = local_docs.get(key)
+            if r and l:
+                if r["updatedAt"] == l["updatedAt"]:
+                    results["in_sync"].append(key)
+                else:
+                    # Timestamps differ — compare actual content
+                    r_full = self.get_document_info(r["id"])
+                    l_full = local_client.get_document_info(l["id"])
+                    r_text = (r_full or {}).get("text", "")
+                    l_text = (l_full or {}).get("text", "")
+                    if r_text.strip() == l_text.strip():
+                        results["in_sync"].append(key)
+                    else:
+                        results["conflict"].append((key, r, l))
+            elif r:
+                results["remote_only"].append(key)
+            else:
+                results["local_only"].append(key)
+        in_sync = len(results["in_sync"])
+        remote_only = len(results["remote_only"])
+        local_only = len(results["local_only"])
+        conflicts = len(results["conflict"])
+        total = remote_only + local_only + conflicts
+        print("Diff summary:")
+        print(f" In sync: {in_sync:4d} documents")
+        print(f" Remote only: {remote_only:4d} documents")
+        print(f" Local only: {local_only:4d} documents")
+        print(f" Conflicts: {conflicts:4d} documents")
+        print(" " + "─" * 30)
+        print(f" Total to resolve: {total}")
+        if results["remote_only"]:
+            print("\nRemote only:")
+            for key in results["remote_only"]:
+                print(f" + {key}")
+        if results["local_only"]:
+            print("\nLocal only:")
+            for key in results["local_only"]:
+                print(f" + {key}")
+        if results["conflict"]:
+            print("\nConflicts:")
+            for key, r, l in results["conflict"]:
+                r_ts = r["updatedAt"][:19] if r["updatedAt"] else "?"
+                l_ts = l["updatedAt"][:19] if l["updatedAt"] else "?"
+                print(f" ~ {key}")
+                print(f" remote: {r_ts} local: {l_ts}")
+        return True
     # ── Commands ──────────────────────────────────────────────────────────────
     def cmd_init(self) -> bool:
@@ -741,18 +868,23 @@ def load_settings(path: str) -> Dict:
 def parse_args() -> argparse.Namespace:
     p = argparse.ArgumentParser(
-        description="Outline ↔ Obsidian sync (Phase 1: init)",
+        description="Outline ↔ Obsidian sync",
         formatter_class=argparse.RawDescriptionHelpFormatter,
         epilog=(
             "Commands:\n"
             " init Export Outline to vault and write git config files\n"
+            " pull Fetch latest changes from remote Outline into vault\n"
+            " push Push local vault changes to remote Outline\n"
+            " compare Compare remote and local Outline instances (read-only)\n"
         ),
     )
-    p.add_argument("command", choices=["init", "pull", "push"], help="Sync command")
+    p.add_argument("command", choices=["init", "pull", "push", "compare"], help="Sync command")
     p.add_argument("--vault", required=True, help="Path to vault directory")
     p.add_argument("--settings", default="settings.json", help="Path to settings file")
-    p.add_argument("--url", help="Outline API URL (overrides settings.source.url)")
-    p.add_argument("--token", help="API token (overrides settings.source.token)")
+    p.add_argument("--url", help="Remote Outline API URL (overrides settings.source.url)")
+    p.add_argument("--token", help="Remote API token (overrides settings.source.token)")
+    p.add_argument("--local-url", help="Local Outline API URL (overrides settings.local.url)")
+    p.add_argument("--local-token", help="Local API token (overrides settings.local.token)")
     p.add_argument(
         "-v", "--verbose",
         action="count",
@@ -772,6 +904,7 @@ def main() -> None:
     settings = load_settings(args.settings)
     source = settings.get("source", {})
+    local = settings.get("local", {})
     url = args.url or source.get("url")
     token = args.token or source.get("token")
@@ -795,6 +928,16 @@ def main() -> None:
         ok = sync.cmd_pull()
     elif args.command == "push":
         ok = sync.cmd_push()
+    elif args.command == "compare":
+        local_url = args.local_url or local.get("url")
+        local_token = args.local_token or local.get("token")
+        if not local_url or not local_token:
+            logger.error(
+                "Missing local API URL or token — set local.url and local.token "
+                "in settings.json, or pass --local-url / --local-token."
+            )
+            sys.exit(1)
+        ok = sync.cmd_compare(local_url, local_token)
     else:
         ok = False
     sys.exit(0 if ok else 1)