outline-sync/IMPORT_SCRIPT.MD
Claude d9161f64f5 Initial commit: Export tools and import script requirements
- export_with_trees.sh: Bash wrapper for Outline export
- outline_export_fixed.py: Python export implementation
- IMPORT_SCRIPT.MD: PRD for import script (to be built)
- RALPH_PROMPT.md: Ralph Loop prompt for building import script
- CLAUDE.md: Project documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 22:33:55 +01:00

Outline Import Script - Product Requirements Document

Document Version: 1.0
Created: 2026-01-17
Last Updated: 2026-01-19
Status: Draft


1. Executive Summary

Create import_to_outline.sh, a companion script to the existing export tool that imports markdown files back into Outline. The script restores documents with their full hierarchy using the metadata preserved during export, enabling disaster recovery, migration between Outline instances, and content restoration workflows.


2. Problem Statement

Current State

  • Export functionality exists via export_with_trees.sh and outline_export_fixed.py
  • Exports include markdown content and _collection_metadata.json with full hierarchy
  • No automated way to restore or migrate exported content back into Outline

Pain Points

  1. Disaster Recovery: Manual recreation of collections and documents after data loss
  2. Migration: No tooling to move content between Outline instances
  3. Restore Workflow: Cannot selectively restore deleted documents or collections
  4. Testing: No way to verify export integrity via round-trip import

Business Impact

  • Hours of manual work to rebuild wiki structure after incidents
  • Risk of hierarchy/relationship loss during manual restoration
  • No confidence in backup validity without restore testing

3. Goals & Success Criteria

Primary Goals

  1. Restore exported markdown files to Outline with preserved hierarchy
  2. Support both full restore and selective import workflows
  3. Provide clear feedback on import progress and results

Success Criteria

| Metric | Target |
|--------|--------|
| Document import success rate | ≥99% |
| Hierarchy accuracy | 100% parent-child relationships preserved |
| Performance | ≥10 documents/second |
| Dry-run accuracy | 100% match between preview and actual import |

Non-Goals

  • Image/attachment import (future enhancement)
  • Conflict resolution with existing content (skip or fail)
  • Real-time sync between instances
  • User/permission migration

4. User Stories

US-1: Disaster Recovery

As an administrator, I want to restore all collections from a backup so that I can recover from data loss.

Acceptance Criteria:

  • Import all collections from outline_export/ directory
  • Recreate exact hierarchy as shown in metadata
  • Report success/failure summary

US-2: Selective Restoration

As a user, I want to import a single collection so that I can restore specific content without affecting other data.

Acceptance Criteria:

  • Specify source directory containing single collection
  • Create collection if it doesn't exist
  • Skip import if collection already exists (configurable)

US-3: Migration to New Instance

As an administrator, I want to import all content into a fresh Outline instance so that I can migrate to new infrastructure.

Acceptance Criteria:

  • Works against empty Outline instance
  • Creates all collections and documents
  • Preserves document nesting structure

US-4: Safe Preview

As a user, I want to preview what will be imported so that I can verify before making changes.

Acceptance Criteria:

  • --dry-run flag shows all planned operations
  • No API calls that modify data during dry run
  • Output matches actual import behavior

US-5: Consolidated Import

As a user, I want to import multiple collections into a single new collection so that I can reorganize content during import.

Acceptance Criteria:

  • --single mode creates timestamped collection
  • Original collection names become top-level documents
  • All nested hierarchy preserved under these parents

5. Functional Requirements

5.1 Import Modes

Mode 1: Collection-per-Folder (Default)

outline_export/
├── Bewerbungen/     → Creates "Bewerbungen" collection
├── Projekte/        → Creates "Projekte" collection
└── Privat/          → Creates "Privat" collection

Behavior:

  • Each subdirectory in source becomes a separate collection
  • Collection names match folder names exactly
  • If collection exists: skip the entire collection (default), fail with an error, or delete and recreate it with --force (see 5.4)

Mode 2: Single Collection (--single)

outline_export/
├── Bewerbungen/     → Becomes parent doc "Bewerbungen"
├── Projekte/        → Becomes parent doc "Projekte"
└── Privat/          → Becomes parent doc "Privat"

All imported into: "import_20260119_143052" collection

Behavior:

  • Creates one collection named import_YYYYMMDD_HHMMSS
  • Each original collection folder becomes a top-level parent document
  • Original document hierarchy nested under these parents
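
The import_YYYYMMDD_HHMMSS naming rule can be sketched in a few lines. This is a minimal illustration, not the final implementation; the function name single_collection_name is hypothetical, and local time is assumed because the PRD does not specify a time zone.

```python
from datetime import datetime
from typing import Optional


def single_collection_name(now: Optional[datetime] = None) -> str:
    """Return a collection name in the import_YYYYMMDD_HHMMSS format."""
    # Assumption: local time is used; the PRD does not name a time zone.
    moment = now if now is not None else datetime.now()
    return moment.strftime("import_%Y%m%d_%H%M%S")


# Reproduces the example name used in this section.
print(single_collection_name(datetime(2026, 1, 19, 14, 30, 52)))
# prints "import_20260119_143052"
```

Accepting the timestamp as a parameter keeps the function testable without patching the clock.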

5.2 Command-Line Interface

./import_to_outline.sh [OPTIONS]

Options:
  -s, --single           Import all into single timestamped collection
  -n, --dry-run          Preview operations without making changes
  -d, --source DIR       Source directory (default: outline_export)
  -v, --verbose          Increase output verbosity (-vv for debug)
  -f, --force            Overwrite existing collections (instead of skip)
  --settings FILE        Path to settings file (default: settings.json)
  -h, --help             Show help message
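
On the Python side, the options above could be parsed with argparse roughly as follows. This is a sketch under the assumption that outline_import.py accepts the same flags as the bash wrapper; build_parser is a hypothetical helper name. The -h/--help option is provided by argparse automatically.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options listed in section 5.2."""
    parser = argparse.ArgumentParser(
        prog="outline_import.py",
        description="Import exported markdown files back into Outline.",
    )
    parser.add_argument("-s", "--single", action="store_true",
                        help="Import all into single timestamped collection")
    parser.add_argument("-n", "--dry-run", action="store_true",
                        help="Preview operations without making changes")
    parser.add_argument("-d", "--source", metavar="DIR", default="outline_export",
                        help="Source directory (default: outline_export)")
    # action="count" turns -v into 1 and -vv into 2.
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="Increase output verbosity (-vv for debug)")
    parser.add_argument("-f", "--force", action="store_true",
                        help="Overwrite existing collections (instead of skip)")
    parser.add_argument("--settings", metavar="FILE", default="settings.json",
                        help="Path to settings file (default: settings.json)")
    return parser


args = build_parser().parse_args(["--dry-run", "-vv", "-d", "backup"])
print(args.dry_run, args.verbose, args.source)  # prints "True 2 backup"
```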

5.3 Document Creation Logic

Hierarchy Reconstruction Algorithm

1. Load _collection_metadata.json
2. Build document tree from `documents` array (using parent_id)
3. Topological sort: ensure parents created before children
4. For each document in sorted order:
   a. Read markdown content from file
   b. Map old parent_id → new parent_id (from creation responses)
   c. Create document via API with parentDocumentId
   d. Store id mapping: old_id → new_id
5. Verify: created count matches expected count
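
Steps 2 and 3 can be sketched as a breadth-first walk from the root documents, which guarantees every parent is emitted before its children. The sketch assumes a flat list of records with id and parent_id keys; creation_order is a hypothetical name. A parent_id that does not appear in the export is treated as root level, matching the "Parent document not found" rule in 5.5.

```python
from collections import deque
from typing import Dict, List, Optional


def creation_order(documents: List[dict]) -> List[dict]:
    """Order flat document records so each parent precedes its children."""
    known_ids = {doc["id"] for doc in documents}
    children: Dict[Optional[str], List[dict]] = {}
    for doc in documents:
        parent = doc.get("parent_id")
        # Unknown parents are grouped under None, i.e. treated as root level.
        key = parent if parent in known_ids else None
        children.setdefault(key, []).append(doc)

    ordered: List[dict] = []
    queue = deque(children.get(None, []))
    while queue:
        doc = queue.popleft()
        ordered.append(doc)
        queue.extend(children.get(doc["id"], []))
    return ordered


docs = [
    {"id": "c", "parent_id": "b"},
    {"id": "b", "parent_id": "a"},
    {"id": "a", "parent_id": None},
]
print([d["id"] for d in creation_order(docs)])  # prints ['a', 'b', 'c']
```

Records that form a parent cycle are never reached from a root, so the result would be shorter than the input; the count check in step 5 would surface that case.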

ID Mapping Example

Export metadata:
  doc_A (id: abc-123, parent_id: null)
  doc_B (id: def-456, parent_id: abc-123)

After creating doc_A:
  id_map = { "abc-123": "new-789" }

Creating doc_B:
  parent_id = id_map["abc-123"] = "new-789"
  API call: create doc_B with parentDocumentId: "new-789"
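
The mapping step can be illustrated with a stand-in for the API call. In the sketch below, import_documents and fake_create are hypothetical names, and fake_create merely simulates the identifier that a real documents.create response would supply. The input is assumed to be sorted so that parents come first.

```python
from typing import Callable, Dict, List, Optional


def import_documents(
    ordered_docs: List[dict],
    create_document: Callable[[str, Optional[str]], str],
) -> Dict[str, str]:
    """Create documents in order and return the old_id -> new_id mapping."""
    id_map: Dict[str, str] = {}
    for doc in ordered_docs:
        old_parent = doc.get("parent_id")
        # Translate the exported parent id into the id assigned on import.
        new_parent = id_map.get(old_parent) if old_parent else None
        id_map[doc["id"]] = create_document(doc["title"], new_parent)
    return id_map


calls = []


def fake_create(title: str, parent_id: Optional[str]) -> str:
    """Stand-in for the API call; returns a made-up new id."""
    calls.append((title, parent_id))
    return f"new-{len(calls)}"


docs = [
    {"id": "abc-123", "title": "doc_A", "parent_id": None},
    {"id": "def-456", "title": "doc_B", "parent_id": "abc-123"},
]
id_map = import_documents(docs, fake_create)
print(id_map)  # prints {'abc-123': 'new-1', 'def-456': 'new-2'}
print(calls)   # prints [('doc_A', None), ('doc_B', 'new-1')]
```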

5.4 Duplicate Handling

| Scenario | Default Behavior | With --force |
|----------|------------------|--------------|
| Collection exists | Skip entire collection | Delete and recreate |
| Document title exists in collection | Skip document | Update document |
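
The duplicate-handling rules reduce to two small decision functions. The sketch below is illustrative only; the function names and the returned action strings are hypothetical.

```python
def collection_action(exists: bool, force: bool) -> str:
    """Decide what to do with a collection, following the rules in 5.4."""
    if not exists:
        return "create"
    return "delete_and_recreate" if force else "skip"


def document_action(title_exists: bool, force: bool) -> str:
    """Decide what to do with a document whose title may already exist."""
    if not title_exists:
        return "create"
    return "update" if force else "skip"


print(collection_action(exists=True, force=False))   # prints "skip"
print(document_action(title_exists=True, force=True))  # prints "update"
```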

5.5 Error Handling

| Error Type | Behavior |
|------------|----------|
| API connection failure | Abort with error message |
| Collection creation fails | Abort that collection, continue others |
| Document creation fails | Log error, skip its children, continue with siblings |
| Missing markdown file | Log warning, skip document |
| Invalid metadata JSON | Abort that collection |
| Parent document not found in metadata | Create as root-level document |

6. Technical Design

6.1 Architecture

┌─────────────────────────────────────────────────────────────┐
│                    import_to_outline.sh                      │
│  (Bash wrapper - Docker execution, backup, UI)              │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                    outline_import.py                         │
│  (Python core - API calls, hierarchy logic)                 │
└─────────────────────────┬───────────────────────────────────┘
                          │
            ┌─────────────┼─────────────┐
            ▼             ▼             ▼
    ┌──────────────┐ ┌──────────┐ ┌──────────────┐
    │ settings.json│ │ metadata │ │ Outline API  │
    │ (API config) │ │  .json   │ │ (HTTP POST)  │
    └──────────────┘ └──────────┘ └──────────────┘

6.2 API Endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /api/collections.list | POST | Check existing collections |
| /api/collections.create | POST | Create new collection |
| /api/collections.delete | POST | Delete collection (--force mode) |
| /api/documents.create | POST | Create document with content |
| /api/documents.list | POST | Check existing documents |
| /api/documents.update | POST | Update document (--force mode) |

6.3 API Request Examples

Create Collection

POST /api/collections.create
{
  "name": "Bewerbungen",
  "permission": "read_write"
}

Create Document

POST /api/documents.create
{
  "collectionId": "col-uuid-here",
  "title": "DORA Metrics (Top 4)",
  "text": "# DORA Metrics\n\nContent here...",
  "parentDocumentId": "parent-uuid-or-null",
  "publish": true
}
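
A request like the ones above could be assembled as follows. This sketch uses the standard library's urllib.request so that it runs without extra packages, whereas the Docker command in 6.5 installs requests. It assumes the token is sent as a Bearer token in the Authorization header, and build_request is a hypothetical helper name. Nothing is sent over the network here.

```python
import json
import urllib.request


def build_request(base_url: str, token: str, endpoint: str,
                  payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for an Outline API endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API token is passed as a Bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


req = build_request(
    "http://outline:3000",
    "ol_api_xxxxxxxxxxxx",
    "documents.create",
    {
        "collectionId": "col-uuid-here",
        "title": "DORA Metrics (Top 4)",
        "text": "# DORA Metrics\n\nContent here...",
        "parentDocumentId": None,  # None serializes to JSON null (root level)
        "publish": True,
    },
)
print(req.get_method(), req.full_url)
# prints "POST http://outline:3000/api/documents.create"
```

Sending the request would be a matter of passing it to urllib.request.urlopen with the configured timeout.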

6.4 Data Structures

Input: _collection_metadata.json

{
  "id": "original-collection-uuid",
  "name": "Bewerbungen",
  "directory": "Bewerbungen",
  "expected_count": 11,
  "documents": [
    {
      "id": "doc-uuid",
      "title": "Document Title",
      "filename": "Document Title.md",
      "parent_id": "parent-uuid-or-null",
      "checksum": "sha256-hash",
      "children": [...]
    }
  ]
}
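
Because each record carries both a parent_id and a nested children array, the tree can be flattened into a single list before import. The sketch below assumes that child entries have the same shape as top-level entries and that expected_count covers nested documents too; iter_documents is a hypothetical name.

```python
from typing import Iterator, List


def iter_documents(nodes: List[dict]) -> Iterator[dict]:
    """Yield every document record depth-first, parents before children."""
    for node in nodes:
        yield node
        # Assumption: children use the same record shape as their parent.
        yield from iter_documents(node.get("children", []))


metadata = {
    "name": "Bewerbungen",
    "expected_count": 3,
    "documents": [
        {"id": "a", "title": "Tipico", "parent_id": None, "children": [
            {"id": "b", "title": "Pitch Tipico", "parent_id": "a", "children": []},
        ]},
        {"id": "c", "title": "CV", "parent_id": None, "children": []},
    ],
}

flat = list(iter_documents(metadata["documents"]))
print([doc["title"] for doc in flat])           # prints ['Tipico', 'Pitch Tipico', 'CV']
print(len(flat) == metadata["expected_count"])  # prints True
```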

Runtime: ID Mapping

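from typing import Dict

# Maps each exported document id to the id assigned by the target instance.
id_map: Dict[str, str] = {
    "old-uuid-1": "new-uuid-1",
    "old-uuid-2": "new-uuid-2"
}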

6.5 Docker Execution

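# CLI_ARGS is assumed to be a bash array, e.g. CLI_ARGS=(--dry-run --source "my export"),
# so that arguments containing spaces reach the Python script intact.
docker run --rm --network domnet \
    --user "$(id -u):$(id -g)" \
    -e HOME=/tmp \
    -v "$WORK_DIR:/work" \
    -w /work \
    python:3.11-slim \
    bash -c 'pip install -qqq requests 2>/dev/null && \
             python3 outline_import.py "$@"' _ "${CLI_ARGS[@]}"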

7. User Interface

7.1 Progress Output

════════════════════════════════════════════════════════════
  OUTLINE IMPORT
════════════════════════════════════════════════════════════

Source:  outline_export/
Target:  http://outline:3000
Mode:    Collection per folder

Checking API connectivity... ✓

Bewerbungen/ (11 documents)
  Creating collection... ✓ (id: 7f3a...)
  ├── CV.md                              ✓ created
  ├── DORA Metrics (Top 4).md            ✓ created
  ├── Tipico.md                          ✓ created
  │   ├── Pitch Tipico.md                ✓ created
  │   ├── Fragen 3. Runde.md             ✓ created
  │   ├── Tipico 3rd Party.md            ✓ created
  │   └── Tipico Top 10 Functions.md     ✓ created
  └── Ihre PVS.md                        ✓ created
      ├── Mobilepass.md                  ✓ created
      ├── PVS erster Call.md             ✓ created
      └── Fragen Dirk.md                 ✓ created

Projekte/ (8 documents)
  Collection exists, skipping...

════════════════════════════════════════════════════════════
SUMMARY
════════════════════════════════════════════════════════════
  Collections:  1 created, 1 skipped, 0 errors
  Documents:   11 created, 0 skipped, 0 errors
  Duration:     2.3 seconds
════════════════════════════════════════════════════════════
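
The tree portion of this output could be produced by a small recursive renderer. The sketch below covers only the branch prefixes, not the status column; it assumes nested records with filename and children keys, and render_tree is a hypothetical name.

```python
from typing import List


def render_tree(nodes: List[dict], prefix: str = "") -> List[str]:
    """Return one line per document, drawn with box-drawing branch prefixes."""
    lines: List[str] = []
    for index, node in enumerate(nodes):
        is_last = index == len(nodes) - 1
        branch = "└── " if is_last else "├── "
        lines.append(f"{prefix}{branch}{node['filename']}")
        # Children of the last node get blank padding, others a vertical bar.
        child_prefix = prefix + ("    " if is_last else "│   ")
        lines.extend(render_tree(node.get("children", []), child_prefix))
    return lines


tree = [
    {"filename": "CV.md", "children": []},
    {"filename": "Tipico.md", "children": [
        {"filename": "Pitch Tipico.md", "children": []},
        {"filename": "Fragen 3. Runde.md", "children": []},
    ]},
]
print("\n".join(render_tree(tree)))
```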

7.2 Dry-Run Output

════════════════════════════════════════════════════════════
  OUTLINE IMPORT (DRY RUN)
════════════════════════════════════════════════════════════

Source:  outline_export/
Target:  http://outline:3000
Mode:    Collection per folder

[DRY RUN] No changes will be made

Bewerbungen/ (11 documents)
  [DRY RUN] Would create collection "Bewerbungen"
  [DRY RUN] Would create 11 documents:
    ├── CV.md
    ├── DORA Metrics (Top 4).md
    ├── Tipico.md
    │   ├── Pitch Tipico.md
    │   └── ...
    └── ...

Projekte/ (8 documents)
  [DRY RUN] Collection exists - would skip

════════════════════════════════════════════════════════════
DRY RUN SUMMARY
════════════════════════════════════════════════════════════
  Would create:  1 collection, 11 documents
  Would skip:    1 collection (exists)
════════════════════════════════════════════════════════════
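
One way to keep the preview identical to the real run is to execute the same import logic against a client that records mutating calls instead of sending them. The sketch below illustrates that idea; DryRunClient, its post method, and the placeholder response shape are assumptions, not part of the existing tooling. Read-only calls such as collections.list are still forwarded so that "exists" checks remain accurate.

```python
from typing import Callable, List, Tuple

# Endpoints from section 6.2 that modify data.
MUTATING = {
    "collections.create", "collections.delete",
    "documents.create", "documents.update",
}


class DryRunClient:
    """Records mutating API calls instead of sending them."""

    def __init__(self, real_post: Callable[[str, dict], dict]) -> None:
        self._real_post = real_post
        self.planned: List[Tuple[str, dict]] = []

    def post(self, endpoint: str, payload: dict) -> dict:
        if endpoint not in MUTATING:
            return self._real_post(endpoint, payload)  # read-only: pass through
        self.planned.append((endpoint, payload))
        # Placeholder id so later steps (e.g. parent mapping) can continue.
        return {"data": {"id": f"dry-run-{len(self.planned)}"}}


forwarded = []


def fake_real_post(endpoint: str, payload: dict) -> dict:
    """Stand-in for the real HTTP call, used only for this illustration."""
    forwarded.append(endpoint)
    return {"data": []}


client = DryRunClient(fake_real_post)
client.post("collections.list", {})
result = client.post("collections.create", {"name": "Bewerbungen"})
print(forwarded)       # prints ['collections.list']
print(client.planned)  # prints [('collections.create', {'name': 'Bewerbungen'})]
```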

7.3 Error Output

Bewerbungen/ (11 documents)
  Creating collection... ✓
  ├── CV.md                              ✓ created
  ├── Missing Doc.md                     ✗ file not found
  └── Tipico.md                          ✗ API error: 500
      └── (children skipped due to parent failure)
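
The behavior shown above, where a failed parent causes its children to be skipped while siblings continue, can be sketched by tracking which exported ids are unavailable. The names import_with_failures and flaky_create are hypothetical, and the input is assumed to be ordered with parents first.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def import_with_failures(
    ordered_docs: List[dict],
    create_document: Callable[[dict, Optional[str]], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Create documents, skipping descendants of any document that failed."""
    id_map: Dict[str, str] = {}
    results: Dict[str, str] = {}
    unavailable: Set[str] = set()  # old ids that failed or were skipped
    for doc in ordered_docs:
        parent = doc.get("parent_id")
        if parent in unavailable:
            unavailable.add(doc["id"])
            results[doc["filename"]] = "skipped (parent failure)"
            continue
        try:
            id_map[doc["id"]] = create_document(doc, id_map.get(parent))
            results[doc["filename"]] = "created"
        except Exception as exc:  # broad on purpose: log and continue with siblings
            unavailable.add(doc["id"])
            results[doc["filename"]] = f"error: {exc}"
    return id_map, results


def flaky_create(doc: dict, parent_id: Optional[str]) -> str:
    """Stand-in for the API call that fails for one specific document."""
    if doc["filename"] == "Tipico.md":
        raise RuntimeError("API error: 500")
    return "new-" + doc["id"]


docs = [
    {"id": "1", "filename": "CV.md", "parent_id": None},
    {"id": "2", "filename": "Tipico.md", "parent_id": None},
    {"id": "3", "filename": "Pitch Tipico.md", "parent_id": "2"},
]
id_map, results = import_with_failures(docs, flaky_create)
for name, status in results.items():
    print(f"{name}: {status}")
```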

8. Configuration

8.1 settings.json (shared with export)

{
  "source": {
    "url": "http://outline:3000",
    "token": "ol_api_xxxxxxxxxxxx"
  },
  "import": {
    "source_directory": "outline_export",
    "on_collection_exists": "skip",
    "on_document_exists": "skip",
    "default_permission": "read_write"
  },
  "advanced": {
    "request_timeout": 30,
    "retry_attempts": 3,
    "retry_delay": 1.0,
    "rate_limit_delay": 0.1
  }
}
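
Since the file is shared with the export tool, an existing settings.json may not yet contain the import or advanced sections. A loader could layer the user's values over built-in defaults, as in the sketch below; merge_settings and DEFAULTS are hypothetical names, and the default values are copied from the example above.

```python
import json
from copy import deepcopy

DEFAULTS = {
    "import": {
        "source_directory": "outline_export",
        "on_collection_exists": "skip",
        "on_document_exists": "skip",
        "default_permission": "read_write",
    },
    "advanced": {
        "request_timeout": 30,
        "retry_attempts": 3,
        "retry_delay": 1.0,
        "rate_limit_delay": 0.1,
    },
}


def merge_settings(user: dict, defaults: dict = DEFAULTS) -> dict:
    """Return defaults overlaid with user values, one section at a time."""
    merged = deepcopy(defaults)  # never mutate the shared defaults
    for section, values in user.items():
        if isinstance(values, dict) and isinstance(merged.get(section), dict):
            merged[section].update(values)
        else:
            merged[section] = values
    return merged


raw = '{"source": {"url": "http://outline:3000"}, "advanced": {"retry_attempts": 5}}'
settings = merge_settings(json.loads(raw))
print(settings["advanced"]["retry_attempts"])     # prints 5
print(settings["advanced"]["request_timeout"])    # prints 30
print(settings["import"]["on_collection_exists"])  # prints skip
```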

8.2 Configuration Options

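| Option | Type | Default | Description |
|--------|------|---------|-------------|
| import.source_directory | string | outline_export | Default source path |
| import.on_collection_exists | enum | skip | skip, error, merge (merge is a planned enhancement, see Section 11) |
| import.on_document_exists | enum | skip | skip, error, update |
| import.default_permission | enum | read_write | read, read_write |
| advanced.request_timeout | int | 30 | API timeout in seconds |
| advanced.retry_attempts | int | 3 | Retries on failure |
| advanced.retry_delay | float | 1.0 | Delay between retries |
| advanced.rate_limit_delay | float | 0.1 | Delay between API calls |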
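
The retry options might be applied with a small wrapper like the one below. It is a sketch under two assumptions: retry_attempts counts retries after the first attempt, and only network-level errors (OSError and its subclasses, such as ConnectionError) are retried. The names with_retries and flaky are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_attempts: int = 3,
    retry_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying up to retry_attempts times on network errors."""
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be >= 0")
    for attempt in range(retry_attempts + 1):
        try:
            return call()
        except OSError:
            if attempt == retry_attempts:
                raise  # out of retries: surface the last error
            sleep(retry_delay)
    raise AssertionError("unreachable")


state = {"calls": 0}


def flaky() -> str:
    """Stand-in for an API call that fails twice before succeeding."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("connection refused")
    return "ok"


slept = []
outcome = with_retries(flaky, retry_attempts=3, retry_delay=0.5, sleep=slept.append)
print(outcome, slept)  # prints "ok [0.5, 0.5]"
```

Injecting the sleep function keeps the wrapper fast to test.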

9. Testing Strategy

9.1 Test Cases

| ID | Category | Test Case | Expected Result |
|----|----------|-----------|-----------------|
| T1 | Happy Path | Import single collection | Collection + docs created |
| T2 | Happy Path | Import multiple collections | All collections created |
| T3 | Happy Path | Import nested hierarchy (3+ levels) | All parent-child relationships preserved |
| T4 | Duplicate | Collection already exists | Skip collection |
| T5 | Duplicate | Document title exists | Skip document |
| T6 | Error | Missing markdown file | Log warning, continue |
| T7 | Error | Invalid metadata JSON | Abort collection |
| T8 | Error | API unreachable | Abort with clear error |
| T9 | Mode | --single flag | Single timestamped collection |
| T10 | Mode | --dry-run flag | No API mutations |
| T11 | Mode | --force flag | Overwrites existing |
| T12 | Edge | Empty collection | Create empty collection |
| T13 | Edge | Special chars in title | Handled correctly |
| T14 | Edge | Very large document | Imported successfully |

9.2 Verification Methods

  1. Round-trip Test: Export → Import to test instance → Export again → Compare checksums
  2. API Verification: Query created documents and verify parent relationships
  3. Manual Inspection: Spot-check imported content in Outline UI
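
The checksum comparison in the round-trip test could look like the sketch below. It assumes the checksum field in the metadata is a hex SHA-256 digest of the markdown file's bytes, which this document does not confirm; sha256_of_file and changed_documents are hypothetical names, and the "aaa"/"bbb" values are placeholders rather than real digests.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_documents(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List filenames whose checksum differs or that exist on one side only."""
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "CV.md"
    sample.write_bytes(b"# CV\n")
    first_export = {"CV.md": sha256_of_file(sample), "Tipico.md": "aaa"}
    second_export = {"CV.md": sha256_of_file(sample), "Tipico.md": "bbb"}

print(changed_documents(first_export, second_export))  # prints ['Tipico.md']
```

An empty result would indicate that the second export matches the first for every document.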

10. Rollback & Recovery

Pre-Import Safety

  • No automatic backup (user should have export as backup)
  • --dry-run always available to preview

Rollback Procedure

If import fails partway through:

  1. Note which collections were created (from import log)
  2. Manually delete partial collections via Outline UI or API
  3. Fix issue and re-run import

Future Enhancement

  • --backup flag to export existing content before import
  • Transaction-like behavior: delete partially-imported collection on failure

11. Future Enhancements

| Priority | Enhancement | Description |
|----------|-------------|-------------|
| P1 | Attachment support | Import images and file attachments |
| P1 | Merge mode | Add documents to existing collections |
| P2 | Selective import | Import specific documents by path/pattern |
| P2 | Update mode | Update existing documents with new content |
| P3 | User mapping | Preserve authorship via user email mapping |
| P3 | Permission sync | Restore document-level permissions |
| P3 | Incremental import | Only import new/changed documents |

12. Implementation Checklist

  • Create outline_import.py with core import logic
  • Create import_to_outline.sh bash wrapper
  • Implement collection creation
  • Implement document creation with hierarchy
  • Implement ID mapping for parent references
  • Add --dry-run mode
  • Add --single mode
  • Add --force mode
  • Add progress visualization
  • Add error handling and reporting
  • Add retry logic for API failures
  • Update settings.json schema
  • Write tests
  • Update CLAUDE.md documentation

13. References

  • Export Script: export_with_trees.sh, outline_export_fixed.py
  • Outline API Docs: https://www.getoutline.com/developers
  • Metadata Format: See outline_export/*/_collection_metadata.json
  • Settings Format: See settings.json