AGENTS.md
This file provides guidance to AI agents (Claude, GPT-4, etc.) when working with this Outline export/import tool repository.
For all technical details, architecture info, common pitfalls, and code patterns, see this file.
Quick Reference
Primary Language: Python 3.11 + Bash
Key Dependencies: requests, tqdm
Runtime: Docker containers on domnet network
API Base: http://outline:3000 (internal, bypasses SSO)
Key Features:
- Export all collections with full document hierarchy
- Import back to Outline preserving structure
- Automatic backups with 90%+ compression
- Dry-run mode for safe testing
- Retry logic for API reliability
Usage
Export (Backup)
# Run the export with tree visualization
./export_with_trees.sh
# Preview without exporting (dry run)
./export_with_trees.sh --dry-run
# Run with verbose output
./export_with_trees.sh -v
Export CLI Options:
--dry-run, -n Preview what would be exported without writing files
--output, -o DIR Output directory (overrides settings.json)
--verbose, -v Increase verbosity (-vv for debug)
--skip-verify Skip post-export verification
--skip-health-check Skip pre-export health check
--settings FILE Path to settings file (default: settings.json)
Import (Restore)
# Import all collections from outline_export/
./import_to_outline.sh
# Preview what would be imported (no changes made)
./import_to_outline.sh --dry-run
# Import into a single timestamped collection
./import_to_outline.sh --single
# Import from a different directory
./import_to_outline.sh -d exports/
# Overwrite existing collections
./import_to_outline.sh --force
Import CLI Options:
-s, --single Import all into single timestamped collection
-n, --dry-run Preview operations without making changes
-d, --source DIR Source directory (default: outline_export)
-v, --verbose Increase verbosity (-vv for debug)
-f, --force Overwrite existing collections (instead of skip)
--settings FILE Path to settings file (default: settings.json)
-h, --help Show help message
Running Python Scripts Directly
If you need to run the Python scripts directly (e.g., for debugging):
# Export
docker run --rm --network domnet \
-v "$(pwd):/work" \
-w /work \
python:3.11-slim \
bash -c "pip install -q requests tqdm && python3 outline_export_fixed.py --dry-run"
# Import
docker run --rm --network domnet \
-v "$(pwd):/work" \
-w /work \
python:3.11-slim \
bash -c "pip install -q requests tqdm && python3 outline_import.py --dry-run"
Note: The shell wrappers (export_with_trees.sh, import_to_outline.sh) provide better UX with tree visualization and colored output.
Agent Operating Guidelines
1. Configuration
Settings are in settings.json:
{
  "source": {
    "url": "http://outline:3000",
    "token": "your-api-token-here"
  },
  "export": {
    "output_directory": "outline_export"
  },
  "advanced": {
    "max_hierarchy_depth": 100
  }
}
Important: settings.json contains secrets (API token) and should never be committed to git.
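A minimal sketch of loading this file and failing early on missing credentials. The function name load_settings is hypothetical and not taken from the repository; the keys and default values mirror the example above.

```python
import json
from pathlib import Path


def load_settings(path="settings.json"):
    """Load settings and fail early if the API URL or token is missing.

    Hypothetical helper: the real scripts may structure this differently.
    """
    settings = json.loads(Path(path).read_text(encoding="utf-8"))
    source = settings.get("source", {})
    for key in ("url", "token"):
        if not source.get(key):
            raise ValueError(f"settings file is missing source.{key}")
    # Fall back to the values shown in the example when sections are absent.
    settings.setdefault("export", {}).setdefault("output_directory", "outline_export")
    settings.setdefault("advanced", {}).setdefault("max_hierarchy_depth", 100)
    return settings
```

Validating up front keeps a missing token from surfacing later as an opaque 401 from the API.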
2. Architecture Understanding
This tool operates in a Docker-isolated environment to bypass Authentik SSO:
- All Python scripts run inside ephemeral Docker containers
- Network: domnet bridge allows direct access to Outline's internal API
- No persistent container state - dependencies installed on each run
Critical Context:
- The http://outline:3000 URL only works inside the Docker network
- External access would require SSO authentication through Authentik
- This design is intentional for automated backup/restore operations
Export Flow
- Health Check: Verify API connectivity
- Fetch Collections: Via /api/collections.list
- Build Tree: Get navigation tree via /api/collections.documents (source of truth for hierarchy)
- Fetch Content: Full document content via /api/documents.info (with caching)
- Export Recursively: Maintain parent-child structure
- Save Metadata: _collection_metadata.json per collection
- Create Backup: Archive previous export to outline_backup_*.tar.gz
- Verify: Generate manifest with checksums
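The final verification step can be sketched as follows. This is an illustration only: the actual layout of manifest.json is not documented here, so the flat path-to-SHA-256 mapping and the function names build_manifest and write_manifest are assumptions.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(export_dir):
    """Map each exported file (relative POSIX path) to its SHA-256 checksum."""
    root = Path(export_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        # Skip directories and the manifest itself.
        if path.is_file() and path.name != "manifest.json":
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def write_manifest(export_dir):
    """Write the manifest next to the exported collections (assumed location)."""
    manifest = build_manifest(export_dir)
    target = Path(export_dir) / "manifest.json"
    target.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```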
Import Flow
- Health Check: Verify API connectivity
- Load Metadata: Read _collection_metadata.json from each collection directory
- Build Tree: Reconstruct document hierarchy from metadata
- Create Collections: Via /api/collections.create
- Create Documents: Via /api/documents.create with proper parentDocumentId
- Map IDs: Track old IDs → new IDs to maintain hierarchy
- Display Progress: Tree-style output with status indicators
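The ID-mapping step is the part most easily broken, so here is a small sketch of the idea. It assumes each node is a dict with "id", "title", and optional "children"; create_document is a hypothetical callable standing in for the /api/documents.create request and is not the repository's actual function.

```python
def import_tree(nodes, create_document, id_mapping=None, old_parent_id=None):
    """Create documents depth-first, translating old parent IDs to new ones.

    create_document(title, parent_id) is a hypothetical callable that
    creates one document and returns its new ID.
    """
    if id_mapping is None:
        id_mapping = {}
    for node in nodes:
        # Parents are created before their children, so the lookup succeeds
        # for nested nodes. Root nodes (or unknown parents) resolve to None,
        # which places the document at the root level.
        new_parent_id = id_mapping.get(old_parent_id)
        new_id = create_document(node["title"], new_parent_id)
        id_mapping[node["id"]] = new_id
        import_tree(node.get("children", []), create_document, id_mapping, node["id"])
    return id_mapping
```

Because the mapping is filled in creation order, a child never sees its old parent ID sent to the API; it always receives the ID the new instance assigned.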
Core Component Pipelines
Export Pipeline:
export_with_trees.sh → Docker container → outline_export_fixed.py
↓
Fetches collections → Builds document tree → Exports markdown + metadata
↓
Creates backup → Verifies integrity → Displays summary
Import Pipeline:
import_to_outline.sh → Docker container → outline_import.py
↓
Reads metadata → Validates structure → Creates collections
↓
Uploads documents → Maintains hierarchy → Reports status
3. Import Modes
By default, each subdirectory becomes a separate collection:
outline_export/
├── Bewerbungen/ → Creates "Bewerbungen" collection
├── Projekte/ → Creates "Projekte" collection
└── Privat/ → Creates "Privat" collection
Single Collection (--single)
All content goes into one timestamped collection:
outline_export/
├── Bewerbungen/ → Becomes parent doc "Bewerbungen"
├── Projekte/ → Becomes parent doc "Projekte"
└── Privat/ → Becomes parent doc "Privat"
All imported into: "import_20260119_143052" collection
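The collection name in the example looks like import_ followed by a YYYYMMDD_HHMMSS timestamp. A sketch of generating such a name, with the format inferred from that single example and the function name single_collection_name being hypothetical:

```python
from datetime import datetime


def single_collection_name(now=None):
    """Build a name like import_20260119_143052 (format inferred from the example)."""
    moment = now or datetime.now()
    return f"import_{moment:%Y%m%d_%H%M%S}"
```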
4. Behavior & Duplicate Handling
Duplicate Handling
| Scenario | Default Behavior | With --force |
|---|---|---|
| Collection exists | Skip entire collection | Delete and recreate |
| Document exists | Skip document | Update document |
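The table can be read as a small decision function. The sketch below restates it in code; the function name resolve_action and the action labels are illustrative and not taken from outline_import.py.

```python
def resolve_action(kind, exists, force=False):
    """Return the import action for a "collection" or "document".

    Mirrors the duplicate-handling table; labels are illustrative only.
    """
    if kind not in ("collection", "document"):
        raise ValueError(f"unknown kind: {kind!r}")
    if not exists:
        return "create"
    if not force:
        return "skip"
    # With --force, collections are replaced wholesale; documents are updated.
    return "delete_and_recreate" if kind == "collection" else "update"
```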
Error Handling
Import Errors:
- API connection failure: Abort with error message
- Collection creation fails: Abort that collection, continue others
- Document creation fails: Log error, continue with siblings
- Missing markdown file: Log warning, skip document
- Parent not found: Create as root-level document
Export Errors:
- API connection failure: Abort before starting
- Collection fetch fails: Skip that collection, continue
- Document fetch fails: Retry 3x with backoff, then skip
- Disk write fails: Abort with error message
Rate Limiting
If Outline API returns 429 errors:
- Automatic retry with exponential backoff
- Up to 3 retry attempts per request
- Configurable delay between retries
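A sketch of this retry behavior, assuming a zero-argument callable that performs the request. The names call_with_retry, max_retries, and retry_delay are hypothetical; the scripts' own implementation may differ.

```python
import time


def call_with_retry(send, max_retries=3, retry_delay=1.0, sleep=time.sleep):
    """Call send() and retry on HTTP 429 with exponentially growing delays.

    send is a hypothetical zero-argument callable returning an object with a
    status_code attribute (for example a requests.Response).
    """
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_retries:
            # Delays grow as retry_delay * 1, 2, 4, ...
            sleep(retry_delay * (2 ** attempt))
    return response  # still rate limited after all retries
```

With requests, a call might look like call_with_retry(lambda: requests.post(url, json=payload, headers=headers, timeout=30)). Passing sleep as a parameter keeps the function testable without real waiting.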
Important Features & Behaviors
Backup System:
- Each export automatically backs up previous exports to outline_backup_YYYYMMDD_HHMMSS.tar.gz
- Old uncompressed export directory is deleted after backup
- Backups achieve 90%+ compression on markdown content
- Safe to re-run exports - previous data is always preserved
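The backup behavior described above could be sketched with the standard library as follows. The function name backup_previous_export and its parameters are assumptions for illustration; only the archive naming pattern comes from this document.

```python
import shutil
import tarfile
from datetime import datetime
from pathlib import Path


def backup_previous_export(export_dir, backup_root=".", now=None):
    """Archive an existing export directory, then delete the uncompressed copy."""
    export_path = Path(export_dir)
    if not export_path.is_dir():
        return None  # nothing to back up on the first run
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    archive = Path(backup_root) / f"outline_backup_{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(export_path, arcname=export_path.name)
    # Remove the old directory only after the archive was written successfully.
    shutil.rmtree(export_path)
    return archive
```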
Reliability Features:
- Health check: Verifies API connectivity before operations
- Retry logic: Failed API requests retry up to 3 times with exponential backoff
- Caching: Document content cached during single run to reduce API calls
- Logging: Structured logging with configurable verbosity levels (-v, -vv)
Hierarchy Integrity:
- The navigation tree (/api/collections.documents) is the source of truth for document hierarchy
- Import maintains parent-child relationships via parentDocumentId mapping
- Document counting is recursive to include all nested children
- Maximum depth limit (default: 100) prevents infinite recursion
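A sketch of recursive counting with a depth limit, assuming navigation-tree nodes are dicts with an optional "children" list. The function name count_documents is hypothetical, and stopping silently at the limit is one possible policy; the real script may log or raise instead.

```python
def count_documents(nodes, max_depth=100, _depth=0):
    """Count all documents in a navigation tree, including nested children."""
    if _depth >= max_depth:
        return 0  # stop descending once the depth limit is reached
    total = 0
    for node in nodes:
        total += 1 + count_documents(node.get("children", []), max_depth, _depth + 1)
    return total
```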
5. File Structure Knowledge
outline-tools/
├── export_with_trees.sh # Main export entrypoint
#### Dry Run Testing
```bash
# Test export without writing files
./export_with_trees.sh --dry-run
# Test import without creating collections
./import_to_outline.sh --dry-run
```
Verification Checklist
- Health check passes before export/import
- Document count matches (compare tree output)
- Hierarchy preserved (check parent-child relationships)
- Metadata files valid JSON
- No API errors in logs
- Backup created successfully (export only)
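The "Metadata files valid JSON" item can be checked mechanically. The sketch below assumes one _collection_metadata.json directly inside each collection directory; the function name find_invalid_metadata is hypothetical.

```python
import json
from pathlib import Path


def find_invalid_metadata(export_dir):
    """Return (path, error) pairs for metadata files that fail to parse as JSON."""
    invalid = []
    for meta in sorted(Path(export_dir).glob("*/_collection_metadata.json")):
        try:
            json.loads(meta.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            invalid.append((meta.as_posix(), str(exc)))
    return invalid
```

An empty list means every metadata file that was found parsed cleanly; it does not prove that every collection has a metadata file.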
8. Troubleshooting & Debug Mode
Common Issues
"Connection refused" or "Name resolution failed"
- Cause: Not running inside domnet Docker network
- Solution: Always use shell wrappers (export_with_trees.sh, import_to_outline.sh)
"Authorization failed" or 401/403 errors
- Cause: Invalid or expired API token
- Solution: Update token in settings.json
Documents appear at wrong hierarchy level after import
- Cause: Metadata corruption or parentDocumentId mapping issue
- Solution: Re-export, verify _collection_metadata.json integrity, check id_mapping dictionary
Import creates duplicate collections
- Cause: Collection names differ (case, spaces, special chars)
- Solution: Use --force to replace, or manually delete old collections
API returns 429 errors
- Cause: Rate limiting from too many API requests
- Solution: Built-in retry logic handles this - increase RETRY_DELAY if persistent
Debug Mode
Run with -vv for detailed debug output:
./export_with_trees.sh -vv
./import_to_outline.sh -vv
This shows:
- Full API requests and responses
- Document ID mappings
- File operations
- Retry attempts
Quick Diagnostics
# Test API connectivity
curl -X POST -H "Authorization: Bearer $TOKEN" http://outline:3000/api/collections.list
# Check Docker network
docker network inspect domnet
# Run with verbose logging
./export_with_trees.sh -vv
9. Extending the Tool
Adding New CLI Options
Bash wrapper (export_with_trees.sh):
# Add option parsing (fragment of the argument loop)
while [[ $# -gt 0 ]]; do
  case $1 in
    --my-option)
      MY_OPTION="$2"
      shift 2
      ;;
    # ... other options ...
  esac
done
# Pass the arguments through to the script inside Docker
docker_cmd="... python3 outline_export_fixed.py $@"
Python script (outline_export_fixed.py):
# Add argument parser
parser.add_argument('--my-option', help='Description')
Adding New Export Formats
- Create format converter function in outline_export_fixed.py
- Add format option to CLI
- Modify write_document_to_file() to call converter
- Update metadata to track format
Custom Filtering
Add filter configuration to settings.json:
{
  "export": {
    "filters": {
      "exclude_tags": ["draft", "private"],
      "include_collections": ["Public", "Docs"]
    }
  }
}
Then implement in OutlineExporter.should_export_document().
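One way such a filter might look, written as a standalone function for clarity. The semantics are assumptions: that a document dict carries a "tags" list, that an empty or missing include_collections means "include everything", and that any excluded tag rejects the document.

```python
def should_export_document(document, collection_name, filters):
    """Decide whether a document passes the configured export filters.

    Sketch only: the "tags" field and the filter semantics are assumed.
    """
    include = filters.get("include_collections")
    if include and collection_name not in include:
        return False
    excluded = set(filters.get("exclude_tags", []))
    # Reject the document if it carries any excluded tag.
    return not excluded.intersection(document.get("tags", []))
```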
10. Error Recovery
Partial Export Recovery
If export crashes mid-run:
- Previous export is already backed up (if existed)
- Partial export in outline_export/ may be incomplete
- Safe to re-run - will overwrite partial data
- Check manifest.json to see what completed
Failed Import Recovery
If import fails partway:
- Successfully created collections remain in Outline
- Use --force to delete and retry, OR
- Manually delete collections from Outline UI
- Check logs for document ID where failure occurred
11. Performance Optimization
Reducing API Calls
- Caching: Document content cached during single run
- Batching: Not currently implemented (future enhancement)
- Parallelization: Not safe due to Outline API rate limits
Faster Exports
- Skip verification: --skip-verify
- Skip health check: --skip-health-check (risky)
- Reduce hierarchy depth: Adjust max_hierarchy_depth in settings
Faster Imports
- Single collection mode: --single (fewer collection creates)
- Disable verbose logging (default)
12. Security Considerations
Secrets Management
- settings.json contains API token
- Never log the token value
- Never commit settings.json to git
- Backups may contain sensitive content
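To support the "never log the token" rule, a logging filter can redact bearer tokens before records reach any handler. This is a sketch, not part of the existing scripts; the class name RedactTokenFilter and the token character set in the pattern are assumptions.

```python
import logging
import re

# Assumed token alphabet: letters, digits, dot, underscore, hyphen.
_BEARER = re.compile(r"Bearer\s+[A-Za-z0-9._-]+")


class RedactTokenFilter(logging.Filter):
    """Replace bearer tokens in log messages with a placeholder."""

    def filter(self, record):
        # Format the message first so tokens passed as arguments are caught too.
        record.msg = _BEARER.sub("Bearer [REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted
```

It can be attached with logging.getLogger().addFilter(RedactTokenFilter()), or added to individual handlers if records from child loggers should be covered as well.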
Safe Practices
# Check git status before committing
git status
# Verify settings.json is ignored
grep settings.json .gitignore
# Sanitize logs before sharing
sed 's/Bearer [A-Za-z0-9_-]*/Bearer [REDACTED]/g' logs.txt
13. Common Agent Mistakes to Avoid
- Don't suggest running Python directly - Always use Docker wrappers
- Don't hardcode the API URL - It's environment-specific (use settings.json)
- Don't assume external API access - Only works inside domnet
- Don't ignore dry-run mode - Always test changes with --dry-run first
- Don't modify hierarchy logic lightly - Parent-child relationships are fragile
- Don't skip error handling - API can fail intermittently
- Don't forget to update both export and import - Changes often affect both sides
14. Useful Code Patterns
Making Authenticated API Calls
headers = {
    "Authorization": f"Bearer {self.api_token}",
    "Content-Type": "application/json"
}
response = requests.post(
    f"{self.api_url}/api/endpoint",
    json=payload,
    headers=headers,
    timeout=30
)
response.raise_for_status()
data = response.json()
Recursive Tree Traversal
def process_tree(node, parent_id=None):
    doc_id = node["id"]
    process_document(doc_id, parent_id)
    for child in node.get("children", []):
        process_tree(child, doc_id)
Progress Display with tqdm
from tqdm import tqdm
with tqdm(total=total_docs, desc="Exporting") as pbar:
    for doc in documents:
        process(doc)
        pbar.update(1)
15. When to Ask for Clarification
Ask the user if:
- They want to modify API authentication method
- They need to export to a different Outline instance
- They want to filter by specific criteria not in settings
- They experience persistent API errors (might be Outline-specific issue)
- They need to handle very large wikis (>10,000 documents)
- They want to schedule automated backups (needs cron/systemd setup)
16. Recommended Improvements (Future)
Ideas for enhancing the tool:
- Incremental exports: Only export changed documents
- Parallel imports: Speed up large imports (carefully!)
- Format converters: Export to Notion, Confluence, etc.
- Diff tool: Compare exported versions
- Search index: Build searchable archive
- Version history: Track document changes over time
Quick Decision Tree
User wants to modify the tool:
├─ Change export filtering? → Edit outline_export_fixed.py
├─ Change import behavior? → Edit outline_import.py
├─ Add CLI option? → Edit .sh wrapper + .py script
├─ Change output format? → Edit write_document_to_file()
├─ Fix API error? → Check retry logic and error handling
└─ Add new feature? → Review both export and import sides
User reports an error:
├─ Connection refused? → Check Docker network
├─ Auth error? → Verify API token in settings.json
├─ Hierarchy wrong? → Check id_mapping in import
├─ Missing documents? → Compare counts, check filters
└─ JSON error? → Validate metadata files
User wants to understand:
├─ How it works? → Refer to CLAUDE.md
├─ How to use? → Show CLI examples
├─ How to extend? → Point to sections 9-10 above
└─ How to troubleshoot? → Use section 8 checklist
Additional Resources
- Outline API Docs: https://www.getoutline.com/developers
- Python requests: https://requests.readthedocs.io/
- Docker networks: https://docs.docker.com/network/
- tqdm progress bars: https://tqdm.github.io/
Agent Self-Check
Before suggesting changes:
- Have I read the architecture section?
- Do I understand the Docker network requirement?
- Have I considered both export and import sides?
- Will my change maintain hierarchy integrity?
- Have I suggested testing with --dry-run?
- Have I checked for security implications?
- Is my suggestion compatible with Docker execution?