ciaovolo/flight-comparator/CLAUDE.md
2026-02-26 17:16:28 +01:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Workflow Rules
- **After every successful test run, commit and push.** Stage only the files changed for that task, write a concise commit message describing what was fixed or added, then `git push`. This keeps the remote in sync with every verified milestone.
## Project Overview
This repository contains **two applications**:
1. **Flight Airport Comparator CLI** - Python CLI tool for flight comparisons
2. **Flight Radar Web App** - Full-stack web application with REST API, React frontend, and Docker deployment
### CLI Tool
A Python CLI tool that compares direct flights from multiple airports in a country to a single destination, using Google Flights data via the fast-flights library.
**Core question it answers:** "I want to fly to [DESTINATION]. Which airport in [COUNTRY] should I depart from — and when in the next 6 months does the best route open up?"
### Web Application
A production-ready web application providing:
- REST API (FastAPI) with rate limiting, validation, and error handling
- React + TypeScript frontend with real-time updates
- SQLite database with automatic schema migrations
- Docker deployment with health checks
- 43 passing tests with 75% code coverage
## Web Application Architecture
### Tech Stack
**Backend:**
- FastAPI 0.104+ with Pydantic v2 for validation
- SQLite database with foreign keys enabled
- Uvicorn ASGI server
- Python 3.11+
**Frontend:**
- React 19 with TypeScript (strict mode)
- Vite 7 for build tooling
- Tailwind CSS v4 with @tailwindcss/postcss
- React Router v7 for client-side routing
- Axios for API requests
**Infrastructure:**
- Docker multi-stage builds
- Docker Compose orchestration
- Nginx reverse proxy for production
- Volume persistence for database
### Web App File Structure
```
flight-comparator/
├── api_server.py              # FastAPI app (1,300+ lines)
├── database/
│   ├── __init__.py            # Connection utilities
│   ├── init_db.py             # Schema initialization
│   └── schema.sql             # Database schema (scans, routes tables)
├── frontend/
│   ├── src/
│   │   ├── api.ts             # Type-safe API client (308 lines)
│   │   ├── components/        # React components
│   │   │   ├── Layout.tsx
│   │   │   ├── AirportSearch.tsx
│   │   │   ├── ErrorBoundary.tsx
│   │   │   ├── Toast.tsx
│   │   │   └── LoadingSpinner.tsx
│   │   └── pages/             # Page components
│   │       ├── Dashboard.tsx
│   │       ├── Scans.tsx
│   │       ├── ScanDetails.tsx
│   │       ├── Airports.tsx
│   │       └── Logs.tsx
│   ├── package.json
│   └── vite.config.ts         # Vite config with API proxy
├── tests/
│   ├── conftest.py            # Pytest fixtures
│   ├── test_api_endpoints.py  # 26 unit tests
│   └── test_integration.py    # 15 integration tests
├── Dockerfile.backend         # Python backend container
├── Dockerfile.frontend        # Node + Nginx container
├── docker-compose.yml         # Service orchestration
└── nginx.conf                 # Nginx configuration

Total: ~3,300 lines of production code
```
### Database Schema
**Table: scans**
- Tracks scan requests with status (pending → running → completed/failed)
- Foreign keys enabled with CASCADE deletes
- CHECK constraints for IATA codes (3 chars) and ISO country codes (2 chars)
- Auto-updated timestamps via triggers
- Indexes on `(origin, country)`, `status`, and `created_at`
**Table: routes**
- Stores discovered routes per scan (foreign key to scans.id)
- Flight statistics: min/max/avg price, flight count, airlines array (JSON)
- Composite index on `(scan_id, min_price)` for sorted queries
**Views:**
- `scan_statistics` - Aggregated stats per scan
- `recent_scans` - Last 100 scans with route counts
### API Architecture (api_server.py)
**Key Classes:**
1. **LogBuffer + BufferedLogHandler** (lines 48-100)
   - Thread-safe circular buffer for application logs
   - Custom logging handler that stores logs in memory
   - Supports filtering by level and search
2. **RateLimiter** (lines 102-150)
   - Sliding window rate limiting per endpoint per IP
   - Independent tracking for each endpoint
   - X-Forwarded-For support for proxy setups
   - Rate limit headers on all responses
3. **Pydantic Models** (lines 152-300)
   - Input validation with auto-normalization (lowercase → uppercase)
   - Custom validators for IATA codes (3 chars), ISO codes (2 chars), dates
   - Generic PaginatedResponse[T] model for consistent pagination
   - Detailed validation error messages
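The log buffer idea from item 1 can be sketched as follows. This is a minimal illustration, not the code in `api_server.py`: the method names `add` and `query`, the entry shape, and the default capacity of 1000 are all assumptions.

```python
import logging
import threading
from collections import deque
from typing import Optional


class LogBuffer:
    """Thread-safe circular buffer; the oldest entries drop off when full."""

    def __init__(self, capacity: int = 1000):  # capacity is an assumed default
        self._entries = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, level: str, message: str) -> None:
        with self._lock:
            self._entries.append({"level": level, "message": message})

    def query(self, level: Optional[str] = None, search: Optional[str] = None) -> list:
        """Return entries, optionally filtered by level and by substring."""
        with self._lock:
            entries = list(self._entries)  # copy so filtering runs outside the lock
        if level:
            entries = [e for e in entries if e["level"] == level.upper()]
        if search:
            entries = [e for e in entries if search.lower() in e["message"].lower()]
        return entries


class BufferedLogHandler(logging.Handler):
    """Logging handler that stores formatted records in a LogBuffer."""

    def __init__(self, buffer: LogBuffer):
        super().__init__()
        self.buffer = buffer

    def emit(self, record: logging.LogRecord) -> None:
        self.buffer.add(record.levelname, self.format(record))
```

A `deque` with `maxlen` gives the circular behaviour for free, so the only explicit synchronisation needed is the lock around reads and writes.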
**API Endpoints:**
| Method | Path | Purpose | Rate Limit |
|--------|------|---------|------------|
| GET | `/health` | Health check | No limit |
| GET | `/api/v1/airports` | Search airports | 100/min |
| POST | `/api/v1/scans` | Create scan | 10/min |
| GET | `/api/v1/scans` | List scans | 30/min |
| GET | `/api/v1/scans/{id}` | Get scan details | 30/min |
| GET | `/api/v1/scans/{id}/routes` | Get routes | 30/min |
| GET | `/api/v1/logs` | View logs | 30/min |
**Middleware Stack:**
1. Request ID middleware (UUID per request)
2. CORS middleware (configurable origins via `ALLOWED_ORIGINS` env var)
3. Rate limiting middleware (per-endpoint per-IP)
4. Custom exception handlers (validation, HTTP, general)
**Startup Logic:**
- Downloads airport data from OpenFlights
- Initializes database schema
- Detects and fixes stuck scans (status=running with no update > 1 hour)
- Enables SQLite foreign keys globally
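The stuck-scan cleanup could look roughly like the sketch below. It assumes the `scans` table has a `status` column and an `updated_at` column holding ISO-8601 strings, and that "fixing" a stuck scan means marking it `failed`; the function name `fail_stuck_scans` is hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta

STUCK_AFTER = timedelta(hours=1)  # threshold taken from the startup rule above


def fail_stuck_scans(conn: sqlite3.Connection, now: datetime) -> int:
    """Mark scans left in 'running' for over an hour as 'failed'.

    Assumes `updated_at` stores ISO-8601 text in one consistent format,
    so plain string comparison orders timestamps correctly.
    Returns the number of scans that were updated.
    """
    cutoff = (now - STUCK_AFTER).isoformat()
    cur = conn.execute(
        "UPDATE scans SET status = 'failed' "
        "WHERE status = 'running' AND updated_at < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount
```

Passing `now` in explicitly keeps the function easy to test without patching the clock.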
### Frontend Architecture
**Routing:**
- `/` - Dashboard with stats cards and recent scans
- `/scans` - Create new scan form
- `/scans/:id` - View scan details and routes table
- `/airports` - Search airport database
- `/logs` - Application log viewer
**State Management:**
- Local component state with React hooks (useState, useEffect)
- No global state layer (no Redux or React Context); the API is the source of truth
- Optimistic UI updates with error rollback
**API Client Pattern (src/api.ts):**
```typescript
// Type-safe interfaces for all API responses
export interface Scan { id: number; origin: string; ... }
export interface Route { id: number; destination: string; ... }
// Organized by resource
export const scanApi = {
  list: (page, limit, status?) => api.get<PaginatedResponse<Scan>>(...),
  create: (data) => api.post<CreateScanResponse>(...),
  get: (id) => api.get<Scan>(...),
  routes: (id, page, limit) => api.get<PaginatedResponse<Route>>(...)
};
```
**Error Handling:**
- ErrorBoundary component catches React errors
- Toast notifications for user feedback (4 types: success, error, info, warning)
- LoadingSpinner for async operations
- Graceful fallbacks for missing data
**TypeScript Strict Mode:**
- `verbatimModuleSyntax` enabled
- Type-only imports required: `import type { Scan } from '../api'`
- Explicit `ReturnType<typeof setTimeout>` for timer refs
- No implicit any
## CLI Tool Architecture
### Key Technical Components
1. **Google Flights Scraping with SOCS Cookie Bypass**
   - Uses `fast-flights v3.0rc1` (must install from GitHub, not PyPI)
   - Custom `SOCSCookieIntegration` class in `searcher_v3.py` (lines 32-79) bypasses Google's EU consent page
   - SOCS cookie value from: https://github.com/AWeirdDev/flights/issues/46
   - Uses `primp` library for browser impersonation (Chrome 145, macOS)
2. **Async + Threading Hybrid Pattern**
   - Main async layer: `search_multiple_routes()` uses asyncio with semaphore for concurrency
   - Sync bridge: `asyncio.to_thread()` wraps the synchronous `get_flights()` calls
   - Random delays (0.5-1.5s) between requests to avoid rate limiting
   - Default concurrency: 5 workers (configurable with `--workers`)
3. **SQLite Caching System** (`cache.py`)
   - Two-table schema: `flight_searches` (queries) + `flight_results` (flight data)
   - Cache key: SHA256 hash of `origin|destination|date|seat_class|adults`
   - Default threshold: 24 hours (configurable with `--cache-threshold`)
   - Automatic cache hit detection with progress indicator
   - Admin tool: `cache_admin.py` for stats/cleanup
4. **Seasonal Scanning & New Connection Detection**
   - `resolve_dates()`: Generates one date per month (default: 15th) across window
   - `detect_new_connections()`: Compares route sets month-over-month
   - Tags routes as ✨ NEW the first month they appear after being absent
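The seasonal logic in item 4 can be illustrated with a small sketch. The function names `monthly_sample_dates` and `tag_new_connections` are hypothetical stand-ins, not the signatures of `resolve_dates()` or `detect_new_connections()`, and the rule for skipping a month whose sample day has already passed is an assumption.

```python
from datetime import date


def monthly_sample_dates(start: date, months: int = 6, day: int = 15) -> list:
    """Return one sample date per month for `months` consecutive months."""
    year, month = start.year, start.month
    if start.day > day:  # this month's sample day has already passed
        month += 1
    dates = []
    for _ in range(months):
        if month > 12:  # roll over into the next year
            year, month = year + 1, month - 12
        dates.append(date(year, month, day))
        month += 1
    return dates


def tag_new_connections(routes_by_month: dict) -> dict:
    """Map each month to the routes present then but absent the month before.

    The first month is treated as the baseline, so nothing in it is new.
    """
    new_by_month = {}
    previous = None
    for month in sorted(routes_by_month):
        current = set(routes_by_month[month])
        new_by_month[month] = set() if previous is None else current - previous
        previous = current
    return new_by_month
```

For example, if `MUC` is missing in March and appears in April, only April carries the new-route tag for `MUC`.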
### Critical Error Handling Pattern
**IMPORTANT:** The parser in `searcher_v3.py` (lines 218-302) uses defensive None-checking throughout:
```python
# Always validate before accessing list elements
if not isinstance(flight_segments, list):
    continue
if len(flight_segments) == 0:
    continue
segment = flight_segments[0]
# Validate segment is not None
if segment is None:
    continue
```
**Why:** Google Flights returns different JSON structures depending on availability. Some "no results" responses contain `None` elements or unexpected structures. See `DEBUG_SESSION_2026-02-22_RESOLVED.md` for full analysis.
**Known Issue:** The fast-flights library itself has a bug at `parser.py:55`: it accesses `payload[3][0]` when `payload[3]` is `None`. This hits roughly 11% of queries, typically edge cases such as routes with no flights on the requested date. Our error handling catches the failure and returns empty results instead of crashing, which leaves an 89% success rate.
### Module Responsibilities
- **`main.py`**: CLI entrypoint (Click), argument parsing, orchestration
- **`searcher_v3.py`**: Flight queries with SOCS cookie integration, caching, concurrency
- **`date_resolver.py`**: Date logic, seasonal window generation, new connection detection
- **`airports.py`**: Airport data management (OpenFlights dataset), country resolution
- **`formatter.py`**: Output formatting (Rich tables, JSON, CSV)
- **`cache.py`**: SQLite caching layer with timestamp-based invalidation
- **`progress.py`**: Real-time progress display using Rich Live tables
## Common Development Commands
### Web Application
**Backend Development:**
```bash
# Start API server (development mode with auto-reload)
python api_server.py
# Access: http://localhost:8000
# API docs: http://localhost:8000/docs
# Initialize/reset database
python database/init_db.py
# Run backend tests only
pytest tests/test_api_endpoints.py -v
# Run integration tests
pytest tests/test_integration.py -v
# Run all tests with coverage
pytest tests/ -v --cov=api_server --cov=database --cov-report=html
```
**Frontend Development:**
```bash
cd frontend
# Install dependencies (first time)
npm install
# Start dev server with hot reload
npm run dev
# Access: http://localhost:5173
# Note: Vite proxy forwards /api/* to http://localhost:8000
# Type checking
npm run build # Runs tsc -b first
# Lint
npm run lint
# Production build
npm run build
# Output: frontend/dist/
# Preview production build
npm run preview
```
**Docker Deployment:**
```bash
# Quick start (build + start both services)
docker-compose up -d
# View logs
docker-compose logs -f
# Rebuild after code changes
docker-compose up --build
# Stop services
docker-compose down
# Access application
# Frontend: http://localhost
# Backend API: http://localhost:8000
# Database backup
docker cp flight-radar-backend:/app/cache.db ./backup.db
# Database restore
docker cp ./backup.db flight-radar-backend:/app/cache.db
docker-compose restart backend
```
**Testing Web App:**
```bash
# Run all 43 tests
pytest tests/ -v
# Run a single test from a file
pytest tests/test_api_endpoints.py::test_health_endpoint -v
# Run tests with markers
pytest tests/ -v -m "unit"
pytest tests/ -v -m "integration"
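# Coverage report (--cov selects what to measure; --cov-report alone collects nothing)
pytest tests/ --cov=api_server --cov=database --cov-report=term --cov-report=html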
# Open: htmlcov/index.html
```
### CLI Tool
**Running the Tool**
```bash
# Single date query
python main.py --to BDS --country DE --date 2026-04-15
# Seasonal scan (6 months, queries 15th of each month)
python main.py --to BDS --country DE
# Daily scan (every day for 3 months) - NEW in 2026-02-22
python main.py --from BDS --to DUS --daily-scan --window 3
# Daily scan with custom date range - NEW in 2026-02-22
python main.py --from BDS --to-country DE --daily-scan --start-date 2026-04-01 --end-date 2026-04-30
# Dry run (preview without API calls)
python main.py --to BDS --country DE --dry-run
# With specific airports and custom workers
python main.py --to BDS --from DUS,MUC,FMM --date 2026-04-15 --workers 1
# Force fresh queries (ignore cache)
python main.py --to BDS --country DE --no-cache
```
### Testing
```bash
# Run full test suite
pytest tests/ -v
# Run integration tests (make real API calls — slow)
pytest tests/ -v -m integration
# Module-specific smoke tests
pytest tests/test_date_resolver.py tests/test_airports.py tests/test_searcher.py tests/test_formatter.py -v
```
### Cache Management
```bash
# View cache statistics
python cache_admin.py stats
# Clean old entries (30+ days)
python cache_admin.py clean --days 30
# Clear entire cache
python cache_admin.py clear-all
```
### Installation & Dependencies
```bash
# CRITICAL: Must install fast-flights v3 from GitHub (not PyPI)
pip install --upgrade git+https://github.com/AWeirdDev/flights.git
# Install other dependencies
pip install -r requirements.txt
# Build airport database (runs automatically on first use)
python airports.py
```
## Code Patterns & Conventions
### Web Application Patterns
**CRITICAL: Foreign Keys Must Be Enabled**
SQLite disables foreign keys by default. **Always** execute `PRAGMA foreign_keys = ON` after creating a connection:
```python
# Correct pattern (database/__init__.py)
conn = sqlite3.connect(db_path)
conn.execute("PRAGMA foreign_keys = ON")

# In tests (tests/conftest.py)
@pytest.fixture
def clean_database(test_db_path):
    conn = get_connection()
    conn.execute("PRAGMA foreign_keys = ON")  # ← REQUIRED
    # ... rest of fixture
```
**Why:** Without this, CASCADE deletes don't work, foreign key constraints aren't enforced, and data integrity is compromised.
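The effect is easy to reproduce in isolation. The sketch below uses a stripped-down two-table schema (an assumption for illustration, not the real `schema.sql`) and counts the routes left behind after their parent scan is deleted.

```python
import sqlite3

# Simplified stand-in for the real schema: one scan with one dependent route.
SCHEMA = """
CREATE TABLE scans (id INTEGER PRIMARY KEY);
CREATE TABLE routes (
    id INTEGER PRIMARY KEY,
    scan_id INTEGER NOT NULL REFERENCES scans(id) ON DELETE CASCADE
);
INSERT INTO scans (id) VALUES (1);
INSERT INTO routes (scan_id) VALUES (1);
"""


def routes_left_after_scan_delete(enable_foreign_keys: bool) -> int:
    """Delete the only scan and report how many routes survive."""
    conn = sqlite3.connect(":memory:")
    # Set the pragma explicitly either way so the result does not depend
    # on how the local SQLite library was compiled.
    state = "ON" if enable_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {state}")
    conn.executescript(SCHEMA)
    conn.execute("DELETE FROM scans WHERE id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM routes").fetchone()[0]
    conn.close()
    return remaining
```

With enforcement off, the orphaned route stays in the table; with it on, the cascade removes it. The pragma is per connection, which is why every new connection has to set it.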
**Rate Limiting: Per-Endpoint Per-IP**
The RateLimiter class tracks limits independently for each endpoint:
```python
# api_server.py lines 102-150
class RateLimiter:
    def __init__(self):
        self.requests = defaultdict(lambda: defaultdict(deque))
        # Structure: {endpoint: {ip: deque([timestamps])}}
```
**Why:** Prevents a single IP from exhausting the scan quota (10/min) by making log requests (30/min). Each endpoint has independent limits.
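A sliding-window check over that nested structure might look like the sketch below. It is an illustration of the technique only: the class name `SlidingWindowLimiter`, the `allow` method, and the injectable `now` argument are assumptions and do not describe the real `RateLimiter` interface.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Per-endpoint, per-IP sliding window over request timestamps."""

    def __init__(self, window_seconds: float = 60.0):
        self.window = window_seconds
        # Structure: {endpoint: {ip: deque([timestamps])}}
        self.requests = defaultdict(lambda: defaultdict(deque))

    def allow(self, endpoint: str, ip: str, limit: int,
              now: Optional[float] = None) -> bool:
        """Record the request and return True if it fits within the limit."""
        now = time.monotonic() if now is None else now
        stamps = self.requests[endpoint][ip]
        # Drop timestamps that have slid out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= limit:
            return False  # rejected requests are not recorded
        stamps.append(now)
        return True
```

Because each `(endpoint, ip)` pair has its own deque, exhausting the scan limit leaves the same client's log and airport limits untouched.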
**Validation: Auto-Normalization**
Pydantic validators auto-normalize inputs:
```python
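from pydantic import BaseModel, field_validator

class CreateScanRequest(BaseModel):
    origin: str
    country: str

    # Pydantic v2 style: field_validator(mode='before') replaces v1's validator(pre=True)
    @field_validator('origin', 'country', mode='before')
    @classmethod
    def uppercase_codes(cls, v):
        # Only normalize strings; leave other types for Pydantic to reject
        return v.strip().upper() if isinstance(v, str) else v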
```
**Result:** Frontend can send lowercase codes, backend normalizes them. Consistent database format.
**API Responses: Consistent Format**
All endpoints return:
- Success: `{ data: T, metadata?: {...} }`
- Error: `{ detail: string | object, request_id: string }`
- Paginated: `{ items: T[], pagination: { page, limit, total, pages } }`
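The paginated envelope can be sketched with a small helper. This is an in-memory illustration under assumptions: the function name `paginate` is hypothetical, the real endpoints presumably page in SQL rather than slicing a list, and reporting `pages: 0` for an empty result is a guess.

```python
import math


def paginate(items: list, page: int, limit: int) -> dict:
    """Slice `items` and wrap the slice in the paginated response shape."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    total = len(items)
    start = (page - 1) * limit  # pages are 1-indexed
    return {
        "items": items[start:start + limit],
        "pagination": {
            "page": page,
            "limit": limit,
            "total": total,
            "pages": math.ceil(total / limit),
        },
    }
```

Requesting a page past the end returns an empty `items` list while `total` and `pages` still describe the full result set.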
**Database: JSON Arrays for Airlines**
The `routes.airlines` column stores JSON arrays:
```python
# Saving (api_server.py line ~1311)
json.dumps(route['airlines'])
# Loading (api_server.py line ~1100)
json.loads(row['airlines']) if row['airlines'] else []
```
**Why:** SQLite doesn't have array types. Storing the list as JSON text keeps it intact, so a list of airline names round-trips as a list.
**Frontend: Type-Only Imports**
With `verbatimModuleSyntax` enabled:
```typescript
// ❌ Wrong - runtime import of type
import { Scan } from '../api'
// ✅ Correct - type-only import
import type { Scan } from '../api'
```
**Error if wrong:** `'Scan' is a type and must be imported using a type-only import`
**Frontend: Timer Refs**
```typescript
// ❌ Wrong - no NodeJS in browser
const timer = useRef<NodeJS.Timeout>()
// ✅ Correct - ReturnType utility
const timer = useRef<ReturnType<typeof setTimeout> | undefined>(undefined)
```
**Frontend: Debounced Search**
Pattern used in AirportSearch.tsx:
```typescript
const debounceTimer = useRef<ReturnType<typeof setTimeout> | undefined>(undefined);
const handleInputChange = (e) => {
  if (debounceTimer.current) {
    clearTimeout(debounceTimer.current);
  }
  debounceTimer.current = setTimeout(() => {
    // API call here
  }, 300);
};
// Cleanup on unmount
useEffect(() => {
  return () => {
    if (debounceTimer.current) {
      clearTimeout(debounceTimer.current);
    }
  };
}, []);
```
### CLI Tool Error Handling Philosophy
**Graceful degradation over crashes:**
- Always wrap parsing in try/except with detailed logging
- Return empty lists `[]` instead of raising exceptions
- Log errors with full traceback but continue processing other routes
- Progress callback reports errors but search continues
Example from `searcher_v3.py`:
```python
except Exception as parse_error:
    import traceback
    print(f"\n=== PARSING ERROR ===")
    print(f"Query: {origin}{destination} on {date}")
    traceback.print_exc()
    # Return empty list instead of crashing
    return []
```
### Defensive Programming for API Responses
When working with flight data from fast-flights:
1. **Always** check `isinstance()` before assuming type
2. **Always** validate list is not empty before accessing `[0]`
3. **Always** check element is not `None` after accessing
4. **Always** use `getattr(obj, 'attr', default)` for optional fields
5. **Always** handle both `[H]` and `[H, M]` time formats
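The five rules above can be condensed into a few small helpers. These are illustrative only: the function names are hypothetical, and the exact shapes that fast-flights returns are not reproduced here.

```python
from typing import Optional


def first_segment(flight_segments):
    """Rules 1-2: return element [0] only if the input is a non-empty list."""
    if not isinstance(flight_segments, list) or not flight_segments:
        return None
    return flight_segments[0]  # rule 3: this may still be None, so callers check


def optional_field(obj, name: str, default=None):
    """Rule 4: read an attribute without raising if obj is None or lacks it."""
    return default if obj is None else getattr(obj, name, default)


def format_time(parts) -> Optional[str]:
    """Rule 5: format [H] or [H, M] as 'HH:MM'; anything else gives None."""
    if not isinstance(parts, (list, tuple)) or len(parts) not in (1, 2):
        return None
    hour = parts[0]
    minute = parts[1] if len(parts) == 2 else 0  # [H] means on the hour
    if not isinstance(hour, int) or not isinstance(minute, int):
        return None
    return f"{hour:02d}:{minute:02d}"
```

Returning `None` rather than raising lets the caller skip one malformed flight and keep processing the rest of the response.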
### Async/Await Patterns
- Use `asyncio.to_thread()` to bridge sync libraries (fast-flights) with async code
- Use `asyncio.Semaphore()` to limit concurrent requests
- Use `asyncio.gather()` to run all tasks concurrently and collect results in input order
- Add random delays (`asyncio.sleep(random.uniform(0.5, 1.5))`) to avoid rate limiting
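Put together, the four points above might combine as in this sketch. It is not the body of `search_multiple_routes()`: the name `search_all`, the `fetch` callable standing in for the synchronous fast-flights call, and the `delay_range` parameter are assumptions.

```python
import asyncio
import random


async def search_all(routes, fetch, workers: int = 5, delay_range=(0.5, 1.5)):
    """Run the blocking `fetch(route)` for every route, `workers` at a time."""
    semaphore = asyncio.Semaphore(workers)

    async def search_one(route):
        async with semaphore:  # at most `workers` requests in flight
            # Random pause before each request to avoid rate limiting.
            await asyncio.sleep(random.uniform(*delay_range))
            try:
                # Run the synchronous call in a worker thread.
                return await asyncio.to_thread(fetch, route)
            except Exception:
                return []  # one failed route must not stop the scan

    # gather() returns results in the same order as `routes`.
    return await asyncio.gather(*(search_one(r) for r in routes))
```

A caller would drive it with `asyncio.run(search_all(routes, fetch))`. Setting `workers=1` makes the requests strictly sequential, which matches the debugging advice elsewhere in this file.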
### Cache-First Strategy
1. Check cache first with `get_cached_results()`
2. On cache miss, query API and save with `save_results()`
3. Report cache hits via progress callback
4. Respect `use_cache` flag and `cache_threshold_hours` parameter
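The strategy can be sketched with a plain dictionary standing in for the SQLite tables. The names `cache_key` and `cached_search` are hypothetical (the real functions are `get_cached_results()` and `save_results()`), and saving fresh results even when `use_cache` is off is an assumption.

```python
import hashlib
import time


def cache_key(origin, destination, date, seat_class, adults) -> str:
    """SHA256 over the pipe-joined query parameters."""
    raw = f"{origin}|{destination}|{date}|{seat_class}|{adults}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def cached_search(query: dict, store: dict, fetch, use_cache: bool = True,
                  cache_threshold_hours: float = 24, now=None):
    """Return (results, cache_hit). `store` maps key -> (saved_at, results)."""
    now = time.time() if now is None else now
    key = cache_key(**query)
    entry = store.get(key)
    if use_cache and entry is not None:
        saved_at, results = entry
        if now - saved_at < cache_threshold_hours * 3600:
            return results, True  # fresh enough: cache hit
    results = fetch(**query)  # miss, stale entry, or cache disabled
    store[key] = (now, results)
    return results, False
```

Every parameter is part of the key, so changing only `adults` from 1 to 2 yields a different hash and therefore a separate cache entry.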
## Important Constants
From `date_resolver.py`:
```python
SEARCH_WINDOW_MONTHS = 6 # Default seasonal scan window
SAMPLE_DAY_OF_MONTH = 15 # Which day to query each month (seasonal mode only)
```
From `cache.py`:
```python
DEFAULT_CACHE_THRESHOLD_HOURS = 24
```
## Debugging Tips
### When Flight Searches Fail
1. Look for patterns in error logs:
   - `'NoneType' object is not subscriptable` → Missing None validation in `searcher_v3.py`
   - `fast-flights/parser.py line 55` → Library bug, can't fix without patching (~11% of edge cases)
2. Verify SOCS cookie is still valid (see `docs/MIGRATION_V3.md` for refresh instructions)
3. Run with `--workers 1` to rule out concurrency as the cause
### Performance Issues
- Reduce `--window` for faster seasonal scans
- Increase `--workers` (but watch rate limiting)
- Use `--from` with specific airports instead of `--country`
- Check cache hit rate with `cache_admin.py stats`
### Concurrency Issues
- Start with `--workers 1` to isolate non-concurrency bugs
- Gradually increase workers while monitoring error rates
- Note: Error rates can differ between sequential and concurrent execution, suggesting rate limiting or response variation
## Testing Philosophy
- **Smoke tests** in `tests/` verify each module works independently
- **Integration tests** (`-m integration`) make real API calls — use confirmed routes from `tests/confirmed_flights.json`
- Always test with `--workers 1` first when debugging to isolate concurrency issues
## Known Limitations
1. **fast-flights library dependency:** Subject to Google's anti-bot measures and API changes
2. **Rate limiting:** Large scans (100+ airports) may hit rate limits despite delays
3. **EU consent flow:** Relies on SOCS cookie workaround which may break if Google changes their system
4. **Parser bug in fast-flights:** ~11% failure rate on edge cases (gracefully handled — returns empty result)
5. **Prices are snapshots:** Not final booking prices, subject to availability changes
## Documentation
- **`README.md`**: Main entry point and usage guide
- **`docs/DEPLOYMENT.md`**: Comprehensive deployment guide (Docker + manual)
- **`docs/DOCKER_README.md`**: Docker quick-start guide
- **`docs/DECISIONS.md`**: Architecture and design decisions
- **`docs/MIGRATION_V3.md`**: fast-flights v2→v3 migration and SOCS cookie refresh
- **`docs/CACHING.md`**: SQLite caching layer reference
- **`database/schema.sql`**: Database schema with full comments
- **`tests/confirmed_flights.json`**: Ground-truth flight data for integration tests
## Environment Variables
### Web Application
**Backend (api_server.py):**
```bash
# Server configuration
PORT=8000 # API server port
HOST=0.0.0.0 # Bind address (0.0.0.0 for Docker)
# Database
DATABASE_PATH=/app/data/cache.db # SQLite database path
# CORS
ALLOWED_ORIGINS=http://localhost,http://localhost:80 # Comma-separated
# Logging
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
# Rate limiting (requests per minute per IP)
RATE_LIMIT_SCANS=10
RATE_LIMIT_LOGS=30
RATE_LIMIT_AIRPORTS=100
```
**Frontend (vite.config.ts):**
```bash
# Build-time only
VITE_API_BASE_URL=/api/v1 # API base URL (usually use proxy instead)
```
**Docker (.env file):**
```bash
# Service ports
BACKEND_PORT=8000
FRONTEND_PORT=80
# All backend variables above also apply
```
**Configuration Files:**
- `.env.example` - Template with all variables documented (72 lines)
- `frontend/vite.config.ts` - API proxy for development
- `nginx.conf` - API proxy for production
### CLI Tool Environment
No environment variables required for CLI tool. All configuration via command-line flags.
## When Making Changes
### Web Application Changes
**Before Modifying API Endpoints (api_server.py):**
1. **Always read existing code first** to understand request/response patterns
2. **Update Pydantic models** if adding new fields
3. **Add validation** with descriptive error messages
4. **Update frontend API client** (frontend/src/api.ts) with new types
5. **Add tests** in tests/test_api_endpoints.py
6. **Update rate limiting** if adding new endpoint
7. **Document in API docs** (FastAPI auto-generates from docstrings)
**Before Modifying Database Schema (database/schema.sql):**
1. **CRITICAL:** Test migration path from existing data
2. **Add migration logic** to database/init_db.py
3. **Update CHECK constraints** if changing validation rules
4. **Add/update indexes** for new query patterns
5. **Test foreign key cascades** work correctly
6. **Update tests** in tests/test_integration.py
7. **Backup production data** before applying
Example migration pattern:
```python
# database/init_db.py
def migrate_to_v2(conn):
    """Add new column with default value."""
    try:
        conn.execute("ALTER TABLE scans ADD COLUMN new_field TEXT DEFAULT 'default'")
        conn.commit()
    except sqlite3.OperationalError:
        # Column already exists, skip
        pass
```
**Before Modifying Frontend Components:**
1. **Check TypeScript strict mode** requirements (type-only imports)
2. **Update API client types** (src/api.ts) if API changed
3. **Test responsive design** on mobile/tablet/desktop
4. **Verify error handling** with network failures
5. **Check accessibility** (keyboard navigation, screen readers)
6. **Update tests** if adding testable logic
7. **Verify production build** with `npm run build`
**Before Modifying Docker Configuration:**
1. **Validate docker-compose.yml** with `docker-compose config`
2. **Test build** with `docker-compose build --no-cache`
3. **Verify health checks** work correctly
4. **Test volume persistence** (database survives restarts)
5. **Check environment variables** are properly passed
6. **Update documentation** (`docs/DEPLOYMENT.md`, `docs/DOCKER_README.md`)
7. **Test full deployment** from scratch
**Rate Limiting Changes:**
When modifying rate limits:
1. Update constants in api_server.py
2. Update .env.example with new defaults
3. Consider impact on user experience (too strict = frustrated users)
4. Test with concurrent requests
5. Document in API response headers
**Common Pitfalls:**
1. **Forgetting foreign keys:** Add `PRAGMA foreign_keys = ON` to every connection
2. **Type-only imports:** Use `import type` for interfaces in TypeScript
3. **JSON arrays:** Remember to `json.loads()` when reading airlines from database
4. **Rate limiting:** New endpoints need rate limit decorator
5. **CORS:** Add new origins to ALLOWED_ORIGINS env var
6. **Cache invalidation:** Frontend may cache old data, handle with ETags or timestamps
### CLI Tool Changes
### Before Modifying Parser (`searcher_v3.py`)
1. Maintain the layered validation pattern: type check → empty check → None check (see lines 218-302)
2. Run `pytest tests/test_scan_pipeline.py -m integration` to verify known routes still return flights
3. Add comprehensive error logging with tracebacks for debugging
### Before Modifying Caching (`cache.py`)
1. Understand the two-table schema: searches + results
2. Remember that cache keys include ALL query parameters (origin, destination, date, seat_class, adults)
3. Test cache invalidation logic with different threshold values
4. Verify foreign key cascade deletes work correctly
### Before Modifying Async Logic (`searcher_v3.py`, `main.py`)
1. Respect the sync/async boundary: fast-flights is synchronous, use `asyncio.to_thread()`
2. Always use semaphores to limit concurrency (prevent rate limiting)
3. Test with different `--workers` values (1, 3, 5, 10) to verify behavior
4. Add random delays between requests to avoid anti-bot detection
### Before Adding New CLI Arguments (`main.py`)
1. Update Click options with proper help text and defaults
2. Update `README.md` usage examples
3. Update `PRD.MD` if changing core functionality
4. Consider cache implications (new parameter = new cache key dimension)
---
## Project Status
### Web Application: ✅ PRODUCTION READY
**Completed:** All 30 steps across 4 phases (100% complete)
**Phase 1: Backend Foundation** - ✅ 10/10 steps
- Database schema with triggers and views
- FastAPI REST API with validation
- Error handling and rate limiting
- Startup cleanup for stuck scans
- Log viewer endpoint
**Phase 2: Testing Infrastructure** - ✅ 5/5 steps
- pytest configuration
- 43 passing tests (26 unit + 15 integration)
- 75% code coverage
- Database isolation in tests
- Test fixtures and factories
**Phase 3: Frontend Development** - ✅ 10/10 steps
- React + TypeScript app with Vite
- Tailwind CSS v4 styling
- 5 pages + 5 components
- Type-safe API client
- Error boundary and toast notifications
- Production build: 293 KB (93 KB gzipped)
**Phase 4: Docker Deployment** - ✅ 5/5 steps
- Multi-stage Docker builds
- Docker Compose orchestration
- Nginx reverse proxy
- Volume persistence
- Health checks and auto-restart
**Quick Start:**
```bash
docker-compose up -d
open http://localhost
```
### CLI Tool: ✅ FUNCTIONAL
- Successfully queries Google Flights via fast-flights v3 with SOCS cookie
- 89% success rate on real flight queries
- Caching system reduces API calls
- Seasonal scanning and new route detection
- Rich terminal output
**Known limitations:** fast-flights library parser bug affects ~11% of edge cases (documented in DEBUG_SESSION_2026-02-22_RESOLVED.md)
---
**Total Project:**
- ~3,300+ lines of production code
- ~2,500+ lines of documentation
- 43/43 tests passing
- Zero TODO/FIXME comments
- Docker validated
- Ready for deployment