# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Workflow Rules

- **After every successful test run, commit and push.** Stage only the files changed for that task, write a concise commit message describing what was fixed or added, then `git push`. This keeps the remote in sync with every verified milestone.

## Project Overview

This repository contains **two applications**:

1. **Flight Airport Comparator CLI** - Python CLI tool for flight comparisons
2. **Flight Radar Web App** - Full-stack web application with REST API, React frontend, and Docker deployment

### CLI Tool

A Python CLI tool that compares direct flights from multiple airports in a country to a single destination, using Google Flights data via the fast-flights library.

**Core question it answers:** "I want to fly to [DESTINATION]. Which airport in [COUNTRY] should I depart from, and when in the next 6 months does the best route open up?"

### Web Application

A production-ready web application providing:
- REST API (FastAPI) with rate limiting, validation, and error handling
- React + TypeScript frontend with real-time updates
- SQLite database with automatic schema migrations
- Docker deployment with health checks
- 43 passing tests with 75% code coverage

## Web Application Architecture

### Tech Stack

**Backend:**
- FastAPI 0.104+ with Pydantic v2 for validation
- SQLite database with foreign keys enabled
- Uvicorn ASGI server
- Python 3.11+

**Frontend:**
- React 19 with TypeScript (strict mode)
- Vite 7 for build tooling
- Tailwind CSS v4 with @tailwindcss/postcss
- React Router v7 for client-side routing
- Axios for API requests

**Infrastructure:**
- Docker multi-stage builds
- Docker Compose orchestration
- Nginx reverse proxy for production
- Volume persistence for database

### Web App File Structure

```
flight-comparator/
├── api_server.py                 # FastAPI app (1,300+ lines)
├── database/
│   ├── __init__.py               # Connection utilities
│   ├── init_db.py                # Schema initialization
│   └── schema.sql                # Database schema (scans, routes tables)
├── frontend/
│   ├── src/
│   │   ├── api.ts                # Type-safe API client (308 lines)
│   │   ├── components/           # React components
│   │   │   ├── Layout.tsx
│   │   │   ├── AirportSearch.tsx
│   │   │   ├── ErrorBoundary.tsx
│   │   │   ├── Toast.tsx
│   │   │   └── LoadingSpinner.tsx
│   │   └── pages/                # Page components
│   │       ├── Dashboard.tsx
│   │       ├── Scans.tsx
│   │       ├── ScanDetails.tsx
│   │       ├── Airports.tsx
│   │       └── Logs.tsx
│   ├── package.json
│   └── vite.config.ts            # Vite config with API proxy
├── tests/
│   ├── conftest.py               # Pytest fixtures
│   ├── test_api_endpoints.py     # 26 unit tests
│   └── test_integration.py       # 15 integration tests
├── Dockerfile.backend            # Python backend container
├── Dockerfile.frontend           # Node + Nginx container
├── docker-compose.yml            # Service orchestration
└── nginx.conf                    # Nginx configuration

Total: ~3,300 lines of production code
```

### Database Schema

**Table: scans**
- Tracks scan requests with status (pending → running → completed/failed)
- Foreign keys enabled with CASCADE deletes
- CHECK constraints for IATA codes (3 chars) and ISO country codes (2 chars)
- Auto-updated timestamps via triggers
- Indexes on `(origin, country)`, `status`, and `created_at`

**Table: routes**
- Stores discovered routes per scan (foreign key to scans.id)
- Flight statistics: min/max/avg price, flight count, airlines array (JSON)
- Composite index on `(scan_id, min_price)` for sorted queries

**Views:**
- `scan_statistics` - Aggregated stats per scan
- `recent_scans` - Last 100 scans with route counts

### API Architecture (api_server.py)

**Key Classes:**

1. **LogBuffer + BufferedLogHandler** (lines 48-100)
   - Thread-safe circular buffer for application logs
   - Custom logging handler that stores logs in memory
   - Supports filtering by level and search

2. **RateLimiter** (lines 102-150)
   - Sliding window rate limiting per endpoint per IP
   - Independent tracking for each endpoint
   - X-Forwarded-For support for proxy setups
   - Rate limit headers on all responses

3. **Pydantic Models** (lines 152-300)
   - Input validation with auto-normalization (lowercase → uppercase)
   - Custom validators for IATA codes (3 chars), ISO codes (2 chars), dates
   - Generic PaginatedResponse[T] model for consistent pagination
   - Detailed validation error messages

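
The `LogBuffer` and `BufferedLogHandler` pair from item 1 can be pictured with the minimal sketch below. It assumes a `collections.deque` with `maxlen` as the circular buffer and a `threading.Lock` for thread safety. The method names `add` and `query`, the default capacity, and the entry format are illustrative assumptions, not the actual code in `api_server.py`.

```python
import logging
import threading
from collections import deque
from typing import Optional


class LogBuffer:
    """Fixed-size, thread-safe buffer; the oldest entries drop off first."""

    def __init__(self, capacity: int = 1000):  # capacity is an assumed default
        self._entries = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, level: str, message: str) -> None:
        with self._lock:
            self._entries.append({"level": level, "message": message})

    def query(self, level: Optional[str] = None, search: Optional[str] = None) -> list:
        """Return buffered entries, optionally filtered by level and substring."""
        with self._lock:
            entries = list(self._entries)
        if level is not None:
            entries = [e for e in entries if e["level"] == level.upper()]
        if search is not None:
            entries = [e for e in entries if search.lower() in e["message"].lower()]
        return entries


class BufferedLogHandler(logging.Handler):
    """Logging handler that copies each record into a LogBuffer."""

    def __init__(self, buffer: LogBuffer):
        super().__init__()
        self.buffer = buffer

    def emit(self, record: logging.LogRecord) -> None:
        self.buffer.add(record.levelname, record.getMessage())
```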
**API Endpoints:**

| Method | Path | Purpose | Rate Limit |
|--------|------|---------|------------|
| GET | `/health` | Health check | No limit |
| GET | `/api/v1/airports` | Search airports | 100/min |
| POST | `/api/v1/scans` | Create scan | 10/min |
| GET | `/api/v1/scans` | List scans | 30/min |
| GET | `/api/v1/scans/{id}` | Get scan details | 30/min |
| GET | `/api/v1/scans/{id}/routes` | Get routes | 30/min |
| GET | `/api/v1/logs` | View logs | 30/min |

**Middleware Stack:**
1. Request ID middleware (UUID per request)
2. CORS middleware (configurable origins via `ALLOWED_ORIGINS` env var)
3. Rate limiting middleware (per-endpoint per-IP)
4. Custom exception handlers (validation, HTTP, general)

**Startup Logic:**
- Downloads airport data from OpenFlights
- Initializes database schema
- Detects and fixes stuck scans (status=running with no update > 1 hour)
- Enables SQLite foreign keys globally

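
The stuck-scan cleanup in the startup logic can be sketched as a single `UPDATE`. This is a minimal sketch under stated assumptions: the function name `fail_stuck_scans`, the `updated_at` column name, and the choice to mark stuck scans as `failed` are all hypothetical. The real logic lives in `api_server.py` and may differ.

```python
import sqlite3


def fail_stuck_scans(conn: sqlite3.Connection, max_age_hours: int = 1) -> int:
    """Mark scans left in 'running' for too long as 'failed'; return the count."""
    cursor = conn.execute(
        """
        UPDATE scans
           SET status = 'failed'
         WHERE status = 'running'
           AND updated_at < datetime('now', ?)
        """,
        (f"-{max_age_hours} hours",),  # SQLite date modifier, e.g. '-1 hours'
    )
    conn.commit()
    return cursor.rowcount
```

Calling it once at startup, before the API accepts requests, would keep scans interrupted by a crash from staying in `running` indefinitely.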
### Frontend Architecture

**Routing:**
- `/` - Dashboard with stats cards and recent scans
- `/scans` - Create new scan form
- `/scans/:id` - View scan details and routes table
- `/airports` - Search airport database
- `/logs` - Application log viewer

**State Management:**
- Local component state with React hooks (useState, useEffect)
- No global state library (Redux, Context) - API is source of truth
- Optimistic UI updates with error rollback

**API Client Pattern (src/api.ts):**
```typescript
// Type-safe interfaces for all API responses
export interface Scan { id: number; origin: string; ... }
export interface Route { id: number; destination: string; ... }

// Organized by resource
export const scanApi = {
  list: (page, limit, status?) => api.get<PaginatedResponse<Scan>>(...),
  create: (data) => api.post<CreateScanResponse>(...),
  get: (id) => api.get<Scan>(...),
  routes: (id, page, limit) => api.get<PaginatedResponse<Route>>(...)
};
```

**Error Handling:**
- ErrorBoundary component catches React errors
- Toast notifications for user feedback (4 types: success, error, info, warning)
- LoadingSpinner for async operations
- Graceful fallbacks for missing data

**TypeScript Strict Mode:**
- `verbatimModuleSyntax` enabled
- Type-only imports required: `import type { Scan } from '../api'`
- Explicit `ReturnType<typeof setTimeout>` for timer refs
- No implicit any

## CLI Tool Architecture

### Key Technical Components

1. **Google Flights Scraping with SOCS Cookie Bypass**
   - Uses `fast-flights v3.0rc1` (must install from GitHub, not PyPI)
   - Custom `SOCSCookieIntegration` class in `searcher_v3.py` (lines 32-79) bypasses Google's EU consent page
   - SOCS cookie value from: https://github.com/AWeirdDev/flights/issues/46
   - Uses `primp` library for browser impersonation (Chrome 145, macOS)

2. **Async + Threading Hybrid Pattern**
   - Main async layer: `search_multiple_routes()` uses asyncio with semaphore for concurrency
   - Sync bridge: `asyncio.to_thread()` wraps the synchronous `get_flights()` calls
   - Random delays (0.5-1.5s) between requests to avoid rate limiting
   - Default concurrency: 5 workers (configurable with `--workers`)

3. **SQLite Caching System** (`cache.py`)
   - Two-table schema: `flight_searches` (queries) + `flight_results` (flight data)
   - Cache key: SHA256 hash of `origin|destination|date|seat_class|adults`
   - Default threshold: 24 hours (configurable with `--cache-threshold`)
   - Automatic cache hit detection with progress indicator
   - Admin tool: `cache_admin.py` for stats/cleanup

4. **Seasonal Scanning & New Connection Detection**
   - `resolve_dates()`: Generates one date per month (default: 15th) across window
   - `detect_new_connections()`: Compares route sets month-over-month
   - Tags routes as ✨ NEW the first month they appear after being absent

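
The seasonal scanning logic in item 4 can be illustrated with a rough sketch. The function names match those mentioned above, but the signatures, the `"YYYY-MM"` month keys, and the rule that nothing is tagged in the first month of the window are assumptions made for illustration. The actual code in `date_resolver.py` may differ.

```python
from datetime import date


def resolve_dates(start: date, window_months: int = 6, day: int = 15) -> list:
    """One sample date per month, beginning with the first sample day on or after `start`."""
    year, month = start.year, start.month
    if start.day > day:
        month += 1  # this month's sample day has already passed
    dates = []
    for _ in range(window_months):
        year += (month - 1) // 12  # roll over into the next year when month > 12
        month = (month - 1) % 12 + 1
        dates.append(date(year, month, day))
        month += 1
    return dates


def detect_new_connections(routes_by_month: dict) -> dict:
    """Map each month to the routes present now but absent the month before."""
    new_by_month = {}
    previous = None
    for month in sorted(routes_by_month):  # assumes sortable keys such as "2026-03"
        current = routes_by_month[month]
        new_by_month[month] = set() if previous is None else current - previous
        previous = current
    return new_by_month
```

With this rule, a route that disappears for a month and then returns is tagged as new again in the month it returns.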
### Critical Error Handling Pattern

**IMPORTANT:** The parser in `searcher_v3.py` (lines 218-302) uses defensive None-checking throughout:

```python
# Always validate before accessing list elements
if not isinstance(flight_segments, list):
    continue

if len(flight_segments) == 0:
    continue

segment = flight_segments[0]

# Validate segment is not None
if segment is None:
    continue
```

**Why:** Google Flights returns different JSON structures depending on availability. Some "no results" responses contain `None` elements or unexpected structures. See `DEBUG_SESSION_2026-02-22_RESOLVED.md` for full analysis.

**Known Issue:** The fast-flights library itself has a bug at `parser.py:55` where it tries to access `payload[3][0]` when `payload[3]` is None. This affects ~11% of edge cases (routes with no flights on specific dates). Our error handling gracefully catches this and returns empty results instead of crashing. Success rate: 89%.

### Module Responsibilities

- **`main.py`**: CLI entrypoint (Click), argument parsing, orchestration
- **`searcher_v3.py`**: Flight queries with SOCS cookie integration, caching, concurrency
- **`date_resolver.py`**: Date logic, seasonal window generation, new connection detection
- **`airports.py`**: Airport data management (OpenFlights dataset), country resolution
- **`formatter.py`**: Output formatting (Rich tables, JSON, CSV)
- **`cache.py`**: SQLite caching layer with timestamp-based invalidation
- **`progress.py`**: Real-time progress display using Rich Live tables

## Common Development Commands

### Web Application

**Backend Development:**
```bash
# Start API server (development mode with auto-reload)
python api_server.py
# Access: http://localhost:8000
# API docs: http://localhost:8000/docs

# Initialize/reset database
python database/init_db.py

# Run backend tests only
pytest tests/test_api_endpoints.py -v

# Run integration tests
pytest tests/test_integration.py -v

# Run all tests with coverage
pytest tests/ -v --cov=api_server --cov=database --cov-report=html
```

**Frontend Development:**
```bash
cd frontend

# Install dependencies (first time)
npm install

# Start dev server with hot reload
npm run dev
# Access: http://localhost:5173
# Note: Vite proxy forwards /api/* to http://localhost:8000

# Type checking
npm run build  # Runs tsc -b first

# Lint
npm run lint

# Production build
npm run build
# Output: frontend/dist/

# Preview production build
npm run preview
```

**Docker Deployment:**
```bash
# Quick start (build + start both services)
docker-compose up -d

# View logs
docker-compose logs -f

# Rebuild after code changes
docker-compose up --build

# Stop services
docker-compose down

# Access application
# Frontend: http://localhost
# Backend API: http://localhost:8000

# Database backup
docker cp flight-radar-backend:/app/cache.db ./backup.db

# Database restore
docker cp ./backup.db flight-radar-backend:/app/cache.db
docker-compose restart backend
```

**Testing Web App:**
```bash
# Run all 43 tests
pytest tests/ -v

# Run a single test from a specific file
pytest tests/test_api_endpoints.py::test_health_endpoint -v

# Run tests with markers
pytest tests/ -v -m "unit"
pytest tests/ -v -m "integration"

# Coverage report (--cov is needed for pytest-cov to collect data)
pytest tests/ --cov=api_server --cov=database --cov-report=term --cov-report=html
# Open: htmlcov/index.html
```

### CLI Tool

**Running the Tool**

```bash
# Single date query
python main.py --to BDS --country DE --date 2026-04-15

# Seasonal scan (6 months, queries 15th of each month)
python main.py --to BDS --country DE

# Daily scan (every day for 3 months) - NEW in 2026-02-22
python main.py --from BDS --to DUS --daily-scan --window 3

# Daily scan with custom date range - NEW in 2026-02-22
python main.py --from BDS --to-country DE --daily-scan --start-date 2026-04-01 --end-date 2026-04-30

# Dry run (preview without API calls)
python main.py --to BDS --country DE --dry-run

# With specific airports and custom workers
python main.py --to BDS --from DUS,MUC,FMM --date 2026-04-15 --workers 1

# Force fresh queries (ignore cache)
python main.py --to BDS --country DE --no-cache
```

### Testing

```bash
# Run full test suite
pytest tests/ -v

# Run integration tests (make real API calls, slow)
pytest tests/ -v -m integration

# Module-specific smoke tests
pytest tests/test_date_resolver.py tests/test_airports.py tests/test_searcher.py tests/test_formatter.py -v
```

### Cache Management

```bash
# View cache statistics
python cache_admin.py stats

# Clean old entries (30+ days)
python cache_admin.py clean --days 30

# Clear entire cache
python cache_admin.py clear-all
```

### Installation & Dependencies

```bash
# CRITICAL: Must install fast-flights v3 from GitHub (not PyPI)
pip install --upgrade git+https://github.com/AWeirdDev/flights.git

# Install other dependencies
pip install -r requirements.txt

# Build airport database (runs automatically on first use)
python airports.py
```

## Code Patterns & Conventions

### Web Application Patterns

**CRITICAL: Foreign Keys Must Be Enabled**

SQLite disables foreign keys by default. **Always** execute `PRAGMA foreign_keys = ON` after creating a connection:

```python
# Correct pattern (database/__init__.py)
conn = sqlite3.connect(db_path)
conn.execute("PRAGMA foreign_keys = ON")

# In tests (tests/conftest.py)
@pytest.fixture
def clean_database(test_db_path):
    conn = get_connection()
    conn.execute("PRAGMA foreign_keys = ON")  # ← REQUIRED
    # ... rest of fixture
```

**Why:** Without this, CASCADE deletes don't work, foreign key constraints aren't enforced, and data integrity is compromised.

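
The effect of the pragma can be checked with a small self-contained experiment. The two-table schema here is a stripped-down stand-in for the real `scans` and `routes` tables, and the helper name `routes_left_after_scan_delete` is hypothetical. With the pragma enabled, deleting a scan removes its routes. Without it, a stock SQLite build typically leaves the orphaned route row in place.

```python
import sqlite3


def routes_left_after_scan_delete(enable_foreign_keys: bool) -> int:
    """Delete a scan and report how many of its routes survive."""
    conn = sqlite3.connect(":memory:")
    if enable_foreign_keys:
        # Must run on every new connection, before any transaction starts
        conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE scans (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE routes ("
        " id INTEGER PRIMARY KEY,"
        " scan_id INTEGER NOT NULL REFERENCES scans(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO scans (id) VALUES (1)")
    conn.execute("INSERT INTO routes (scan_id) VALUES (1)")
    conn.execute("DELETE FROM scans WHERE id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM routes").fetchone()[0]
    conn.close()
    return remaining
```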
**Rate Limiting: Per-Endpoint Per-IP**

The RateLimiter class tracks limits independently for each endpoint:

```python
# api_server.py lines 102-150
from collections import defaultdict, deque

class RateLimiter:
    def __init__(self):
        self.requests = defaultdict(lambda: defaultdict(deque))
        # Structure: {endpoint: {ip: deque([timestamps])}}
```

**Why:** Prevents a single IP from exhausting the scan quota (10/min) by making log requests (30/min). Each endpoint has independent limits.

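
To make the sliding-window idea concrete, here is a minimal sketch built on the same nested `defaultdict`/`deque` structure. The class name `SlidingWindowLimiter`, the `allow` method, and the injectable `now` argument are assumptions for illustration. The real `RateLimiter` also handles `X-Forwarded-For` and response headers, which are omitted here. Each call drops timestamps older than the window, then admits the request only if fewer than `limit` remain.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-endpoint, per-IP sliding window over request timestamps."""

    def __init__(self, window_seconds: float = 60.0):
        self.window_seconds = window_seconds
        # Structure: {endpoint: {ip: deque([timestamps])}}
        self.requests = defaultdict(lambda: defaultdict(deque))

    def allow(self, endpoint: str, ip: str, limit: int, now=None) -> bool:
        """Record the request and return True if it fits within the limit."""
        now = time.monotonic() if now is None else now
        timestamps = self.requests[endpoint][ip]
        # Evict timestamps that have slid out of the window
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= limit:
            return False
        timestamps.append(now)
        return True
```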
**Validation: Auto-Normalization**

Pydantic validators auto-normalize inputs:

```python
from pydantic import BaseModel, field_validator

class CreateScanRequest(BaseModel):
    origin: str
    country: str

    # Pydantic v2 spelling; the v1-style @validator(..., pre=True) still works but is deprecated
    @field_validator('origin', 'country', mode='before')
    @classmethod
    def uppercase_codes(cls, v):
        # Only normalize strings; let Pydantic reject other types
        return v.strip().upper() if isinstance(v, str) else v
```

**Result:** Frontend can send lowercase codes, backend normalizes them. Consistent database format.

**API Responses: Consistent Format**

All endpoints return:
- Success: `{ data: T, metadata?: {...} }`
- Error: `{ detail: string | object, request_id: string }`
- Paginated: `{ items: T[], pagination: { page, limit, total, pages } }`

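
As an illustration of the paginated shape, the sketch below builds that envelope from an in-memory list. The helper name `paginate` is hypothetical, and computing `pages` as the ceiling of `total / limit` (so an empty result has zero pages) is an assumption. The real endpoints presumably page in SQL instead of slicing a list.

```python
import math


def paginate(items: list, page: int, limit: int) -> dict:
    """Wrap one page of `items` in the { items, pagination } envelope."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    total = len(items)
    start = (page - 1) * limit
    return {
        "items": items[start:start + limit],
        "pagination": {
            "page": page,
            "limit": limit,
            "total": total,
            "pages": math.ceil(total / limit),  # assumed definition of `pages`
        },
    }
```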
**Database: JSON Arrays for Airlines**

The `routes.airlines` column stores JSON arrays:

```python
# Saving (api_server.py line ~1311)
json.dumps(route['airlines'])

# Loading (api_server.py line ~1100)
json.loads(row['airlines']) if row['airlines'] else []
```

**Why:** SQLite has no array type, so the list is serialized to a JSON string on write and parsed back into a list on read.

**Frontend: Type-Only Imports**

With `verbatimModuleSyntax` enabled:

```typescript
// ❌ Wrong - runtime import of type
import { Scan } from '../api'

// ✅ Correct - type-only import
import type { Scan } from '../api'
```

**Error if wrong:** `'Scan' is a type and must be imported using a type-only import`

**Frontend: Timer Refs**

```typescript
// ❌ Wrong - no NodeJS in browser
const timer = useRef<NodeJS.Timeout>()

// ✅ Correct - ReturnType utility
const timer = useRef<ReturnType<typeof setTimeout> | undefined>(undefined)
```

**Frontend: Debounced Search**

Pattern used in AirportSearch.tsx:

```typescript
import { useEffect, useRef } from 'react';
import type { ChangeEvent } from 'react';

const debounceTimer = useRef<ReturnType<typeof setTimeout> | undefined>(undefined);

// Explicit event type: strict mode forbids an implicit `any` parameter
const handleInputChange = (e: ChangeEvent<HTMLInputElement>) => {
  const value = e.target.value;

  if (debounceTimer.current) {
    clearTimeout(debounceTimer.current);
  }

  debounceTimer.current = setTimeout(() => {
    // API call with `value` here
  }, 300);
};

// Cleanup on unmount
useEffect(() => {
  return () => {
    if (debounceTimer.current) {
      clearTimeout(debounceTimer.current);
    }
  };
}, []);
```

### CLI Tool Error Handling Philosophy

**Graceful degradation over crashes:**
- Always wrap parsing in try/except with detailed logging
- Return empty lists `[]` instead of raising exceptions
- Log errors with full traceback but continue processing other routes
- Progress callback reports errors but search continues

Example from `searcher_v3.py`:
```python
except Exception as parse_error:
    import traceback
    print(f"\n=== PARSING ERROR ===")
    print(f"Query: {origin}→{destination} on {date}")
    traceback.print_exc()
    # Return empty list instead of crashing
    return []
```

### Defensive Programming for API Responses

When working with flight data from fast-flights:
1. **Always** check `isinstance()` before assuming type
2. **Always** validate list is not empty before accessing `[0]`
3. **Always** check element is not `None` after accessing
4. **Always** use `getattr(obj, 'attr', default)` for optional fields
5. **Always** handle both `[H]` and `[H, M]` time formats

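
The rules above can be sketched as two small helpers. Both names (`first_segment`, `format_time`) are hypothetical, and reading `[H]` as "hour with zero minutes" is an assumption about the payload, not something confirmed by the fast-flights documentation.

```python
from typing import Any, Optional


def first_segment(flight_segments: Any) -> Optional[Any]:
    """Return the first usable segment, or None when the structure is unusable."""
    if not isinstance(flight_segments, list):  # rule 1: type check
        return None
    if len(flight_segments) == 0:  # rule 2: empty check
        return None
    return flight_segments[0]  # rule 3: None here means "no segment"


def format_time(raw: Any) -> Optional[str]:
    """Turn [H] or [H, M] into 'HH:MM'; return None for anything else."""
    if not isinstance(raw, (list, tuple)) or len(raw) == 0:
        return None
    hour = raw[0]
    minute = raw[1] if len(raw) > 1 else 0  # [H] is assumed to mean H:00
    if minute is None:
        minute = 0
    if not isinstance(hour, int) or not isinstance(minute, int):
        return None
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        return None
    return f"{hour:02d}:{minute:02d}"
```

Returning `None` instead of raising keeps a single malformed entry from aborting the whole parse, in line with the graceful-degradation approach used elsewhere in the CLI.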
### Async/Await Patterns

- Use `asyncio.to_thread()` to bridge sync libraries (fast-flights) with async code
- Use `asyncio.Semaphore()` to limit concurrent requests
- Use `asyncio.gather()` to execute all tasks in parallel
- Add random delays (`asyncio.sleep(random.uniform(0.5, 1.5))`) to avoid rate limiting

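
The four points above combine into a pattern like the following sketch. `blocking_search` is a hypothetical stand-in for the synchronous fast-flights call, and `search_one` / `search_many` are illustrative names, not the functions in `searcher_v3.py`. The delay range is a parameter so the example can run quickly.

```python
import asyncio
import random
import time


def blocking_search(origin: str, destination: str) -> list:
    """Hypothetical stand-in for a synchronous library call."""
    time.sleep(0.01)
    return [f"{origin}-{destination}"]


async def search_one(sem, origin, destination, delay_range):
    async with sem:  # at most `workers` searches run at once
        await asyncio.sleep(random.uniform(*delay_range))  # jitter between requests
        try:
            # Run the blocking call in a worker thread so the event loop stays free
            return await asyncio.to_thread(blocking_search, origin, destination)
        except Exception:
            return []  # degrade gracefully: one failed route does not stop the scan


async def search_many(routes, workers=5, delay_range=(0.5, 1.5)):
    sem = asyncio.Semaphore(workers)
    tasks = [search_one(sem, o, d, delay_range) for o, d in routes]
    results = await asyncio.gather(*tasks)  # results come back in input order
    return dict(zip(routes, results))
```

A call such as `asyncio.run(search_many([("DUS", "BDS"), ("MUC", "BDS")], workers=2))` would run both lookups concurrently while respecting the semaphore limit.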
### Cache-First Strategy

1. Check cache first with `get_cached_results()`
2. On cache miss, query API and save with `save_results()`
3. Report cache hits via progress callback
4. Respect `use_cache` flag and `cache_threshold_hours` parameter

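
The four steps can be sketched end to end as follows. An in-memory dict stands in for the SQLite tables, and the signatures of `get_cached_results()` and `save_results()` are assumptions (the real ones in `cache.py` may take different arguments). The wrapper `search_with_cache` and the `on_cache_hit` callback are hypothetical names, as is the choice to still save fresh results when `use_cache` is off.

```python
import time
from typing import Callable, Optional

# In-memory stand-in for the cache tables: key -> (saved_at_epoch_seconds, results)
_CACHE: dict = {}


def get_cached_results(key: str, cache_threshold_hours: float) -> Optional[list]:
    """Return cached results if present and younger than the threshold."""
    entry = _CACHE.get(key)
    if entry is None:
        return None
    saved_at, results = entry
    if time.time() - saved_at > cache_threshold_hours * 3600:
        return None  # entry is stale
    return results


def save_results(key: str, results: list) -> None:
    _CACHE[key] = (time.time(), results)


def search_with_cache(
    key: str,
    query_api: Callable[[], list],
    use_cache: bool = True,
    cache_threshold_hours: float = 24,
    on_cache_hit: Optional[Callable[[str], None]] = None,
) -> list:
    if use_cache:
        cached = get_cached_results(key, cache_threshold_hours)  # step 1
        if cached is not None:
            if on_cache_hit is not None:
                on_cache_hit(key)  # step 3: report the hit
            return cached
    results = query_api()  # step 2: cache miss, query the API
    save_results(key, results)  # assumed: fresh results are saved even with use_cache off
    return results
```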
## Important Constants

From `date_resolver.py`:
```python
SEARCH_WINDOW_MONTHS = 6  # Default seasonal scan window
SAMPLE_DAY_OF_MONTH = 15  # Which day to query each month (seasonal mode only)
```

From `cache.py`:
```python
DEFAULT_CACHE_THRESHOLD_HOURS = 24
```

## Debugging Tips

### When Flight Searches Fail

1. Look for patterns in error logs:
   - `'NoneType' object is not subscriptable` → Missing None validation in `searcher_v3.py`
   - `fast-flights/parser.py line 55` → Library bug, can't fix without patching (~11% of edge cases)
2. Verify SOCS cookie is still valid (see `docs/MIGRATION_V3.md` for refresh instructions)
3. Run with `--workers 1` to rule out concurrency as the cause

### Performance Issues

- Reduce `--window` for faster seasonal scans
- Increase `--workers` (but watch rate limiting)
- Use `--from` with specific airports instead of `--country`
- Check cache hit rate with `cache_admin.py stats`

### Concurrency Issues

- Start with `--workers 1` to isolate non-concurrency bugs
- Gradually increase workers while monitoring error rates
- Note: Error rates can differ between sequential and concurrent execution, suggesting rate limiting or response variation

## Testing Philosophy

- **Smoke tests** in `tests/` verify each module works independently
- **Integration tests** (`-m integration`) make real API calls; use confirmed routes from `tests/confirmed_flights.json`
- Always test with `--workers 1` first when debugging to isolate concurrency issues

## Known Limitations

1. **fast-flights library dependency:** Subject to Google's anti-bot measures and API changes
2. **Rate limiting:** Large scans (100+ airports) may hit rate limits despite delays
3. **EU consent flow:** Relies on SOCS cookie workaround which may break if Google changes their system
4. **Parser bug in fast-flights:** ~11% failure rate on edge cases (gracefully handled: returns empty result)
5. **Prices are snapshots:** Not final booking prices, subject to availability changes

## Documentation

- **`README.md`**: Main entry point and usage guide
- **`docs/DEPLOYMENT.md`**: Comprehensive deployment guide (Docker + manual)
- **`docs/DOCKER_README.md`**: Docker quick-start guide
- **`docs/DECISIONS.md`**: Architecture and design decisions
- **`docs/MIGRATION_V3.md`**: fast-flights v2→v3 migration and SOCS cookie refresh
- **`docs/CACHING.md`**: SQLite caching layer reference
- **`database/schema.sql`**: Database schema with full comments
- **`tests/confirmed_flights.json`**: Ground-truth flight data for integration tests

## Environment Variables

### Web Application

**Backend (api_server.py):**
```bash
# Server configuration
PORT=8000                         # API server port
HOST=0.0.0.0                      # Bind address (0.0.0.0 for Docker)

# Database
DATABASE_PATH=/app/data/cache.db  # SQLite database path

# CORS
ALLOWED_ORIGINS=http://localhost,http://localhost:80  # Comma-separated

# Logging
LOG_LEVEL=INFO                    # DEBUG, INFO, WARNING, ERROR, CRITICAL

# Rate limiting (requests per minute per IP)
RATE_LIMIT_SCANS=10
RATE_LIMIT_LOGS=30
RATE_LIMIT_AIRPORTS=100
```

**Frontend (vite.config.ts):**
```bash
# Build-time only
VITE_API_BASE_URL=/api/v1  # API base URL (usually use proxy instead)
```

**Docker (.env file):**
```bash
# Service ports
BACKEND_PORT=8000
FRONTEND_PORT=80

# All backend variables above also apply
```

**Configuration Files:**
- `.env.example` - Template with all variables documented (72 lines)
- `frontend/vite.config.ts` - API proxy for development
- `nginx.conf` - API proxy for production

### CLI Tool Environment

No environment variables required for CLI tool. All configuration via command-line flags.

## When Making Changes

### Web Application Changes

**Before Modifying API Endpoints (api_server.py):**

1. **Always read existing code first** to understand request/response patterns
2. **Update Pydantic models** if adding new fields
3. **Add validation** with descriptive error messages
4. **Update frontend API client** (frontend/src/api.ts) with new types
5. **Add tests** in tests/test_api_endpoints.py
6. **Update rate limiting** if adding new endpoint
7. **Document in API docs** (FastAPI auto-generates from docstrings)

**Before Modifying Database Schema (database/schema.sql):**

1. **CRITICAL:** Test migration path from existing data
2. **Add migration logic** to database/init_db.py
3. **Update CHECK constraints** if changing validation rules
4. **Add/update indexes** for new query patterns
5. **Test foreign key cascades** work correctly
6. **Update tests** in tests/test_integration.py
7. **Backup production data** before applying

Example migration pattern:
```python
# database/init_db.py
import sqlite3

def migrate_to_v2(conn):
    """Add new column with default value."""
    try:
        conn.execute("ALTER TABLE scans ADD COLUMN new_field TEXT DEFAULT 'default'")
        conn.commit()
    except sqlite3.OperationalError as exc:
        # Skip only when the column already exists; re-raise any other error
        if "duplicate column name" not in str(exc).lower():
            raise
```

**Before Modifying Frontend Components:**

1. **Check TypeScript strict mode** requirements (type-only imports)
2. **Update API client types** (src/api.ts) if API changed
3. **Test responsive design** on mobile/tablet/desktop
4. **Verify error handling** with network failures
5. **Check accessibility** (keyboard navigation, screen readers)
6. **Update tests** if adding testable logic
7. **Verify production build** with `npm run build`

**Before Modifying Docker Configuration:**

1. **Validate docker-compose.yml** with `docker-compose config`
2. **Test build** with `docker-compose build --no-cache`
3. **Verify health checks** work correctly
4. **Test volume persistence** (database survives restarts)
5. **Check environment variables** are properly passed
6. **Update documentation** (`docs/DEPLOYMENT.md`, `docs/DOCKER_README.md`)
7. **Test full deployment** from scratch

**Rate Limiting Changes:**

When modifying rate limits:
1. Update constants in api_server.py
2. Update .env.example with new defaults
3. Consider impact on user experience (too strict = frustrated users)
4. Test with concurrent requests
5. Document in API response headers

**Common Pitfalls:**

1. **Forgetting foreign keys:** Add `PRAGMA foreign_keys = ON` to every connection
2. **Type-only imports:** Use `import type` for interfaces in TypeScript
3. **JSON arrays:** Remember to `json.loads()` when reading airlines from database
4. **Rate limiting:** New endpoints need rate limit decorator
5. **CORS:** Add new origins to ALLOWED_ORIGINS env var
6. **Cache invalidation:** Frontend may cache old data, handle with ETags or timestamps

### CLI Tool Changes

### Before Modifying Parser (`searcher_v3.py`)

1. Maintain the layered validation pattern: type check → empty check → None check (see lines 218-302)
2. Run `pytest tests/test_scan_pipeline.py -m integration` to verify known routes still return flights
3. Add comprehensive error logging with tracebacks for debugging

### Before Modifying Caching (`cache.py`)

1. Understand the two-table schema: searches + results
2. Remember that cache keys include ALL query parameters (origin, destination, date, seat_class, adults)
3. Test cache invalidation logic with different threshold values
4. Verify foreign key cascade deletes work correctly

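
Because every query parameter feeds the key, adding a parameter changes which cache entries match. A minimal sketch of such a key, assuming the fields are joined with `|` in the order listed and hashed with SHA-256 as UTF-8 (the function name `make_cache_key` and the exact string formatting are assumptions, not the code in `cache.py`):

```python
import hashlib


def make_cache_key(origin: str, destination: str, date: str, seat_class: str, adults: int) -> str:
    """SHA-256 hex digest over origin|destination|date|seat_class|adults."""
    raw = "|".join([origin, destination, date, seat_class, str(adults)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Two queries that differ in any one field, for example `adults=1` versus `adults=2`, produce different keys and therefore never share a cache entry.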
### Before Modifying Async Logic (`searcher_v3.py`, `main.py`)

1. Respect the sync/async boundary: fast-flights is synchronous, use `asyncio.to_thread()`
2. Always use semaphores to limit concurrency (prevent rate limiting)
3. Test with different `--workers` values (1, 3, 5, 10) to verify behavior
4. Add random delays between requests to avoid anti-bot detection

### Before Adding New CLI Arguments (`main.py`)

1. Update Click options with proper help text and defaults
2. Update `README.md` usage examples
3. Update `PRD.MD` if changing core functionality
4. Consider cache implications (new parameter = new cache key dimension)

---

## Project Status

### Web Application: ✅ PRODUCTION READY

**Completed:** All 30 steps across 4 phases (100% complete)

**Phase 1: Backend Foundation** - ✅ 10/10 steps
- Database schema with triggers and views
- FastAPI REST API with validation
- Error handling and rate limiting
- Startup cleanup for stuck scans
- Log viewer endpoint

**Phase 2: Testing Infrastructure** - ✅ 5/5 steps
- pytest configuration
- 43 passing tests (26 unit + 15 integration)
- 75% code coverage
- Database isolation in tests
- Test fixtures and factories

**Phase 3: Frontend Development** - ✅ 10/10 steps
- React + TypeScript app with Vite
- Tailwind CSS v4 styling
- 5 pages + 5 components
- Type-safe API client
- Error boundary and toast notifications
- Production build: 293 KB (93 KB gzipped)

**Phase 4: Docker Deployment** - ✅ 5/5 steps
- Multi-stage Docker builds
- Docker Compose orchestration
- Nginx reverse proxy
- Volume persistence
- Health checks and auto-restart

**Quick Start:**
```bash
docker-compose up -d
open http://localhost
```

### CLI Tool: ✅ FUNCTIONAL

- Successfully queries Google Flights via fast-flights v3 with SOCS cookie
- 89% success rate on real flight queries
- Caching system reduces API calls
- Seasonal scanning and new route detection
- Rich terminal output

**Known limitations:** fast-flights library parser bug affects ~11% of edge cases (documented in DEBUG_SESSION_2026-02-22_RESOLVED.md)

---

**Total Project:**
- ~3,300+ lines of production code
- ~2,500+ lines of documentation
- 43/43 tests passing
- Zero TODO/FIXME comments
- Docker validated
- Ready for deployment