PRD: Scheduled Scans
Status: Draft
Date: 2026-02-27
Verdict: Fully feasible; no new dependencies required
1. Problem
Every scan is triggered manually. If you want to track prices for a route over time (e.g. BDS → Germany every Monday) you have to remember to click "Re-run" yourself. Price trends are only discoverable by comparing scan history manually.
2. Goal
Let users define a recurring schedule for any scan configuration. The server runs the scan automatically at the defined cadence, building a historical record of price data over time.
3. User Stories
- As a user, I want to schedule a weekly scan of BDS → Germany so I can see how prices change without manually re-running it.
- As a user, I want to enable/disable a schedule without deleting it.
- As a user, I want to see which scans were created by a schedule and navigate to that schedule from a scan.
- As a user, I want to trigger a scheduled scan immediately without waiting for the next interval.
4. Scheduling Options
Three frequencies are sufficient for flight price tracking:
| Frequency | Parameters | Example |
|---|---|---|
| daily | hour, minute | Every day at 06:00 |
| weekly | day_of_week (0=Mon–6=Sun), hour, minute | Every Monday at 06:00 |
| monthly | day_of_month (1–28), hour, minute | 1st of every month at 06:00 |
Day of month is capped at 28 so that the chosen day exists in every month, including February. All times are stored and executed in UTC.
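The parameter rules in the table can be checked before a schedule is stored. The following is a minimal sketch using only the standard library; the helper name `validate_schedule` and its error messages are hypothetical and not part of the existing codebase.

```python
from typing import Optional

VALID_FREQUENCIES = ("daily", "weekly", "monthly")


def validate_schedule(frequency: str, hour: int = 6, minute: int = 0,
                      day_of_week: Optional[int] = None,
                      day_of_month: Optional[int] = None) -> None:
    """Raise ValueError if the parameters do not describe a valid schedule.

    Hypothetical helper: the name and messages are illustrative only.
    """
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"unknown frequency: {frequency!r}")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0-23 (UTC)")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0-59")
    # weekly schedules need a weekday, with 0=Mon and 6=Sun
    if frequency == "weekly" and (day_of_week is None or not 0 <= day_of_week <= 6):
        raise ValueError("weekly schedules need day_of_week in 0-6 (0=Mon)")
    # monthly schedules need a day that exists in every month
    if frequency == "monthly" and (day_of_month is None or not 1 <= day_of_month <= 28):
        raise ValueError("monthly schedules need day_of_month in 1-28")
```

For example, `validate_schedule("weekly", day_of_week=0)` passes silently, while `validate_schedule("monthly", day_of_month=29)` raises `ValueError`.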
5. Architecture
5.1 Scheduler Design
No new dependencies. A simple asyncio background task wakes every 60 seconds, queries the DB for due schedules, and fires a scan for each.
lifespan startup
└── asyncio.create_task(_scheduler_loop())
    └── while True:
            _check_and_run_due_schedules()  # queries DB
            await asyncio.sleep(60)
_check_and_run_due_schedules():
- SELECT * FROM scheduled_scans WHERE enabled=1 AND next_run_at <= NOW()
- For each result, skip if the previous scan for this schedule is still pending or running
- Create a new scan row (same INSERT as POST /scans)
- Call start_scan_processor(scan_id)
- Update last_run_at = NOW() and compute + store next_run_at
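The steps above can be sketched as a single pass over the database. This is an illustration under stated assumptions, not the actual implementation: timestamps are assumed to be stored as `YYYY-MM-DD HH:MM:SS` UTC strings (so string comparison matches chronological order), only daily schedules are handled to keep it short, the `scans` columns used here are simplified, and the `start_scan` callback stands in for `start_scan_processor`.

```python
import sqlite3
from datetime import datetime, timedelta
from typing import Callable, List


def check_and_run_due_schedules(conn: sqlite3.Connection, now: datetime,
                                start_scan: Callable[[int], None]) -> List[int]:
    """Run one scheduler pass and return the IDs of the scans it created."""
    now_s = now.isoformat(sep=" ", timespec="seconds")
    due = conn.execute(
        "SELECT id, origin, country, hour, minute FROM scheduled_scans "
        "WHERE enabled = 1 AND next_run_at <= ?", (now_s,)).fetchall()
    created = []
    for sched_id, origin, country, hour, minute in due:
        # Concurrency guard: skip if the previous scan is still active.
        active = conn.execute(
            "SELECT 1 FROM scans WHERE scheduled_scan_id = ? "
            "AND status IN ('pending', 'running')", (sched_id,)).fetchone()
        if not active:
            cur = conn.execute(
                "INSERT INTO scans (origin, country, status, scheduled_scan_id) "
                "VALUES (?, ?, 'pending', ?)", (origin, country, sched_id))
            created.append(cur.lastrowid)
            conn.execute("UPDATE scheduled_scans SET last_run_at = ? WHERE id = ?",
                         (now_s, sched_id))
        # Advance next_run_at whether or not a scan was started (daily only here).
        nxt = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if nxt <= now:
            nxt += timedelta(days=1)
        conn.execute("UPDATE scheduled_scans SET next_run_at = ? WHERE id = ?",
                     (nxt.isoformat(sep=" ", timespec="seconds"), sched_id))
    conn.commit()
    # Hand the new scans to the processor only after the rows are committed.
    for scan_id in created:
        start_scan(scan_id)
    return created
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the pass deterministic and easy to test.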
5.2 next_run_at Computation
Precomputed in Python after every run (and on create/update). Stored as a TIMESTAMP column with an index — scheduler lookup is a single indexed range query.
from datetime import datetime, timedelta, timezone

def compute_next_run(frequency, hour, minute,
                     day_of_week=None, day_of_month=None,
                     after=None) -> datetime:
    """Return the first run time strictly after `after` (default: now), as naive UTC."""
    # datetime.utcnow() is deprecated since Python 3.12; this yields the same naive UTC value.
    now = after or datetime.now(timezone.utc).replace(tzinfo=None)
    base = now.replace(hour=hour, minute=minute, second=0, microsecond=0)

    if frequency == 'daily':
        return base if base > now else base + timedelta(days=1)

    elif frequency == 'weekly':
        days_ahead = (day_of_week - now.weekday()) % 7
        if days_ahead == 0 and base <= now:
            days_ahead = 7  # today's slot has passed: wait a full week
        return base + timedelta(days=days_ahead)

    elif frequency == 'monthly':
        # day_of_month <= 28, so replace() is valid in every month
        candidate = base.replace(day=day_of_month)
        if candidate <= now:
            m, y = (now.month % 12) + 1, now.year + (1 if now.month == 12 else 0)
            candidate = candidate.replace(year=y, month=m)
        return candidate

    raise ValueError(f"Unknown frequency: {frequency!r}")
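A few worked values make the rollover rules concrete. The snippet below restates `compute_next_run` in condensed form so that it runs on its own, then evaluates it at fixed reference times; the reference times are chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone


def compute_next_run(frequency, hour, minute, day_of_week=None,
                     day_of_month=None, after=None):
    """Condensed restatement of the function above so this snippet runs alone."""
    now = after or datetime.now(timezone.utc).replace(tzinfo=None)
    base = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if frequency == "daily":
        return base if base > now else base + timedelta(days=1)
    if frequency == "weekly":
        days_ahead = (day_of_week - now.weekday()) % 7
        if days_ahead == 0 and base <= now:
            days_ahead = 7
        return base + timedelta(days=days_ahead)
    if frequency == "monthly":
        candidate = base.replace(day=day_of_month)
        if candidate <= now:
            month = now.month % 12 + 1
            year = now.year + (1 if now.month == 12 else 0)
            candidate = candidate.replace(year=year, month=month)
        return candidate
    raise ValueError(f"unknown frequency: {frequency!r}")


# Reference time: Friday 2026-02-27 at 07:30 UTC, after the 06:00 slot.
ref = datetime(2026, 2, 27, 7, 30)

daily = compute_next_run("daily", 6, 0, after=ref)
# 2026-02-28 06:00 (today's 06:00 has passed)

weekly = compute_next_run("weekly", 6, 0, day_of_week=0, after=ref)
# 2026-03-02 06:00 (the next Monday)

monthly = compute_next_run("monthly", 6, 0, day_of_month=1, after=ref)
# 2026-03-01 06:00 (1 February has passed)

year_end = compute_next_run("monthly", 6, 0, day_of_month=1,
                            after=datetime(2026, 12, 15))
# 2027-01-01 06:00 (December rolls over into the next year)
```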
6. Schema Changes
6.1 New table: scheduled_scans
CREATE TABLE IF NOT EXISTS scheduled_scans (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,

    -- Scan parameters
    origin          TEXT NOT NULL CHECK(length(origin) = 3),
    country         TEXT NOT NULL CHECK(length(country) >= 2),
    window_months   INTEGER NOT NULL DEFAULT 1
                    CHECK(window_months >= 1 AND window_months <= 12),
    seat_class      TEXT NOT NULL DEFAULT 'economy',
    adults          INTEGER NOT NULL DEFAULT 1
                    CHECK(adults > 0 AND adults <= 9),

    -- Schedule definition
    frequency       TEXT NOT NULL
                    CHECK(frequency IN ('daily', 'weekly', 'monthly')),
    hour            INTEGER NOT NULL DEFAULT 6
                    CHECK(hour >= 0 AND hour <= 23),
    minute          INTEGER NOT NULL DEFAULT 0
                    CHECK(minute >= 0 AND minute <= 59),
    day_of_week     INTEGER CHECK(day_of_week >= 0 AND day_of_week <= 6),
    day_of_month    INTEGER CHECK(day_of_month >= 1 AND day_of_month <= 28),

    -- State
    enabled         INTEGER NOT NULL DEFAULT 1,
    label           TEXT,
    last_run_at     TIMESTAMP,
    next_run_at     TIMESTAMP NOT NULL,
    created_at      TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at      TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,

    -- Frequency-specific constraints
    CHECK(
        (frequency = 'weekly'  AND day_of_week  IS NOT NULL) OR
        (frequency = 'monthly' AND day_of_month IS NOT NULL) OR
        (frequency = 'daily')
    )
);

-- Fast lookup of due schedules
CREATE UNIQUE INDEX IF NOT EXISTS uq_scheduled_scans_id
    ON scheduled_scans(id);
CREATE INDEX IF NOT EXISTS idx_scheduled_scans_next_run
    ON scheduled_scans(next_run_at)
    WHERE enabled = 1;

-- Auto-update updated_at
CREATE TRIGGER IF NOT EXISTS update_scheduled_scans_timestamp
AFTER UPDATE ON scheduled_scans
FOR EACH ROW BEGIN
    UPDATE scheduled_scans SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

-- Insert schema version bump
INSERT OR IGNORE INTO schema_version (version, description)
VALUES (2, 'Add scheduled_scans table');
6.2 Add FK column to scans
-- Migration: add scheduled_scan_id to scans
ALTER TABLE scans ADD COLUMN scheduled_scan_id INTEGER
    REFERENCES scheduled_scans(id) ON DELETE SET NULL;

CREATE INDEX IF NOT EXISTS idx_scans_scheduled_scan_id
    ON scans(scheduled_scan_id)
    WHERE scheduled_scan_id IS NOT NULL;
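One detail worth checking when relying on ON DELETE SET NULL: SQLite enforces foreign-key actions only on connections where `PRAGMA foreign_keys = ON` has been issued. Whether the existing connection helper already does this is not stated here, so the following is a small self-contained sketch (with simplified tables) showing the behaviour once the pragma is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Without this pragma SQLite parses the REFERENCES clause but does not act on it.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE scheduled_scans (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT);
    CREATE TABLE scans (id INTEGER PRIMARY KEY AUTOINCREMENT, status TEXT);
    ALTER TABLE scans ADD COLUMN scheduled_scan_id INTEGER
        REFERENCES scheduled_scans(id) ON DELETE SET NULL;
    INSERT INTO scheduled_scans (label) VALUES ('weekly BDS');
    INSERT INTO scans (status, scheduled_scan_id) VALUES ('running', 1);
""")

# Deleting the schedule keeps the scan row and clears its link.
conn.execute("DELETE FROM scheduled_scans WHERE id = 1")
remaining = conn.execute(
    "SELECT id, status, scheduled_scan_id FROM scans").fetchall()
# remaining == [(1, 'running', None)]
```

If the pragma is omitted, the same DELETE succeeds but leaves `scheduled_scan_id = 1` pointing at a row that no longer exists.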
7. Migration (database/init_db.py)
Add a migration function, called before executescript(schema_sql):
import sqlite3

def _migrate_add_scheduled_scans(conn, verbose=True):
    """Migration: create scheduled_scans table and add FK to scans."""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name='scheduled_scans'"
    )
    if cursor.fetchone():
        return  # Already exists

    if verbose:
        print(" 🔄 Migrating: adding scheduled_scans table...")

    conn.execute("""
        CREATE TABLE scheduled_scans (
            id INTEGER PRIMARY KEY AUTOINCREMENT, ...
        )
    """)

    # Add scheduled_scan_id to existing scans table
    try:
        conn.execute(
            "ALTER TABLE scans ADD COLUMN scheduled_scan_id INTEGER "
            "REFERENCES scheduled_scans(id) ON DELETE SET NULL"
        )
    except sqlite3.OperationalError:
        pass  # Column already exists

    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_scans_scheduled_scan_id "
        "ON scans(scheduled_scan_id) WHERE scheduled_scan_id IS NOT NULL"
    )
    conn.commit()

    if verbose:
        print(" ✅ Migration complete: scheduled_scans table created")
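The try/except around ALTER TABLE treats any OperationalError as "column already exists". A possible alternative is to inspect the table first, so that unrelated errors still surface. The sketch below assumes a hypothetical helper named `column_exists`, which is not part of the existing init_db.py.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` already has a column named `column`.

    `table` is interpolated into the SQL, so pass only trusted, hard-coded names.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


# Usage: add the column only when it is missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (id INTEGER PRIMARY KEY)")
if not column_exists(conn, "scans", "scheduled_scan_id"):
    conn.execute("ALTER TABLE scans ADD COLUMN scheduled_scan_id INTEGER")
```

Running the guarded ALTER TABLE a second time is then a no-op rather than a swallowed exception.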
8. API Endpoints
All under /api/v1/schedules. Rate limit: 30 req/min per IP (same as scans list).
| Method | Path | Description |
|---|---|---|
| GET | /schedules | List all schedules (paginated) |
| POST | /schedules | Create a schedule |
| GET | /schedules/{id} | Schedule details + last 5 scan IDs |
| PATCH | /schedules/{id} | Update (enable/disable, change frequency/params) |
| DELETE | /schedules/{id} | Delete schedule (scans are kept, FK set to NULL) |
| POST | /schedules/{id}/run-now | Trigger immediately (ignores next_run_at) |
Request model: CreateScheduleRequest
from typing import List, Optional
from pydantic import BaseModel

class CreateScheduleRequest(BaseModel):
    origin: str                                # 3-char IATA
    country: Optional[str] = None              # 2-letter ISO country code
    destinations: Optional[List[str]] = None   # Alternative: list of IATA codes
    window_months: int = 1                     # Months of data per scan run
    seat_class: str = 'economy'
    adults: int = 1
    label: Optional[str] = None                # Human-readable name
    frequency: str                             # 'daily' | 'weekly' | 'monthly'
    hour: int = 6                              # UTC hour (0–23)
    minute: int = 0                            # UTC minute (0–59)
    day_of_week: Optional[int] = None          # Required when frequency='weekly' (0=Mon)
    day_of_month: Optional[int] = None         # Required when frequency='monthly' (1–28)
Response model: Schedule
class Schedule(BaseModel):
    id: int
    origin: str
    country: str
    window_months: int
    seat_class: str
    adults: int
    label: Optional[str]
    frequency: str
    hour: int
    minute: int
    day_of_week: Optional[int]
    day_of_month: Optional[int]
    enabled: bool
    last_run_at: Optional[str]
    next_run_at: str
    created_at: str
    recent_scan_ids: List[int]   # Last 5 scans created by this schedule
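The `recent_scan_ids` field could be filled with one query against the scans table. A minimal sketch follows, assuming "last 5" means the five highest scan IDs (AUTOINCREMENT IDs grow with creation order); the helper name is hypothetical.

```python
import sqlite3
from typing import List


def recent_scan_ids(conn: sqlite3.Connection, schedule_id: int,
                    limit: int = 5) -> List[int]:
    """Return the IDs of the most recent scans created by a schedule, newest first."""
    rows = conn.execute(
        "SELECT id FROM scans WHERE scheduled_scan_id = ? "
        "ORDER BY id DESC LIMIT ?",
        (schedule_id, limit)).fetchall()
    return [row[0] for row in rows]
```

The partial index idx_scans_scheduled_scan_id from section 6.2 is a candidate for serving this lookup.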
9. Scheduler Lifecycle (api_server.py)
9.1 Startup
In the existing lifespan() context manager, after existing startup code:
scheduler_task = asyncio.create_task(_scheduler_loop())
logger.info("Scheduled scan background task started")
yield
scheduler_task.cancel()
try:
    await scheduler_task
except asyncio.CancelledError:
    pass
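The start-and-cancel pattern can be exercised in isolation. The sketch below is a stand-alone illustration rather than the code in api_server.py: the loop body only appends to a list in place of `_check_and_run_due_schedules()`, the interval is shortened so the demo finishes quickly, and the `demo` coroutine is hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager

ticks = []


async def _scheduler_loop(interval: float = 60.0) -> None:
    while True:
        ticks.append("checked")   # stand-in for _check_and_run_due_schedules()
        await asyncio.sleep(interval)


@asynccontextmanager
async def lifespan(app=None):
    task = asyncio.create_task(_scheduler_loop(interval=0.01))
    try:
        yield
    finally:
        # Shutdown: cancel the loop and wait until the cancellation is processed.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass


async def demo() -> int:
    async with lifespan():
        await asyncio.sleep(0.05)  # the app would serve requests here
    return len(ticks)
```

Calling `asyncio.run(demo())` returns the number of passes made before shutdown. Because the loop checks first and sleeps afterwards, at least one pass happens right after startup.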
9.2 Missed runs on restart
When the server starts, _check_and_run_due_schedules() fires immediately (before the 60-second sleep), catching any schedules that were due while the server was down. Each overdue schedule runs exactly once — next_run_at is then advanced to the next future interval. Multiple missed intervals are not caught up.
9.3 Concurrency guard
Before firing a scan for a schedule, check:
running = conn.execute("""
    SELECT id FROM scans
    WHERE scheduled_scan_id = ? AND status IN ('pending', 'running')
""", (schedule_id,)).fetchone()
if running:
    logger.info(f"Schedule {schedule_id}: previous scan {running[0]} still active, skipping this run")
    # next_run_at must still be advanced before this continue (update not shown here);
    # otherwise the schedule stays due and is re-checked on every 60-second pass
    continue
10. Frontend Changes
10.1 New page: Schedules.tsx
List view:
- Table of all schedules: label, origin → country, frequency, next run (local time), last run, enabled toggle
- "New Schedule" button opens create form (same airport search component as Scans)
- Inline enable/disable toggle (PATCH request, optimistic update)
- "Run now" button per row
Create form fields (below existing scan form fields):
- Frequency selector: Daily / Weekly / Monthly (segmented button)
- Time of day: hour:minute picker (UTC, with note)
- Day of week (shown only for Weekly): Mon–Sun selector
- Day of month (shown only for Monthly): 1–28 number input
- Optional label field
10.2 Modified: ScanDetails.tsx
When a scan has scheduled_scan_id, show a small "Scheduled" chip in the header with a link to /schedules/{scheduled_scan_id}.
10.3 Navigation (Layout.tsx)
Add "Schedules" link to sidebar between Scans and Airports.
10.4 API client (api.ts)
export interface Schedule {
  id: number;
  origin: string;
  country: string;
  window_months: number;
  seat_class: string;
  adults: number;
  label?: string;
  frequency: 'daily' | 'weekly' | 'monthly';
  hour: number;
  minute: number;
  day_of_week?: number;
  day_of_month?: number;
  enabled: boolean;
  last_run_at?: string;
  next_run_at: string;
  created_at: string;
  recent_scan_ids: number[];
}

export const scheduleApi = {
  list: (page = 1, limit = 20) =>
    api.get<PaginatedResponse<Schedule>>('/schedules', { params: { page, limit } }),
  get: (id: number) =>
    api.get<Schedule>(`/schedules/${id}`),
  create: (data: CreateScheduleRequest) =>
    api.post<Schedule>('/schedules', data),
  update: (id: number, data: Partial<CreateScheduleRequest> & { enabled?: boolean }) =>
    api.patch<Schedule>(`/schedules/${id}`, data),
  delete: (id: number) =>
    api.delete(`/schedules/${id}`),
  runNow: (id: number) =>
    api.post<{ scan_id: number }>(`/schedules/${id}/run-now`),
};
11. Edge Cases
| Case | Handling |
|---|---|
| Previous scan still running at next interval | Skip this interval's run, advance next_run_at, log the skip |
| Server down when schedule is due | On startup, runs any overdue schedule once; does not catch up multiple missed intervals |
| Schedule deleted while scan is running | ON DELETE SET NULL on FK — scan continues, scheduled_scan_id becomes NULL |
| window_months covers past dates | Scan start date is always "tomorrow" at creation time, same as manual scans |
| Monthly with day_of_month=29..31 | Capped at 28 in validation — avoids invalid dates in all months |
| Simultaneous due schedules | Each creates an independent asyncio task; existing max_workers=3 semaphore in scan_processor limits total API concurrency across all running scans |
| Schedule created at 05:59, fires at 06:00 UTC | next_run_at is computed at creation time: 06:00 today is still in the future, so the first run is one minute later; had 06:00 already passed, the first run would be tomorrow |
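The "simultaneous due schedules" row relies on a semaphore to cap concurrency. The sketch below shows that mechanism in isolation using asyncio.Semaphore; it is an illustration of the technique, not the code in scan_processor, and `fake_scan` stands in for the real scan work.

```python
import asyncio

MAX_WORKERS = 3  # mirrors the max_workers=3 figure quoted in the table


async def run_scans(n_scans: int) -> int:
    """Start n_scans tasks at once and return the peak number running together."""
    semaphore = asyncio.Semaphore(MAX_WORKERS)
    running = 0
    peak = 0

    async def fake_scan() -> None:
        nonlocal running, peak
        async with semaphore:          # at most MAX_WORKERS tasks pass this point
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for the real API calls
            running -= 1

    await asyncio.gather(*(fake_scan() for _ in range(n_scans)))
    return peak
```

With ten tasks started together, `asyncio.run(run_scans(10))` reports a peak of 3: the remaining tasks wait on the semaphore until a slot frees up.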
12. Files Changed
| File | Change |
|---|---|
| database/schema.sql | Add scheduled_scans table, trigger, indexes, schema_version bump |
| database/init_db.py | _migrate_add_scheduled_scans() + call in initialize_database() |
| api_server.py | compute_next_run(), _scheduler_loop(), _check_and_run_due_schedules(), 6 new endpoints, lifespan update, new Pydantic models |
| frontend/src/api.ts | Schedule type, CreateScheduleRequest type, scheduleApi object |
| frontend/src/pages/Schedules.tsx | New page (list + inline create form) |
| frontend/src/pages/ScanDetails.tsx | "Scheduled" badge + link when scheduled_scan_id present |
| frontend/src/components/Layout.tsx | Schedules nav link |
Total: 7 files. Estimated ~500 new lines (backend ~250, frontend ~250).
13. Out of Scope
- Notifications / alerts when a scheduled scan completes (email, webhook)
- Per-schedule price change detection / diffing between runs
- Timezone-aware scheduling (all times UTC for now)
- Pause/resume of scheduled scans (separate PRD)
- Rate limiting across simultaneous scheduled scans (existing semaphore provides soft protection)
- Dashboard widgets for upcoming scheduled runs