FlightBook Documentation

AI Capability Strategy & Detailed Feature Documentation

Project: Flight Booking Meta‑Search Platform (FP-01-TicketBooking)
Document Type: Multi‑page AI Product & Engineering Specification (Markdown)
Version: 1.0
Date: 2026-02-26
Owner: Product + Architecture (AI Workstream)


Document Control

| Field | Value |
| --- | --- |
| Product | Flight Booking Meta‑Search Platform |
| Scope | Baseline SOW + scope-frozen feature list (Search → Compare → Verify → Redirect + Alerts + SEO + Admin + Partner Mgmt) |
| Primary Goals | Increase conversion, improve trust, reduce cost per search, increase monetization yield, strengthen defensibility |
| Non‑Goals | Turning meta-search into an OTA; storing/issuing tickets; post‑booking servicing beyond assistive guidance |
| Status | Draft for engineering planning |
| Change Policy | Scope changes require explicit stakeholder approval |

1. Executive Summary

This document defines AI capabilities that can be integrated into the Flight Booking Meta‑Search Platform to:

  1. Increase user conversion (click‑out CTR, booking yield, repeat usage)
  2. Improve profitability (ad yield, supplier optimization, fraud protection, reduced infra cost)
  3. Create differentiation (UX trust, ranking quality, personalization, predictive insights)
  4. Scale operations (partner onboarding, support tooling, SEO production, analytics copilots)

The approach is not “AI everywhere”. It is a targeted AI portfolio that strengthens the platform’s core loops:

  • Search request → supplier fan‑out → unify offers → rank/sort → display → price verify → redirect
  • Alerts → engagement → return sessions
  • Partner management → provider quality → monetization → reconciliation
  • SEO pages → demand acquisition → search sessions
  • Analytics → continuous improvement

2. Principles & Guardrails

2.1 Platform AI Principles

  1. Deterministic core, probabilistic assist
    • AI can propose / rank / explain, but the system enforces schema, policy, and safety.
  2. No hallucinations in transactional surfaces
    • Any user-facing “facts” must be derived from: supplier responses, fare rules, verified aggregates, or explicit user inputs.
  3. Traceability
    • Every AI decision should be logged with: model version, inputs (hashed/redacted), outputs, confidence, and downstream impact.
  4. User trust beats short-term revenue
    • Sponsored placements and monetization optimization must respect relevance and quality constraints.
  5. Privacy-first
    • Minimize PII usage; use pseudonymous IDs; implement retention controls and access policies.

2.2 Safety / Compliance Guardrails

  • Personal data: Do not send raw PII to third-party models unless contractually approved; tokenize/redact where possible.
  • Model outputs: enforce JSON schema validation for structured outputs; reject and fallback on parse failures.
  • Explainability: avoid disallowed medical/legal/financial claims; keep price predictions and advice qualified and confidence-based.
  • Security: protect prompts, keys, and model endpoints; rate-limit; detect prompt injection in any content-based flows.
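The schema-enforcement guardrail above can be sketched as a thin wrapper: parse the model output, check it against a required schema, and fall back when anything fails. The schema and field names here are illustrative, not the platform's actual contract.

```python
import json

# Illustrative schema: required fields and their expected types.
SEARCH_SCHEMA = {"origin": str, "destination": str, "pax": int}

def parse_structured_output(raw: str, fallback: dict) -> dict:
    """Validate a model's structured output; reject and fall back on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    for field, expected_type in SEARCH_SCHEMA.items():
        if not isinstance(data.get(field), expected_type):
            return fallback  # missing or wrongly typed field -> reject
    return data
```

The fallback here would typically route the user to the classic search form rather than surfacing a partial parse.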

3. AI Capability Map (From All Directions)

This section lists AI features grouped by business outcome.

A. User-Facing Differentiators (Conversion + Retention)

  1. Natural Language Flight Search (NL→Query)
  2. Personalized “Best Value” Ranking (LTR + preferences)
  3. Offer Explanations (“Why this”, “What’s included”)
  4. Price Prediction (“Book now / Wait” with confidence)
  5. Smart Alerts (meaningful notifications, suppression)
  6. Trip Assistant (pre/post click-out guidance)

B. Profit & Unit Economics (Revenue ↑, Cost ↓)

  1. Supplier Call Optimization (reduce wasted fan‑out)
  2. Provider Quality Scoring & Suppression
  3. Sponsored Placement Optimizer (relevant monetization)
  4. Fraud/IVT Detection (protect CPC/CPA + partners)
  5. Price Mismatch Prediction (label stability, reduce disputes)

C. Internal Ops & Moat (Scale + Speed)

  1. Support Copilot (session timeline aware)
  2. Partner Onboarding Copilot (mapping + tests)
  3. AI-Assisted SEO at Scale (template + data-grounding)
  4. Analytics Copilot in Admin (ask-your-data + actioning)

4. Shared Data Foundation (Required for Most AI Features)

4.1 Event Taxonomy (Minimum)

Search Session Events

  • search_id (UUID), user_id (pseudonymous), device_id (hash)
  • origin, destination, dates (exact or flex), pax, cabin
  • filters applied, sort chosen, currency, locale
  • timestamp + geo inference (coarse)

Offer / Impression Events

  • offer_id, provider_id, rank_position, price_total, taxes_fees, baggage, refundability
  • duration, stops, departure/arrival times
  • impression_id, visible modules, sponsored flags

Click-Out / Redirect Events

  • click_id, offer_id, provider_id, deep link params, timestamp
  • redirect success/failure reason, latency

Postback / Conversion (Where Available)

  • booking confirmed, value, commission, timestamp, provider confirmation
  • attribution windows + dedupe keys

Quality / Issue Signals

  • price mismatch detected (pre-redirect verify vs provider landing vs user feedback)
  • hidden fees complaints / landing errors
  • support ticket tags

Maintain a feature store (logical or physical) to feed ranking, predictions, and anomaly detection.

Example feature groups

  • User preferences: airline affinity, departure-time preference, stop tolerance
  • Route stats: typical price bands, seasonal spikes, average duration
  • Provider stats: mismatch rate, latency distribution, conversion rate
  • Offer stats: price delta vs median, historical volatility
  • Fraud stats: click burstiness, IP/device repetition, bot signals

5. Reference Architecture for AI Integration

5.1 High-Level Components

  • AI Gateway Service
    • centralizes model access (LLM + ML)
    • provides auth, rate-limits, redaction, logging, caching
  • Ranking Service
    • executes deterministic + ML ranking
    • enforces policy constraints
  • Prediction Service
    • price movement, mismatch probability, conversion likelihood
  • Quality Service
    • provider scoring, complaint aggregation, suppression rules
  • Fraud Service
    • anomaly detection and risk scoring
  • Admin Copilots
    • analytics + support assistant tools with safe access controls

5.2 Critical Pattern: “AI Proposes, System Decides”

  1. AI suggests structure/ranking/explanations
  2. Backend validates against strict schema and policy
  3. System logs the decision, confidence, and outcome
  4. Fallbacks exist for any model failure

5.3 Observability Requirements

Log fields (minimum):

  • ai_model, ai_model_version, prompt_template_id
  • input_hash, output_hash, confidence, latency_ms
  • decision_path: ai_applied / fallback_used
  • business outcomes: CTR delta, conversion delta, mismatch delta
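A minimal sketch of assembling the log record above, assuming hashed payloads via SHA-256 over canonical JSON so raw inputs never enter the log stream; the function and field names are illustrative.

```python
import hashlib
import json
import time

def hash_payload(payload: dict) -> str:
    """Stable short hash of a payload; raw inputs stay out of logs."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

def build_ai_log(model: str, version: str, template_id: str,
                 inputs: dict, outputs: dict, confidence: float,
                 latency_ms: int, decision_path: str) -> dict:
    """Assemble the minimum observability record listed above."""
    return {
        "ai_model": model,
        "ai_model_version": version,
        "prompt_template_id": template_id,
        "input_hash": hash_payload(inputs),
        "output_hash": hash_payload(outputs),
        "confidence": confidence,
        "latency_ms": latency_ms,
        "decision_path": decision_path,  # "ai_applied" or "fallback_used"
        "ts": int(time.time()),
    }
```

Business-outcome deltas (CTR, conversion, mismatch) would be joined to these records downstream rather than logged inline.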

6. Detailed Feature Specifications

Each feature includes: goal, UX, inputs, outputs, algorithm, guardrails, metrics, and rollout plan.


6.1 Natural Language Flight Search (NL→Query)

Goal

Reduce friction at the top of funnel by enabling users to describe intent in plain language and convert it into structured search parameters.

User Stories

  • As a user, I can type “cheapest weekend next month DXB→IST no long layovers” and see results.
  • As a user, I can say “leave after 6pm and return Monday morning”.
  • As a user, I can correct the AI: “actually 2 adults and 1 child” and it updates search.

UX Requirements

  • Add a “Try natural language” placeholder + example chips.
  • Show parsed parameters above results for transparency and editing.
  • If parsing fails, show a gentle fallback to classic search form.

Inputs

  • User text query + locale + currency + optional location context (home airport).

Outputs

  • Structured JSON:
    • origin, destination, date(s) or flex window
    • pax breakdown, cabin, max stops, max layover duration, time-of-day constraints
    • budget if provided

Implementation Approach

  • LLM-based parser with constrained decoding into schema.
  • Validation layer corrects invalid airports/dates; fallback suggestions for ambiguity.

Guardrails

  • Never call suppliers with unvalidated params.
  • Do not fabricate airport codes; must match known airport dataset.
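The "no fabricated airport codes" guardrail reduces to a membership check against the known airport dataset before any supplier call. A minimal sketch, with an illustrative airport subset and field names:

```python
KNOWN_AIRPORTS = {"DXB", "IST", "LHR", "JFK"}  # illustrative subset of the real dataset

def validate_parsed_query(parsed: dict) -> tuple[bool, list[str]]:
    """Return (ok, problems). Suppliers are only called when ok is True."""
    problems = []
    for side in ("origin", "destination"):
        code = str(parsed.get(side, "")).upper()
        if code not in KNOWN_AIRPORTS:
            problems.append(f"unknown {side}: {code!r}")
    if parsed.get("origin") == parsed.get("destination"):
        problems.append("origin equals destination")
    return (not problems, problems)
```

On failure, the problems list drives the fallback suggestions mentioned under Implementation Approach.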

Success Metrics

  • Search completion rate ↑
  • Time-to-first-results ↓
  • CTR ↑ vs control for NL cohort

Rollout

  • Phase 1: Beta behind feature flag; collect parse accuracy stats.
  • Phase 2: Default for a subset of traffic; add personalization (home airport).

6.2 Personalized “Best Value” Ranking (Learning-to-Rank)

Goal

Increase CTR and downstream booking yield by ranking offers based on utility rather than price only.

Key Idea

Optimize for a combined objective:

  • Predicted CTR × Predicted conversion × Expected commission − Penalties (mismatch/latency/poor quality)

Inputs

  • Offer features: price, duration, stops, baggage, times, airline, provider
  • User features: preference embeddings or rules (if consented)
  • Provider quality features (mismatch rate, conversion history)

Outputs

  • Ordered list of offers + “why” explanation tags.

Algorithm Options

  1. Heuristic baseline (immediate):
    • Weighted scoring function
  2. Gradient boosted ranking (mid-term):
    • LambdaMART / XGBoost rank
  3. Deep ranking (later):
    • embeddings for user intent and itinerary similarity
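The heuristic baseline (option 1) can be sketched directly from the combined objective above: expected yield (predicted CTR × predicted conversion × commission) minus quality penalties. Field names and weights are illustrative, not a tuned production model.

```python
WEIGHTS = {"mismatch": 0.5, "latency": 0.1}  # illustrative penalty weights

def score_offer(offer: dict, weights: dict) -> float:
    """Heuristic 'best value' score: expected yield minus quality penalties."""
    expected_yield = offer["p_ctr"] * offer["p_conv"] * offer["commission"]
    penalty = (weights["mismatch"] * offer["mismatch_prob"]
               + weights["latency"] * offer["latency_norm"])
    return expected_yield - penalty

def rank_offers(offers: list[dict]) -> list[dict]:
    """Order offers by descending heuristic score."""
    return sorted(offers, key=lambda o: score_offer(o, WEIGHTS), reverse=True)
```

Note the penalty term lets a higher-CTR offer rank below a cleaner one when its mismatch probability is high, which is exactly the hard-constraint behavior the guardrails below call for.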

Guardrails

  • Hard constraints: if mismatch probability above threshold, demote or label.
  • Sponsored offers must pass relevance and quality floors.
  • Always allow user to switch to Price/Duration sorts.

Metrics

  • CTR @ top 3 ↑
  • Booking yield ↑
  • Mismatch complaints ↓
  • Revenue per search ↑

6.3 Offer Explanations (“Why this price?”, “What’s included?”)

Goal

Build trust and reduce decision anxiety, improving click-out and retention.

UX

  • Inline tags: “Includes 20kg bag”, “Short layover”, “Refundable”
  • Expandable “Why this price” panel with 3–5 grounded bullet points.

Inputs

  • Fare rules (where available)
  • Historical route price bands
  • Offer features (timing, seasonality proxies)

Output

  • Short natural language explanation + structured tags.

Guardrails

  • Explanations must reference measurable signals (e.g., “weekend”, “short notice”) rather than invented reasons.
  • If confidence low or signals missing, show “We don’t have enough data to explain price changes.”

Metrics

  • CTR ↑
  • Bounce ↓
  • Support tickets about “confusing pricing” ↓

6.4 Price Prediction (“Book Now / Wait” + Confidence)

Goal

Help users decide and drive alert subscriptions and earlier conversion.

Inputs

  • Route/date price history
  • Days-to-departure
  • Seasonality features, event markers if available
  • Price volatility and trend

Outputs

  • Recommendation: BUY / WAIT / UNCERTAIN
  • Confidence: Low/Medium/High
  • Short reason string (grounded)

Model Approach

  • Baseline: quantile regression + volatility band
  • Later: time-series models (Prophet-like or gradient boosting with lag features)
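A minimal sketch of the quantile-band baseline: place the current price inside the route's historical price band and only recommend when it sits in an extreme quartile. The quartile thresholds, minimum-history cutoff, and confidence labels are illustrative.

```python
from statistics import quantiles

def buy_or_wait(price_history: list[float], current_price: float) -> tuple[str, str]:
    """Baseline 'book now / wait' from a route's historical price band."""
    if len(price_history) < 20:
        return ("UNCERTAIN", "LOW")  # too little data: never guess
    q25, _q50, q75 = quantiles(price_history, n=4)
    if current_price <= q25:
        return ("BUY", "HIGH")     # already in the cheapest quartile
    if current_price >= q75:
        return ("WAIT", "MEDIUM")  # in the most expensive quartile
    return ("UNCERTAIN", "LOW")    # mid-band: recommend an alert instead
```

The UNCERTAIN state maps to the guardrail below: show confidence and offer an alert instead of a decision.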

Guardrails

  • Always phrased probabilistically; never “guarantees”.
  • Show confidence; allow user to set alert instead of decision.

Metrics

  • Alert opt-in ↑
  • Time-to-book ↓
  • Return session rate ↑

6.5 Smart Alerts (Meaningful Notifications)

Goal

Reduce notification fatigue and increase alert-to-booking conversion.

Inputs

  • User alert rules (route/date/cabin/price)
  • Predicted meaningful change threshold
  • User engagement propensity

Outputs

  • When to notify, what to say, what to recommend (alternative dates/routes)

Guardrails

  • Respect quiet hours & caps.
  • Never spam; if multiple changes, summarize.

Metrics

  • Alert CTR ↑
  • Unsubscribe rate ↓
  • Alert→booking yield ↑

6.6 Trip Assistant (Pre/Post Click-out Guidance)

Goal

Add value without becoming an OTA.

Capabilities

  • Reminders: baggage, visa considerations (generic), airport transfer suggestions
  • Summaries: itinerary recap, key constraints to watch on provider page

Guardrails

  • No authoritative visa guarantees; only guidance and links (if available in your content system).
  • Never claim booking status unless confirmed via postback.

7. Profit & Unit-Economics AI Features

7.1 Supplier Call Optimization (Fan-out Planner)

Goal

Reduce cost per search and latency while preserving result quality.

Inputs

  • Query features (route, date range, cabin)
  • Supplier coverage likelihood
  • Expected competitiveness, latency, failure rate
  • Expected yield (commission propensity)

Output

  • A ranked set of suppliers to call, with budgets and timeout strategy.

Implementation

  • Start rule-based with learned priors.
  • Move to bandit / reinforcement-like exploration: occasionally sample less-used suppliers to refresh priors.

Metrics

  • Calls per search ↓
  • P95 latency ↓
  • Result coverage maintained (or ↑)
  • Revenue per infra $ ↑

7.2 Provider Quality Scoring & Suppression

Goal

Prevent “bad providers” from degrading trust and metrics.

Signals

  • mismatch rate, redirect errors, hidden fees complaints
  • conversion rate, support ticket tags, postback discrepancies

Actions

  • demote, label, throttle, or suppress provider
  • apply stricter verification requirements for low-quality providers

Metrics

  • Mismatch complaints ↓
  • CTR stable or ↑
  • Partner disputes ↓

7.3 Sponsored Placement Optimizer

Goal

Increase monetization while preserving relevance and trust.

Inputs

  • predicted CTR, predicted conversion, expected commission/CPC
  • relevance match score between query intent and sponsored offer
  • provider quality score

Guardrails

  • Relevance floor required
  • Quality floor required
  • Frequency caps per session/user
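The three guardrails above compose into a simple eligibility filter applied before any yield optimization. Thresholds and field names are illustrative; in production they would be tuned and experiment-controlled.

```python
RELEVANCE_FLOOR = 0.6  # illustrative thresholds
QUALITY_FLOOR = 0.5
SESSION_CAP = 2

def eligible_sponsored(candidates: list[dict], shown_this_session: int) -> list[dict]:
    """Drop sponsored candidates that fail the relevance/quality floors or
    would exceed the per-session frequency cap; then order by expected RPM."""
    slots = max(0, SESSION_CAP - shown_this_session)
    passing = [c for c in candidates
               if c["relevance"] >= RELEVANCE_FLOOR and c["quality"] >= QUALITY_FLOOR]
    passing.sort(key=lambda c: c["expected_rpm"], reverse=True)
    return passing[:slots]
```

Applying the floors before ranking by RPM is the design point: monetization only optimizes within the trust-safe set.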

Metrics

  • RPM ↑
  • Retention stable
  • Complaint rate stable

7.4 Fraud / IVT Detection

Goal

Protect ad budgets and partner trust.

Signals

  • click bursts, abnormal IP/device patterns
  • bot heuristics, impossible geo movement, repetitive paths
  • mismatch between clicks and postbacks

Outputs

  • risk score per click/session
  • actions: throttle, challenge, exclude from billing, flag for review
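The click-burst signal can be computed with a per-key sliding window: flag when one IP or device hash produces more clicks inside the window than a threshold allows. Window size and threshold here are illustrative.

```python
from collections import deque

class BurstDetector:
    """Flag click bursts: more than `max_clicks` from one key inside
    `window_s` seconds."""
    def __init__(self, window_s: float = 10.0, max_clicks: int = 5):
        self.window_s = window_s
        self.max_clicks = max_clicks
        self.events: dict[str, deque] = {}

    def record(self, key: str, ts: float) -> bool:
        """Record a click for `key` (e.g. IP or device hash) at time `ts`;
        return True when the click looks like part of a burst."""
        q = self.events.setdefault(key, deque())
        q.append(ts)
        while q and q[0] < ts - self.window_s:
            q.popleft()  # expire clicks outside the window
        return len(q) > self.max_clicks
```

In practice this would be one feature feeding the risk score, not the decision itself.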

Metrics

  • Invalid click rate ↓
  • Partner payout disputes ↓
  • Net revenue ↑

7.5 Price Mismatch Prediction (“Price stability” labels)

Goal

Proactively manage expectations and reduce support load.

Signals

  • provider mismatch history for similar offers
  • volatility indicators
  • cache age and verification freshness

UX

  • label: “Price likely to change” vs “Price stable”
  • encourage refresh/verify before redirect when risk is high

Metrics

  • mismatch incidents ↓
  • support volume ↓
  • trust/CSAT ↑

8. Internal Ops AI Features

8.1 Support Copilot (Session-Aware)

Goal

Reduce support handling time and improve answer correctness.

Inputs

  • session timeline (search → click → redirect → verify)
  • provider selected, click-id, error codes, mismatch signals

Outputs

  • suggested response drafts (templated) + next steps
  • escalation triggers

Guardrails

  • No hallucination: response must cite internal facts (click-id, timestamps).
  • Separate “user-visible” vs “internal-only” notes.

Metrics

  • AHT ↓
  • Resolution rate ↑
  • Escalations ↓

8.2 Partner Onboarding Copilot

Goal

Speed integrations and reduce mapping errors.

Capabilities

  • parse partner API docs into required fields mapping
  • generate integration checklist
  • generate QA test cases for price verify + redirect
  • create sample request/response stubs

Metrics

  • Time-to-integrate ↓
  • Integration defects ↓

8.3 AI-Assisted SEO at Scale

Goal

Create route/city pages and FAQs efficiently without thin content.

Approach

  • Template-first + AI fill-in blocks
  • Ground to aggregated stats and internal data
  • Human review workflow for new templates

Metrics

  • Organic sessions ↑
  • Indexation quality ↑
  • Low-quality page ratio ↓

8.4 Analytics Copilot in Admin

Goal

Make internal analytics actionable without deep SQL.

Use Cases

  • “Why did CTR drop for route X in last 7 days?”
  • “Which provider has highest mismatch rate yesterday?”
  • “Suggest ranking weight changes or suppression candidates.”

Guardrails

  • Role-based access
  • Only allow queries over approved semantic metrics
  • Provide citations: dashboards/metric IDs used

9. Metrics & Experimentation Framework

9.1 North Star Metrics

  • Search → results render success rate
  • CTR (click-out) per search
  • Booking yield (where postback exists)
  • Revenue per search
  • Mismatch rate
  • P95 latency and infra cost per search

9.2 A/B Testing Requirements

  • Feature flags by cohort
  • Experiment assignment persisted per user/session
  • Pre-registered success metrics and guardrails
  • Stop-loss conditions (e.g., mismatch spikes)
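Persistent assignment per user can be achieved without storing state by hashing the experiment and user IDs deterministically; a minimal sketch, with illustrative IDs:

```python
import hashlib

def assign_bucket(experiment_id: str, user_id: str, variants: list[str]) -> str:
    """Deterministic experiment assignment: the same user always lands in the
    same variant, with no per-user state to persist."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Keying the hash on the experiment ID as well as the user ID decorrelates assignments across concurrent experiments.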

9.3 Offline Evaluation

  • Ranking: NDCG@k, MAP, calibration of predicted conversion
  • Predictions: MAE / MAPE + directional accuracy
  • Fraud: precision/recall tradeoffs + cost-weighted evaluation
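As one concrete example of the offline ranking metrics above, NDCG@k for a single query divides the DCG of the ranking as served by the DCG of the ideal (relevance-sorted) ordering:

```python
import math

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """NDCG@k for one ranked list of graded relevances (served order)."""
    def dcg(rels: list[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A perfectly ordered list scores 1.0; placing relevant offers lower pushes the score toward 0.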

10. Rollout Plan (Practical ROI Order)

Phase 1 (Fast ROI / Low Regret)

  1. Supplier call optimization (cost/search ↓)
  2. Provider quality scoring + suppression
  3. Fraud/IVT detection
  4. Offer explanations (trust UX)

Phase 2 (Differentiation)

  1. Natural language search
  2. Price prediction (buy/wait)
  3. Sponsored placement optimizer with strict guardrails

Phase 3 (Moat)

  1. Personalization flywheel (ranker + alerts)
  2. B2B insights product (aggregated trends)
  3. Mature admin copilots with self-serve experimentation

11. Backlog (Engineering-Friendly Epics)

Epic E1: AI Gateway

  • model provider abstraction
  • redaction layer
  • JSON schema enforcement for structured outputs
  • audit logs + metrics

Epic E2: Ranking v1

  • heuristic scorer + feature store
  • quality penalties, sponsored constraints
  • A/B testing harness

Epic E3: Supplier Planner

  • coverage priors + rules
  • instrumentation for outcome feedback

Epic E4: Explanations

  • tag extraction from fare rules
  • LLM summarization with grounding and fallbacks

Epic E5: Fraud & Quality

  • baseline anomaly rules + scoring
  • dashboards + alerting

Epic E6: Natural Language Search

  • prompt + schema parser
  • airport/date validation + correction flows

Epic E7: Price Predictions

  • historical pricing pipeline
  • confidence-calibrated outputs + UI

Epic E8: Admin Copilot

  • semantic metric layer
  • safe query templates
  • role-based access controls

12. Risk Register

| Risk | Impact | Mitigation |
| --- | --- | --- |
| AI hallucinations in explanations | Trust loss | Grounding + schema + fallbacks |
| Over-monetization hurting retention | Long-term revenue loss | Relevance/quality floors + caps |
| Model latency | UX degradation | Caching + async + timeouts |
| Data sparsity in predictions | Wrong advice | Show confidence; “uncertain” state |
| Privacy leakage | Legal/security exposure | Redaction, minimization, access controls |
| Supplier planner reduces coverage | Lower conversion | Explore/exploit + coverage KPI monitoring |

13. Appendices

13.1 “Definition of Done” for Any AI Feature

  • Measurable objective and primary metrics defined
  • Offline evaluation completed (where applicable)
  • A/B test plan and guardrails set
  • Observability fields logged
  • Fallback behavior implemented
  • Privacy review completed
  • Runbook created (alerts, dashboards, rollbacks)

13.2 Suggested UI Copy Patterns (Trust-Safe)

  • “Prices may change quickly. We’ll verify before redirect.”
  • “Recommendation based on historical trends. Confidence: Medium.”
  • “We don’t have enough data to predict this route yet.”

End of Document

Last modified: Feb 26, 2026 by George Joseph (a4fadf9)