prd draft

Autopilot 2.0: Autonomous Revenue Management at Enterprise Scale

Author: Sujoy Guha. Updated: 2026-03-11
Labels: pricing, autonomous-pricing, prd, q1-2026

Canonical source: Confluence (page 3281551374)

1. Overview

1.1 Problem Statement

Duetto's pricing engine recommends. Revenue managers decide. At enterprise scale, that model breaks. 62% of hotels override more than half of the recommendations, not because they distrust the system, but because they see things it does not: competitor moves, demand shocks, local events, portfolio context. Every override is a signal we collect and then discard: 5.9 million manual overrides per quarter, none of them fed back into the model.

Meanwhile, Lighthouse shipped a free autonomous pricing agent to 80,000 hotels. Sabre is shipping agentic APIs. Atomize runs autonomous pricing 24/7. The enterprise tier is the last segment without it. The window is 2026.

1.2 Solution Summary

Autopilot 2.0 inverts the model: the system decides, revenue managers govern. Four capabilities built incrementally:

  1. Sense: real-time signal ingestion (comp rates, booking pace, overrides, events, demand shocks)
  2. Decide: RL-powered pricing intelligence (LP demand models + RL simulation convergence)
  3. Act: orchestration through existing RM controls (autopilot rules, floors/ceilings, demand_scale)
  4. Govern: risk-tiered autonomy (auto-publish when confident, escalate with reasoning when uncertain)

1.3 Success Metrics

Tier | Metric | Baseline | Target
North Star | RevPAR | Current | Stable or positive vs baseline
North Star | Rate Acceptance | 36.7% fleet (excl AUTOPILOT) | +5pp for pilot hotels
Explanatory | DDR | Current | Directional improvement
Explanatory | Override Frequency | 48% fleet manual rate | Directional reduction for pilots
Explanatory | ADR + Lift | Current | Catches rate erosion edge case
Kill | Occupancy % | Current | No significant drop while rates rise
Kill | Pickup vs Expected | Current | If accepted recs don't sell, demand model is wrong
Kill | RevPAR degradation | -- | >2% for any pilot hotel: immediate revert

2. The Problem Data

Override Fleet Metrics (Athena, 90-day, excl AUTOPILOT)

Metric | Value
Fleet manual override rate | 48% of human-mediated decisions
Hotels with >50% override rate | 3,656 / 5,895 (62%)
Median hotel override rate | 69%
Fleet Rate Acceptance (3% threshold) | 36.7%
Override-churn correlation | None (active 54.1%, inactive 55.2%)
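
To make the two headline metrics concrete, here is a minimal sketch of how they could be computed from a decision log. It rests on assumptions the table does not spell out: a recommendation counts as accepted when the published rate is within 3% of the recommended rate, and a decision counts as a manual override when an RM typed the rate. The `Decision` record and its field names are hypothetical, not the Athena schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    # Hypothetical record; field names are illustrative, not the real schema.
    recommended_rate: float
    published_rate: float
    manual: bool  # True when an RM typed the rate


def rate_acceptance(decisions, threshold=0.03):
    """Share of decisions whose published rate is within `threshold`
    (relative) of the recommended rate. Assumed definition."""
    if not decisions:
        return 0.0
    accepted = sum(
        1
        for d in decisions
        if abs(d.published_rate - d.recommended_rate)
        <= threshold * d.recommended_rate
    )
    return accepted / len(decisions)


def override_rate(decisions):
    """Share of decisions that were entered manually."""
    if not decisions:
        return 0.0
    return sum(1 for d in decisions if d.manual) / len(decisions)
```

Both functions work on any slice of the log, so the same code could produce the fleet figure, a per-hotel figure, or a per-portfolio figure depending on how the decisions are grouped before the call.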

Four Override Populations (Experiment E-2026-PRC-002, validated)

Population | Hotels | Behavior | Agent Strategy
Near-total | 1,417 (24%) | 99% manual, +24% deviation | Model too conservative. Calibrate upward per portfolio.
Bidirectional | 1,286 (21%) | 69% manual, +7% net | Fine-tuning. Pre-adjust to eliminate manual step.
Directional | 195 (3%) | 84% manual, +206% deviation | Calibration failures. Flag for retraining.
Low | 2,990 (50%) | 25% manual, +14% deviation | Model works. Autonomy candidates.

Portfolio signal: override behavior clusters by company, not by hotel (Emaar: 22 hotels at 100%; Under Canvas: 17 hotels at 100%). The agent must therefore learn at the portfolio level.

ICP Segments (Jon's canonical filters, refreshed 2026-03-03)

Segment | Hotels | RA | Agent Play
Champions | 1,541 (26%) | 66.6% | Autonomy candidates. Deploy auto-publish first.
Accepting but Manual | 1,756 (29%) | 48.3% | Biggest ROI. Override learning closes the gap.
Swing Voters | 736 (12%) | 22.8% | Comp-responsive pricing tips them over.
Manual Operators | 1,783 (30%) | 2.3% | Calibration + pooling needed.
Others | 173 (3%) | -- | Monitor.

3. Architecture: The Autonomous Pricing Agent

Sense: Signal Ingestion

  • Competitor rates: 7 features in model, cross-price elasticity for 7,484 hotels. Gap: batch-only.
  • Booking pace / cancellations: Mexico crisis exposed lack of demand shock response.
  • Override behavior: 48% manual. Each override encodes what the RM saw that the model missed.
  • Events: PredictHQ featurizer merged (PR #427). Events are model inputs.
  • Demand shocks: demand_scale via Eppo flag operational (PR #512, #515). No automated trigger.
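
The last bullet notes that demand_scale has no automated trigger. One way such a trigger could work, sketched under assumptions: compare the latest daily cancellation count with a trailing baseline and propose a lower demand_scale when the z-score crosses a threshold. The threshold, the step size, the floor, and the function name are all hypothetical, and the Eppo flag interface is deliberately not modeled here.

```python
from statistics import mean, stdev


def propose_demand_scale(history, latest, current_scale=1.0,
                         z_threshold=3.0, step=0.1, floor=0.5):
    """Return (proposed_scale, z_score) from trailing cancellation counts.

    history: trailing daily cancellation counts (at least 2 points)
    latest:  the most recent day's cancellation count
    All thresholds are illustrative defaults, not tuned values.
    """
    if len(history) < 2:
        # Not enough data to estimate a baseline; leave the scale alone.
        return current_scale, 0.0
    mu = mean(history)
    # Guard against a perfectly flat baseline (zero standard deviation).
    sigma = max(stdev(history), 1e-9)
    z = (latest - mu) / sigma
    if z >= z_threshold:
        # Cancellation spike: step the scale down, but never below the floor.
        return max(floor, round(current_scale - step, 4)), z
    return current_scale, z
```

The function only proposes a value. Whether a proposal is applied automatically or routed to an RM would be a governance decision, which fits the risk-tiered model described in the Govern section.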

Decide: LP/RL Convergence

Andrew's LP demand models (ml_elasticity): v1.1.0 in production. 65+ features. NeuralODE in development (7 PRs merged). Pooled training (PR #439) for 401 hotels across 89 owner groups. LP optimizer (LambdaMILP) with imitation_buffer for 981 hotels.

Cameron's RL simulation (revmax): JAX-based simulation with PPO policies (PR #37, #44). GPU-speed via jit/vmap. Needs real hotel calibration.

Convergence path: Phase 0-1 runs on LP. Phase 2 evaluates RL in shadow mode against LP. If RL outperforms, it becomes the inference engine. If not, LP continues with override-aware enhancements.
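
The convergence decision can be read as a simple gate. The sketch below is an illustration only: it assumes each engine's shadow run yields a per-hotel RevPAR estimate and that "outperforms" means a mean paired uplift above a minimum margin. The margin, the choice of metric, and all names are hypothetical; this document does not specify the evaluation rule.

```python
from statistics import mean


def choose_engine(lp_revpar, rl_revpar, min_uplift=0.01):
    """Pick the inference engine from shadow-mode results.

    lp_revpar, rl_revpar: dicts mapping hotel_id -> RevPAR estimate.
    Returns "RL" only if the mean relative uplift over hotels present
    in both runs exceeds `min_uplift`; otherwise LP stays in place.
    """
    common = sorted(set(lp_revpar) & set(rl_revpar))
    uplifts = [
        (rl_revpar[h] - lp_revpar[h]) / lp_revpar[h]
        for h in common
        if lp_revpar[h] > 0  # skip hotels with no usable LP baseline
    ]
    if not uplifts:
        return "LP"  # no evidence, keep the incumbent
    return "RL" if mean(uplifts) > min_uplift else "LP"
```

Defaulting to LP when evidence is missing or marginal mirrors the stated fallback: LP continues unless RL demonstrably wins. A production gate would likely add a significance test rather than rely on a point estimate.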

Act: Through Existing Controls

Control | Today | Agent Action
Autopilot Rules | Binary on/off, static | Confidence-based: tighten when uncertain, loosen as trust builds
Overrides | RM types rate, system ignores | Agent learns patterns and pre-adjusts
Comp Rates | Displayed, never enters engine | First-class inputs via cross-price elasticity
demand_scale | Manual Eppo flag | Auto-adjusted on cancellation velocity / pace anomalies
Floors / Ceilings | Static per segment | Dynamic bounds responding to comp positioning + demand

Govern: Risk-Tiered Autonomy

  • Auto-publish (80%): high confidence, small deviation, familiar pattern
  • Escalate with explanation (20%): large rate changes, new patterns, edge cases. Top-3 factor attribution.
  • Configurable per hotel: conservative portfolios start tight, high-trust portfolios start broad
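
A minimal sketch of how the auto-publish / escalate split might be expressed, assuming the agent exposes a confidence score, a familiarity flag, and per-factor attribution weights for each proposed rate. The `TierPolicy` fields, their default values, and the factor names are hypothetical; only the three criteria (confidence, deviation size, pattern familiarity) and the top-3 attribution come from the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    # Illustrative per-hotel thresholds; real values would be calibrated.
    min_confidence: float = 0.8
    max_deviation: float = 0.05  # max relative change vs the current rate


def classify(confidence, current_rate, proposed_rate, familiar_pattern,
             factors, policy=TierPolicy()):
    """Return the action for one proposed rate.

    factors: dict mapping factor name -> signed attribution weight.
    Escalations carry the three factors with the largest absolute weight.
    """
    if current_rate <= 0:
        raise ValueError("current_rate must be positive")
    deviation = abs(proposed_rate - current_rate) / current_rate
    if (confidence >= policy.min_confidence
            and deviation <= policy.max_deviation
            and familiar_pattern):
        return {"action": "auto_publish", "explanation": []}
    top3 = sorted(factors.items(), key=lambda kv: abs(kv[1]), reverse=True)[:3]
    return {"action": "escalate", "explanation": [name for name, _ in top3]}
```

Per-hotel configuration then reduces to passing a different policy: a conservative portfolio might start with something like `TierPolicy(min_confidence=0.95, max_deviation=0.02)`, while a high-trust portfolio starts with looser bounds.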

4. The Plan

What Exists (not a cold start)

Component | Status
Comp data in model (7 features, 7,484 hotels) | DONE
PredictHQ events featurizer (PR #427) | DONE
Rate Recommendation API (LambdaMILP, 981 hotels) | DONE
Demand shock response (demand_scale Eppo flag) | PARTIAL
Constraint extraction (AE-2331/2332 spikes) | PARTIAL
NeuralODE architecture (7 PRs merged) | IN PROGRESS
RL simulation revmax (JAX/PPO merged) | IN PROGRESS
Pooled training (PR #439, legal cleared) | IN PROGRESS
Override-aware pricing (user_override not featurized) | NOT STARTED
Event-driven re-optimization | NOT STARTED
Explainability (top-3 attribution) | NOT STARTED

Phase 0: Prove the Model (Q1, now)

981 hotels. ml_elasticity v1.1.0 vs legacy Pricerator.

Workstream | Owner | Deliverable
981-hotel training | Andrew / DS, Suraj / MLP | v1.1.0 in ml-prod. Temporal comparison.
Optimizer harmonization | Andrew / DS | Align expected_pickup (PR #52)
Eppo metric definitions | Woojong / PA, Irving / AE | Assignment SQL + 6 metrics. Blocked on DE Redshift.
Shadow analysis | Woojong / PA | 3-way: legacy vs constrained vs unconstrained

Kill: RA degrades >5pp vs baseline.

Phase 1: Scale + Differentiate (Q2)

2,000 hotels. Pooling, override learning, RL calibration, co-innovation partners.

Workstream | Owner | Deliverable
2K-hotel experiment | Andrew / DS, Suraj / MLP | Pooling + 3 constraint flavors. Stay-night randomization.
RL simulation calibration | Cameron / DS | Fit revmax to real hotel patterns. Validate transferability.
Override featurization | Andrew / DS, Prerana / PA | Featurize user_override. P(override) prototype.
Co-innovation pilots | Sujoy / PM | Strawberry (80), Sandman, Pastana. 50 hotels with RM feedback.

Kill: Pooling worsens RA. RL sim not transferable.

Phase 2: Autonomous Pricing (Q3-Q4)

Risk-tiered autonomy for co-innovation hotels.

Workstream | Owner | Deliverable
Override-aware pricing | Andrew / DS | RA model as optimizer constraint. P(override) from Phase 1 data.
RL policy deployment | Cameron / DS, Hakim / MLP | RL replaces LP for pilots (if validated). JAX inference <1s.
Risk-tiered autonomy engine | Product + AE | Auto-publish / escalate classification. Rate Rationale.
Event-driven repricing | Everton / AE | Queue architecture for trigger-based re-optimization.

Kill: >10% override rate or >2% RevPAR degradation on auto-published decisions.
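
The Phase 2 kill criterion is mechanical enough to sketch as a check. The two thresholds (10% override rate, 2% RevPAR degradation) come from the line above; the function name, the inputs, and the idea of evaluating them over a single measurement window are assumptions.

```python
def phase2_kill_reasons(override_rate, baseline_revpar, observed_revpar,
                        max_override_rate=0.10, max_revpar_drop=0.02):
    """Return the list of tripped kill criteria for auto-published decisions.

    override_rate: share of auto-published decisions later overridden (0..1)
    baseline_revpar, observed_revpar: RevPAR over the same window
    An empty list means no kill criterion is tripped.
    """
    if baseline_revpar <= 0:
        raise ValueError("baseline_revpar must be positive")
    drop = (baseline_revpar - observed_revpar) / baseline_revpar
    reasons = []
    if override_rate > max_override_rate:
        reasons.append("override_rate")
    if drop > max_revpar_drop:
        reasons.append("revpar_degradation")
    return reasons
```

Returning the reasons rather than a bare boolean keeps the revert decision auditable: whoever reviews a kill can see which threshold tripped.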

5. Resource Ask

Role | Who | Focus | Through
DS: LP + Override | Andrew Crane-Droesch | ml_elasticity, NeuralODE, pooling, override-aware pricing | Q4 2026
DS: RL + Simulation | Cameron Young | revmax calibration, PPO training, autonomous inference | Q4 2026
MLP | Suraj Thapa, Hakim Touati | Training pipelines, JAX inference | Q4 2026
AE | Irving Lin, Everton Lucas | Constraint extraction, event-driven repricing | Q4 2026
PA | Prerana Devadhar, Woojong Yi | Metrics, ICP, experiment analysis, override segmentation | Q3 2026
PM | Sujoy Guha | Strategy, co-innovation, alignment | Q4 2026

Opportunity cost: DS capacity not available for Group Pricing ML, MBRT expansion, or new Pricerator features. AE time competes with monolith migration. Deliberate tradeoff.

6. Non-Goals

  • Replacing revenue managers (high-risk always escalates)
  • Group pricing automation (separate initiative)
  • Cross-owner pooling (legal: SEE-only)
  • Premium tier GTM in Phase 0-1 (prove value first)
  • Real-time repricing in Phase 0-1 (batch stays, event-driven is Phase 2)

7. Pre-Mortem

Failure Mode | Likelihood | Mitigation
LP doesn't outperform legacy | Medium | Phase 0 kill at >5pp RA degradation
RL doesn't outperform LP | Medium | Phase 2 shadow. LP continues with override learning.
RMs reject autonomy | Low-Medium | Start with self-selected co-innovation, highest-trust hotels
Lighthouse ships enterprise features | Medium | Moat is depth (segments, restrictions, group). 12-18mo window.
AE capacity doesn't materialize | Medium-High | Phase 0-1 on DS+MLP alone. Phase 2 slips.
Override learning shows no lift | Medium | Focus on comp signals and demand responsiveness instead

8. Approval

  • [ ] Kartik Yellepeddi
  • [ ] Jon Ham (review pending)

9. Open Questions

ID | Question | Owner | Resolution
Q1 | What interventions map to each ICP segment? | Sujoy / Prerana | Pending ICP query formalization
Q2 | Should segment boundaries be static or dynamic? | Prerana | Open
Q3 | How to handle 86 unclassified hotels? | Prerana | Open
Q4 | Eppo Redshift access for Woojong? | DE team | Blocked