ADR: Vapi.ai Integration Architecture
Date: January 2026
Status: Approved
Decision: One multilingual agent per clinic with all use cases combined (not the squad approach)
Context
VitaraPlatform v1.0 supports 3 use cases:
- New patient registration
- Appointment booking
- Appointment update/cancellation
Question: Should each use case have a separate specialized agent, or should one agent handle all use cases?
Decision
Use ONE agent per clinic that handles all three use cases. Each agent is multilingual (English + Mandarin).
Rejected: Squad of 9 specialized agents per clinic (3 use cases × 3 languages: English, French, Chinese).
Architecture Comparison
Approved: Single Agent Per Clinic
+------------------------------------------------------------------+
|                                                                  |
|  Clinic Phone Number (+1-604-555-1234)                           |
|            |                                                     |
|            v                                                     |
|  +------------------------------------------------------------+  |
|  |                                                            |  |
|  |  SINGLE MULTILINGUAL ASSISTANT                             |  |
|  |                                                            |  |
|  |  Handles:                                                  |  |
|  |    - New patient registration                              |  |
|  |    - Appointment booking                                   |  |
|  |    - Appointment update/cancellation                       |  |
|  |                                                            |  |
|  |  Languages: English, Mandarin (auto-detect)                |  |
|  |                                                            |  |
|  +------------------------------------------------------------+  |
|                                                                  |
|  Total Assistants: 1 per clinic                                  |
|  Total for 5 pilot clinics: 5 assistants                         |
|                                                                  |
+------------------------------------------------------------------+
Rejected: Squad of Specialized Agents
+------------------------------------------------------------------+
|                                                                  |
|  Clinic Phone Number (+1-604-555-1234)                           |
|                                |                                 |
|                                v                                 |
|  IVR: "Press 1 for English, 2 for Chinese, 3 for French"         |
|                                |                                 |
|                                v                                 |
|  IVR: "Press 1 to register, 2 to book, 3 to update/cancel"       |
|                                |                                 |
|    +------+------+------+------+------+------+------+------+     |
|    v      v      v      v      v      v      v      v      v     |
|  +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+  |
|  | EN | | EN | | EN | | FR | | FR | | FR | | ZH | | ZH | | ZH |  |
|  |Reg | |Book| |Upd | |Reg | |Book| |Upd | |Reg | |Book| |Upd |  |
|  +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+  |
|                                                                  |
|  Total Assistants: 9 per clinic                                  |
|  Total for 5 pilot clinics: 45 assistants                        |
|                                                                  |
+------------------------------------------------------------------+
Rationale
1. Latency (Critical)
Target: <800 ms end-to-end voice latency
| Approach | Latency Impact |
|---|---|
| Single agent | Optimal - no transfers |
| Squad approach | +200-500ms per agent transfer |
If a patient says "I want to book, but also update my contact info" during booking, the squad approach requires a mid-call transfer to a different agent, which adds latency and risks losing context.
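To make the budget concrete, the sketch below adds the per-transfer penalty from the table to a baseline latency. Only the 800 ms target and the 200-500 ms penalty come from this ADR; the 600 ms baseline and the function name `latencyWithTransfers` are illustrative assumptions, not pilot measurements.

```javascript
// Latency budget sketch. TARGET_MS and the transfer penalty are taken
// from this ADR; the baseline passed in by the caller is an assumption.
const TARGET_MS = 800;
const TRANSFER_PENALTY_MS = { min: 200, max: 500 };

// Returns best- and worst-case latency for a turn that crosses
// `transfers` agent handoffs, and whether each case stays under target.
function latencyWithTransfers(baselineMs, transfers) {
  const best = baselineMs + transfers * TRANSFER_PENALTY_MS.min;
  const worst = baselineMs + transfers * TRANSFER_PENALTY_MS.max;
  return {
    best,
    worst,
    bestWithinTarget: best < TARGET_MS,
    worstWithinTarget: worst < TARGET_MS,
  };
}

// Hypothetical 600 ms baseline: with no transfer the turn stays under
// 800 ms; with one transfer even the best case no longer does.
console.log(latencyWithTransfers(600, 0));
console.log(latencyWithTransfers(600, 1));
```

Under this assumed baseline, a single transfer consumes between a quarter (200/800) and well over half (500/800) of the whole latency budget.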
2. Configuration Complexity
v1.0 involves manual Vapi.ai configuration:
| Approach | Assistants | Setup Time |
|---|---|---|
| Single agent | 5 (5 clinics × 1) | 4-5 hours |
| Squad approach | 45 (5 clinics × 9) | 12-15 hours |
v1.0 pilot savings: roughly 8-10 hours of setup, or ~$800-1000 in configuration time.
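The figures above can be cross-checked with a few lines of arithmetic. The sketch uses only numbers from the table; the hourly rate it prints is inferred from the dollar estimate and is not stated anywhere in this ADR.

```javascript
// Assistant counts and setup-time savings, using only figures from the table.
const CLINICS = 5;
const single = { perClinic: 1, hours: [4, 5] };
const squad = { perClinic: 9, hours: [12, 15] };

const totalAssistants = (approach) => CLINICS * approach.perClinic;

// Hours saved, pairing the low ends and the high ends of the two ranges.
const hoursSaved = [
  squad.hours[0] - single.hours[0],
  squad.hours[1] - single.hours[1],
];

// Hourly rate implied by the ~$800-1000 estimate (an inference, not a source figure).
const impliedRate = [800 / hoursSaved[0], 1000 / hoursSaved[1]];

console.log(totalAssistants(single), totalAssistants(squad)); // 5 45
console.log(hoursSaved); // [ 8, 10 ]
console.log(impliedRate); // [ 100, 100 ]
```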
3. User Experience
Single Agent:
Agent: "Hello! I can help with registration, booking, or
updating appointments. What would you like to do?"
Patient: "I'd like to book an appointment"
Agent: "Great! Let me check availability..."
Patient: "Actually, can you also update my phone number?"
Agent: "Of course! What's your new phone number?"
Squad Approach:
IVR: "Press 1 for English..."
IVR: "Press 2 to book appointments..."
Booking Agent: "I'll help you book an appointment..."
Patient: "Can you also update my phone number?"
Booking Agent: "Let me transfer you to our update agent..."
[200-500ms silence, context lost]
Update Agent: "Hello! What would you like to update?"
Patient: [repeats everything]
4. Technical Architecture
All 3 use cases share:
- Same OSCAR EMR API connection
- Same clinic business hours logic
- Same handoff phone number
- Same webhook handler
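// Single webhook handler dispatches all three use cases
switch (functionName) {
  case 'register_patient':
    return registerPatient(clinic, args);
  case 'book_appointment':
    return bookAppointment(clinic, args);
  case 'update_appointment':
    return updateAppointment(clinic, args);
  default:
    // Fail loudly on an unknown name instead of silently returning undefined
    throw new Error(`Unsupported function: ${functionName}`);
}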
Because everything downstream is shared, there is no architectural benefit to separating the use cases into different agents.
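A self-contained version of that dispatcher might look like the sketch below. The stub handlers, the `clinic` object shape, and the name `handleFunctionCall` are hypothetical stand-ins for illustration; the real handlers would call the OSCAR EMR API, and the real request shape is defined by Vapi, not by this example.

```javascript
// Shared per-clinic context. A real one would also carry the EMR
// connection, business hours, and handoff number; this shape is assumed.
const clinic = { id: 'clinic-1' };

// Stub handlers standing in for the real EMR-backed implementations.
const handlers = new Map([
  ['register_patient', (c, args) => ({ clinic: c.id, action: 'register', args })],
  ['book_appointment', (c, args) => ({ clinic: c.id, action: 'book', args })],
  ['update_appointment', (c, args) => ({ clinic: c.id, action: 'update', args })],
]);

// Single entry point: every use case receives the same clinic context.
function handleFunctionCall(c, functionName, args) {
  const handler = handlers.get(functionName);
  if (!handler) {
    // Unknown names are rejected explicitly rather than ignored.
    throw new Error(`Unsupported function: ${functionName}`);
  }
  return handler(c, args);
}

console.log(handleFunctionCall(clinic, 'book_appointment', { date: '2026-02-01' }));
```

Using a lookup table rather than a `switch` means a future use case becomes one more map entry, with no change to the entry point.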
5. System Prompt Complexity
Modern LLMs such as GPT-4 can follow a prompt that covers several intents, so combining the workflows does not inflate the prompt much:
| Approach | Total Prompt Lines |
|---|---|
| Single agent | ~750 lines (all workflows) |
| Squad (3 use-case agents) | ~650 lines (200+250+200) |
The difference (~100 lines) is marginal, and the squad approach adds transfer logic on top.
When Squad Approach Makes Sense
A squad of specialized agents is appropriate when at least one of these criteria applies:
| Criteria | VitaraVox v1.0 |
|---|---|
| Vastly different complexity | No - all similar |
| Different knowledge bases | No - same OSCAR data |
| Different LLM models | No - all GPT-4 |
| Compliance/legal separation | No |
| Distinct user personas | No - all patient-facing |
None of these criteria apply to v1.0.
Future Consideration (v2.0+)
If v2.0 adds complex use cases (prescription refills, clinical triage), consider a hybrid approach:
+------------------------------------------------------------------+
|                                                                  |
|  Primary Agent (handles 80% of calls):                           |
|    - Registration                                                |
|    - Booking                                                     |
|    - Updates                                                     |
|                                                                  |
|  Escalation Agents (handle the complex 20% of calls):            |
|    - Prescription Refill Agent                                   |
|    - Clinical Triage Agent                                       |
|                                                                  |
+------------------------------------------------------------------+
Strategy: The primary agent attempts every call and escalates only when needed.
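One possible reading of that strategy, as a routing function. The intent labels and agent identifiers are hypothetical names chosen for this example; nothing here is a v2.0 design commitment.

```javascript
// Intents the primary agent handles itself (the three v1.0 use cases).
const PRIMARY_INTENTS = new Set(['registration', 'booking', 'update']);

// Hypothetical escalation targets for the v2.0 use cases named above.
const ESCALATION_AGENTS = new Map([
  ['prescription_refill', 'prescription-refill-agent'],
  ['clinical_triage', 'clinical-triage-agent'],
]);

// Decides which agent serves an intent and whether a transfer is needed.
function routeIntent(intent) {
  if (PRIMARY_INTENTS.has(intent)) {
    return { agent: 'primary', transfer: false };
  }
  const target = ESCALATION_AGENTS.get(intent);
  if (target) {
    return { agent: target, transfer: true };
  }
  // Unrecognized intent: stay on the primary agent so it can clarify.
  return { agent: 'primary', transfer: false };
}

console.log(routeIntent('booking'));
console.log(routeIntent('clinical_triage'));
```

Only the escalation branch pays the transfer latency, so the common calls keep the single-agent behavior.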
Implementation Guidelines
System Prompt Structure
# VitaraPlatform Medical Receptionist
## Capabilities
1. Register new patients
2. Book appointments
3. Update/cancel appointments
## Workflow 1: Registration
[registration logic]
## Workflow 2: Booking
[booking logic]
## Workflow 3: Update/Cancel
[update logic]
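If the workflow sections are maintained separately, the combined prompt could be assembled from them as sketched below. The `buildSystemPrompt` helper and the section object shape are assumptions for illustration; the section bodies remain the same placeholders as in the outline.

```javascript
// Workflow sections for the single assistant. Bodies are placeholders,
// exactly as in the outline above.
const workflows = [
  { capability: 'Register new patients', title: 'Registration', body: '[registration logic]' },
  { capability: 'Book appointments', title: 'Booking', body: '[booking logic]' },
  { capability: 'Update/cancel appointments', title: 'Update/Cancel', body: '[update logic]' },
];

// Joins the sections into one prompt string, in the order given.
function buildSystemPrompt(sections) {
  const lines = ['# VitaraPlatform Medical Receptionist', '## Capabilities'];
  sections.forEach((s, i) => lines.push(`${i + 1}. ${s.capability}`));
  sections.forEach((s, i) => {
    lines.push(`## Workflow ${i + 1}: ${s.title}`, s.body);
  });
  return lines.join('\n');
}

console.log(buildSystemPrompt(workflows));
```

Keeping each workflow as its own section makes it easier to revise one use case without touching the others, even though they ship as a single prompt.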
Database Schema
-- Single assistant_id per clinic (not 9)
CREATE TABLE clinics (
  id UUID PRIMARY KEY,
  vapi_assistant_id VARCHAR(100), -- ONE assistant
  vapi_phone_number VARCHAR(20),
  ...
);
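With one assistant per clinic, resolving the clinic for an incoming call is a one-to-one lookup on `vapi_assistant_id`. The sketch below uses an in-memory array in place of the `clinics` table; the row values and the `resolveClinic` helper are made up for illustration.

```javascript
// In-memory stand-in for rows of the clinics table (values are made up).
const clinicRows = [
  { id: 'clinic-a', vapi_assistant_id: 'asst-001', vapi_phone_number: '+16045551234' },
  { id: 'clinic-b', vapi_assistant_id: 'asst-002', vapi_phone_number: '+16045555678' },
];

// One assistant per clinic, so the assistant id identifies the clinic uniquely.
const clinicByAssistant = new Map(
  clinicRows.map((row) => [row.vapi_assistant_id, row])
);

// Returns the clinic row for an assistant id, or throws if none matches.
function resolveClinic(assistantId) {
  const row = clinicByAssistant.get(assistantId);
  if (!row) {
    throw new Error(`Unknown assistant: ${assistantId}`);
  }
  return row;
}

console.log(resolveClinic('asst-002').id); // clinic-b
```

Under the squad design the same lookup would need nine assistant ids per clinic, either as extra columns or as a separate mapping table.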
Consequences
Positive:
- Optimal latency (no transfers)
- 9x fewer assistants to manage
- Better patient experience
- Simpler testing (9 scenarios vs 27+)
- Lower operational cost
Negative:
- A single prompt must carry the complexity of all three workflows
- Harder to A/B test individual use cases
- All use cases share the same configuration