Agent Prompts & Behaviors¶
Per-agent behavior details for the v3.0 dual-track bilingual squad
Last Updated: 2026-03-09 (v4.3.0 — SMS consent UX added to Patient-ID, Booking, Modification)
Overview¶
The v3.0 squad has 5 roles: a single bilingual Router plus four roles that are each deployed in two language tracks (EN/ZH), for a total of 9 agents. This page documents the behavior, tools, and conversation design for each role.
| Role | Agents | Primary Function |
|---|---|---|
| Router | 1 (bilingual) | Language detection (keyword-based) + routing |
| Patient-ID | 2 (EN, ZH) | Caller identification + intent detection |
| Booking | 2 (EN, ZH) | Find slots + book appointments |
| Modification | 2 (EN, ZH) | Reschedule + cancel + check appointments |
| Registration | 2 (EN, ZH) | New patient registration |
Router (Language Gate)¶
| Property | Value |
|---|---|
| Agent | vitara-router-v3 (4f70e214-...) |
| LLM | GPT-4o, temperature 0.3 |
| maxTokens | 400 (P0 fix -- was 150) |
| STT | AssemblyAI Universal (bilingual) |
| TTS | ElevenLabs eleven_multilingual_v2 |
| Tools | get_clinic_info, transfer_call, log_call_metadata |
| firstMessage | "Hi there, thanks for calling!" (hardcoded in squad YAML, plays before LLM runs) |
What It Does¶
The Router is the entry point for every call. Its sole job is to detect the caller's language and route to the correct language track.
Behavior¶
- Before LLM runs: Vapi plays the hardcoded `firstMessage`: "Hi there, thanks for calling!" The Router prompt knows this has already been spoken and does NOT repeat it.
- First LLM turn (mandatory): Calls `get_clinic_info` to get clinic settings. The tool returns `customGreeting`, `businessHours`, `isOpen`, etc. Note: The tool does NOT return a `clinicName` field; the Router uses `customGreeting` or generates a greeting from context.
- After tool returns: Delivers a warm greeting, e.g., "Welcome to [Clinic]! How can I help you today?"
- Language detection (KEYWORD-BASED): Default is ENGLISH. Routes to Chinese ONLY if the caller explicitly says "Mandarin", "Chinese", "speak Chinese", "speak Mandarin", or Chinese text ("中文"). If the caller's words are garbled, the Router asks: "Would you like to continue in English, or Mandarin? 英文还是中文?"
- English detected: Brief phrase like "Sure!" or "Of course, happy to help!" then handoff to Patient-ID-EN via `handoff_to_patient_id_en`
- Chinese requested: "好的!" then handoff to Patient-ID-ZH via `handoff_to_patient_id_zh`
- Clinic info shortcut: If caller asks for hours/address, call `get_clinic_info` and answer directly without routing. After answering, ask "Anything else?" If no, call `log_call_metadata` with `callOutcome = "clinic_info"`.
- Emergency: Detects keywords in both EN and ZH ("chest pain", "can't breathe", "胸痛", "喘不过气", etc.), responds in detected language with 911 instruction, ends call.
- Fallback: After 3 unclear attempts, transfer to staff via `transfer_call` with reason "out_of_scope".
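The keyword rule and the three-attempt fallback above can be sketched as plain functions. This is a minimal illustration, not the Router's actual implementation (the Router is prompt-driven): the names `detect_language` and `router_action`, the use of `None` to represent a garbled utterance, and the `"ask_language"` label are all assumptions made for the example.

```python
from typing import Optional

# Explicit requests for Chinese listed in the Router behavior above.
# "speak Chinese" / "speak Mandarin" are covered by the substrings below.
ZH_KEYWORDS = ("mandarin", "chinese", "中文")
MAX_UNCLEAR_ATTEMPTS = 3


def detect_language(utterance: str) -> str:
    """Return "zh" only on an explicit request for Chinese; default to "en"."""
    text = utterance.lower()
    if any(keyword in text for keyword in ZH_KEYWORDS):
        return "zh"
    return "en"


def router_action(utterance: Optional[str], unclear_count: int) -> str:
    """Pick the Router's next step.

    utterance     -- caller text, or None when it was garbled (assumption)
    unclear_count -- unclear turns so far, including this one
    """
    if utterance is None:
        if unclear_count >= MAX_UNCLEAR_ATTEMPTS:
            return "transfer_call"  # reason "out_of_scope"
        return "ask_language"  # "Would you like to continue in English, or Mandarin?"
    if detect_language(utterance) == "zh":
        return "handoff_to_patient_id_zh"
    return "handoff_to_patient_id_en"
```

Plain substring matching is deliberately naive: a phrase such as "I don't speak Chinese" would still match, which is one reason the deployed Router leaves this judgment to the LLM.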
P0 Fixes Applied
- maxTokens 150 to 400: GPT-4o tool-call JSON consumes 80-120 tokens; 150 caused silent truncation
- Dynamic greeting: Replaced hardcoded "Hi, this is Vitara Clinic" with `get_clinic_info`-driven greeting (though `firstMessage` in squad YAML is still hardcoded)
- Warm acknowledgment: Replaced rigid "Say EXACTLY 'One moment please'" with natural phrasing ("Sure!", "Of course!")
- Clinic-agnostic: All "Vitara" references removed from prompt
Patient-ID (EN / ZH)¶
| Property | Value |
|---|---|
| Agents | vitara-patient-id-en-v3 (7d054785-...), vitara-patient-id-zh-v3 (7585c092-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 200 |
| STT | Deepgram nova-2 en / zh |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | search_patient_by_phone, search_patient, get_clinic_info, transfer_call |
| firstMessage | NONE (removed in P1 fix -- was causing 16s silence) |
What It Does¶
Patient-ID identifies the caller using their phone number, detects their intent from the conversation history, and routes to the appropriate specialist agent.
Behavior¶
- On handoff from Router: The agent says a brief phrase alongside the first tool call -- EN: "One moment while I look you up", ZH: "我帮您查一下". This is NOT zero text; the phrase is spoken alongside the tool invocation.
- First tool call (MANDATORY): Calls `search_patient_by_phone` with `phone: "0000000000"`. The LLM sends this dummy value; the server substitutes the real caller phone from `call.customer.number` (Telnyx metadata).
- WAIT for result: The prompt explicitly instructs "WAIT for the actual search_patient_by_phone result before speaking about the patient." This prevents hallucinated patient names.
- Found + identity confirmed + intent known: Route immediately to the appropriate agent (no "how can I help")
- Found + identity confirmed + intent unknown: "How can I help you today?" Wait for response, detect intent, then route.
- Not found: "I'm not finding a file under this phone number. Are you a new patient?" YES -> `handoff_to_registration_en/zh`. NO -> ask name + DOB, call `search_patient`.
- On behalf of: Supports "calling for my husband/child" flow -- pivots to `search_patient` with name + DOB
- SMS consent disclosure: After identifying the patient, delivers a recording + SMS notice: "This call is recorded for quality and scheduling purposes. We may also send you a text confirmation for any appointments. Let me know if you'd rather not receive texts." If patient declines → notes `smsDeclined = true` in conversation context (passes `smsConsent = false` on future tool calls). If patient says nothing or agrees → default consent (`smsConsent = true`). Source: `patient-id-en.md:119-121`, `patient-id-zh.md:116-117`.
- Routes to: Booking, Modification, Registration, or answers clinic info directly via `get_clinic_info`
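The branching above can be read as a small decision function. The sketch below is illustrative only: `LookupState`, `next_step`, and the returned step labels are hypothetical names, and the explicit `confirm_identity` step is an assumption inferred from the "identity confirmed" wording. The consent default mirrors the rule that silence or agreement means `smsConsent = true`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LookupState:
    found: bool                             # search_patient_by_phone returned a match
    identity_confirmed: bool = False
    intent: Optional[str] = None            # e.g. "BOOK"; None when unknown
    is_new_patient: Optional[bool] = None   # answer to "Are you a new patient?"


def next_step(state: LookupState, lang: str = "en") -> str:
    """Return a label for what Patient-ID does next (labels are hypothetical)."""
    if not state.found:
        if state.is_new_patient is None:
            return "ask_new_patient"
        if state.is_new_patient:
            return f"handoff_to_registration_{lang}"
        return "ask_name_dob_then_search_patient"
    if not state.identity_confirmed:
        return "confirm_identity"  # assumption: confirmation precedes routing
    if state.intent is None:
        return "ask_how_can_i_help"
    return f"route:{state.intent}"  # route immediately, no "how can I help"


def sms_consent(sms_declined: bool) -> bool:
    """Default consent: true unless the patient explicitly declined texts."""
    return not sms_declined
```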
P0/P1 Fixes Applied
- firstMessage REMOVED: Was causing 16 seconds of silence on Patient-ID squad members
- Steps 1+2 merged: Combined greeting + tool call into a single first-turn action
- Defensive tool-result instruction: "WAIT for actual tool result before speaking about the patient" -- prevents hallucinated names
- Clinic-agnostic: All "Vitara" references removed
- Handoff tool names: Changed from `transferAssistant` to `handoff_to_booking_en`, `handoff_to_modification_en`, etc.
Intent Detection¶
Patient-ID analyzes the conversation history (including what the caller said to the Router) to detect intent:
| Caller Says (EN) | Caller Says (ZH) | Detected Intent | Routes To |
|---|---|---|---|
| "book", "appointment", "schedule", "see a doctor", "prescription refill" | "预约", "挂号", "看病", "看医生", "配药" | BOOK | Booking |
| "reschedule", "change my appointment", "move", "different time" | "改时间", "改约", "换个时间" | RESCHEDULE | Modification |
| "cancel", "remove" | "取消", "不去了" | CANCEL | Modification |
| "check my appointment", "when is my appointment" | "查一下预约", "我什么时候的预约" | CHECK | Modification |
| "new patient", "register", "first time" | "新患者", "注册", "第一次来" | REGISTER | Registration |
| "hours", "location", "address" | "营业时间", "地址" | CLINIC_INFO | Answer directly |
| Just "Hi" / no clear intent | Unclear | UNKNOWN | Asks "How can I help?" |
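In the deployed agents the LLM infers intent from the prompt and conversation history, so there is no keyword matcher in the system itself. For test fixtures or offline analysis, though, the table can be approximated as an ordered lookup. The sketch below is such an approximation: `detect_intent` is a hypothetical helper, and the precedence order (more specific intents before BOOK, CANCEL before RESCHEDULE so that "remove" is not caught by "move") is an assumption, not something the prompts specify.

```python
# Ordered (intent, keywords) pairs taken from the table above.
# Order matters: "cancel my appointment" also contains "appointment",
# so BOOK is checked last.
INTENT_KEYWORDS = [
    ("CANCEL", ["cancel", "remove", "取消", "不去了"]),
    ("RESCHEDULE", ["reschedule", "change my appointment", "move",
                    "different time", "改时间", "改约", "换个时间"]),
    ("CHECK", ["check my appointment", "when is my appointment",
               "查一下预约", "我什么时候的预约"]),
    ("REGISTER", ["new patient", "register", "first time",
                  "新患者", "注册", "第一次来"]),
    ("CLINIC_INFO", ["hours", "location", "address", "营业时间", "地址"]),
    ("BOOK", ["book", "appointment", "schedule", "see a doctor",
              "prescription refill", "预约", "挂号", "看病", "看医生", "配药"]),
]

ROUTES = {
    "BOOK": "Booking",
    "RESCHEDULE": "Modification",
    "CANCEL": "Modification",
    "CHECK": "Modification",
    "REGISTER": "Registration",
}


def detect_intent(utterance: str) -> str:
    """Return the first intent whose keyword appears in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return intent
    return "UNKNOWN"  # Patient-ID then asks "How can I help?"
```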
Routing Phrases¶
| Intent | EN Phrase | ZH Phrase |
|---|---|---|
| BOOK | "I'll get you set up." | "好的!" |
| RESCHEDULE/CANCEL | "I can help with that." | "好的。" / "好的,我来帮您处理。" |
| CHECK | "Let me pull that up." | "我帮您查一下。" |
| REGISTER | (no phrase, silent handoff) | (no phrase) |
Booking (EN / ZH)¶
| Property | Value |
|---|---|
| Agents | vitara-booking-en-v3 (ac25775b-...), vitara-booking-zh-v3 (6ef04a40-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 200 |
| STT | Deepgram nova-2 en / zh |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | find_earliest_appointment, check_appointments, create_appointment, get_providers, log_call_metadata |
What It Does¶
Booking finds available appointment slots and books them. It presents one slot at a time and adjusts based on caller preferences.
Behavior¶
- First turn: Says a brief phrase alongside the tool call -- EN: "Let me find you an appointment", ZH: "我帮您查一下可以的时间". Then calls `find_earliest_appointment` with NO parameters. The tool-level `request-start` message ("Let me check what's available.") also plays.
- Presents: One slot at a time -- EN: "I have [day], [date] at [time] with Dr. [name]. Does that work?" ZH: "最近有[日期] 星期[几] [时间],[医生]医生的号,您看可以吗?"
- Rejected: Asks for preference (different day, time, doctor), calls `find_earliest_appointment` with filters
- Different doctor: Calls `get_providers` to see all available doctors, then `find_earliest_appointment` with a DIFFERENT `providerName`
- Asks reason: Maps to `appointmentType` (`B` = general, `2` = follow-up, `3` = complaint, `P` = prescription)
- Books: Calls `create_appointment` with `demographicId`, `providerId`, `startTime`, `appointmentType`, `reason`, `language`
- Never says "booked" without calling the tool first and receiving a success response
- Post-booking: Confirms details (date/time/provider), calls `log_call_metadata`, reminds about health card
- SMS confirmation: Checks `smsSent` in the `create_appointment` response. If `smsSent = true` → adds "You'll get a text confirmation shortly." If `smsSent = false` or patient declined texts → omits any mention of texts. The `smsConsent` parameter is passed on `create_appointment` calls (`true` unless patient declined during Patient-ID phase). Source: `booking-en.md:150-162`.
- Wrong intent: Redirects to Modification via `handoff_to_modification_en/zh` or Router via `handoff_to_router_v3`
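As a rough sketch of the booking step, the snippet below assembles `create_appointment` arguments from the parameters listed above and gates the spoken confirmation on the tool result. The parameter names and the `smsSent` flag come from this page; the helper names, the `success` response field, and the "You're booked." wording are assumptions for illustration.

```python
from typing import Optional

# Reason category -> appointmentType code, as listed in the behavior above.
APPOINTMENT_TYPES = {
    "general": "B",
    "follow-up": "2",
    "complaint": "3",
    "prescription": "P",
}


def build_create_appointment_args(demographic_id: int, provider_id: int,
                                  start_time: str, reason_category: str,
                                  reason: str, language: str,
                                  sms_declined: bool = False) -> dict:
    """Assemble create_appointment arguments (helper name is hypothetical)."""
    return {
        "demographicId": demographic_id,
        "providerId": provider_id,
        "startTime": start_time,
        "appointmentType": APPOINTMENT_TYPES[reason_category],
        "reason": reason,
        "language": language,
        "smsConsent": not sms_declined,  # true unless declined in Patient-ID
    }


def confirmation_line(response: dict, sms_declined: bool = False) -> Optional[str]:
    """Return what may be said after the tool call, or None on failure.

    "success" is an assumed response field; "smsSent" is from this page.
    """
    if not response.get("success"):
        return None  # never say "booked" without a success response
    line = "You're booked."
    if response.get("smsSent") and not sms_declined:
        line += " You'll get a text confirmation shortly."
    return line
```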
P1/P2 Fixes Applied
- FILLER PHRASE RULES deleted: Tool-level `request-start` messages replace LLM filler
- Handoff tool names: Changed from `transferAssistant` to `handoff_to_X`
Known Gap
transfer_call is NOT currently assigned to Booking agents. Frustrated callers cannot be transferred directly to staff from the Booking flow.
Modification (EN / ZH)¶
| Property | Value |
|---|---|
| Agents | vitara-modification-en-v3 (9cd8381d-...), vitara-modification-zh-v3 (e348cd2f-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 200 |
| STT | Deepgram nova-2 en / zh |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | check_appointments, find_earliest_appointment, update_appointment, cancel_appointment, create_appointment, get_providers, log_call_metadata, transfer_call |
What It Does¶
Modification consolidates reschedule, cancel, and check-appointment flows into a single agent. This reduces squad complexity compared to v2.3.0's separate Reschedule and Cancel agents.
Behavior¶
- First turn: Says a brief phrase alongside the tool call -- EN: "Let me pull up your appointments", ZH: "我帮您查一下您的预约". Then calls `check_appointments` with `demographicId`, `startDate = today`, `endDate = 6 months out`, `findAvailable = false`. Tool-level `request-start` plays: "Let me look that up."
- Multiple found: Lists first 3 briefly (date + doctor), asks which one
- Reschedule: Calls `find_earliest_appointment` for new slot, then `update_appointment`
- Cancel: Confirms with patient ("Just to confirm -- cancel your [date] appointment?"), then calls `cancel_appointment`, offers rebooking
- Check only: Reads appointment details, asks if changes needed
- Never says "moved" or "cancelled" without calling the tool first and receiving a success response
- Post-action: Calls `log_call_metadata` with outcome
- SMS confirmation: Passes `smsConsent` on `update_appointment` and `cancel_appointment` calls. Checks `smsSent` in the response; if `true`, adds "You'll get a text confirmation shortly." If patient changed their mind about texts during the call, uses their most recent preference. Source: `modification-en.md:99-124`.
- Wrong intent: Routes to Booking via `handoff_to_booking_en/zh` or Router via `handoff_to_router_v3`
- Error fallback: Can transfer to staff via `transfer_call` (this agent HAS the tool)
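The first-turn `check_appointments` call uses a window from today to six months out. One way to compute that window is sketched below. The ISO `YYYY-MM-DD` date format and the helper names are assumptions, and "6 months" is interpreted as calendar months with the day clamped to the end of shorter months.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def build_check_appointments_args(demographic_id: int, today: date) -> dict:
    """Arguments for the first-turn lookup (date format is an assumption)."""
    return {
        "demographicId": demographic_id,
        "startDate": today.isoformat(),
        "endDate": add_months(today, 6).isoformat(),
        "findAvailable": False,  # list existing appointments, not open slots
    }
```

For example, a call made on 2025-08-31 would search through 2026-02-28, since February has no 31st.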
P1/P2 Fixes Applied
- FILLER PHRASE RULES deleted: Tool-level `request-start` messages replace LLM filler
- Handoff tool names: Changed from `transferAssistant` to `handoff_to_X`
Registration (EN / ZH)¶
| Property | Value |
|---|---|
| Agents | vitara-registration-en-v3 (9fcfd00d-...), vitara-registration-zh-v3 (ce50df43-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 250 |
| STT | Deepgram nova-2 en / zh (with longer endpointing for spelling) |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | register_new_patient, add_to_waitlist, log_call_metadata |
What It Does¶
Registration collects patient information and creates a new patient record. It collects 7 fields one at a time in a conversational manner, then confirms before submitting.
Behavior¶
- On handoff: Welcomes the caller and explains the process. Includes recording disclosure (PIPEDA/PIPA compliance). EN: "Welcome! I'll help you register. This takes a few minutes. Just so you know, this call is recorded for quality and scheduling purposes. By continuing, you consent to the recording."
- Collects 7 fields one at a time:
    - Full name (first and last)
    - Gender (male, female, other)
    - Date of birth
    - Phone number
    - Address (with city and postal code)
    - Health card (BC PHN, out-of-province, or private)
    - Email (optional -- "or we can skip this")
- Spelling: Uses NATO-style phonetic confirmation ("A as in Apple")
- Silent during spelling: Does NOT acknowledge individual letters
- Health card types: BC (10-digit PHN), OUT_OF_PROVINCE, PRIVATE
- Confirms all before calling `register_new_patient` -- abbreviated read-back (name, DOB, phone, address)
- Post-registration: Offers first appointment booking -- routes to Booking via `handoff_to_booking_en/zh`
- Clinic not accepting: Offers waitlist via `add_to_waitlist`
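A pre-submission check for the collected fields might look like the sketch below. The field keys (`fullName`, `healthCard`, and so on) and both helper names are hypothetical; the actual `register_new_patient` parameter names are not given on this page. The BC check only verifies the 10-digit shape stated above and does not attempt any checksum validation.

```python
import re

# Six required fields; email is optional per the list above.
# Key names are hypothetical, chosen only for this example.
REQUIRED_FIELDS = ("fullName", "gender", "dateOfBirth",
                   "phone", "address", "healthCard")
HEALTH_CARD_TYPES = ("BC", "OUT_OF_PROVINCE", "PRIVATE")


def missing_fields(collected: dict) -> list:
    """Return required fields that have not been collected yet, in order."""
    return [name for name in REQUIRED_FIELDS if not collected.get(name)]


def is_valid_health_card(card_type: str, number: str) -> bool:
    """Check the card type and, for BC, the 10-digit PHN shape."""
    if card_type not in HEALTH_CARD_TYPES:
        return False
    if card_type == "BC":
        return re.fullmatch(r"\d{10}", number) is not None
    return True  # other types: no format is specified on this page
```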
Registration Still Has FILLER PHRASE RULES
Unlike Booking and Modification, the Registration EN/ZH prompts still contain a FILLER PHRASE RULES section. This instructs the agent to say one brief natural phrase before tool calls and vary the phrasing. This was intentionally retained because the registration flow has more varied tool-call patterns where tool-level request-start alone is insufficient.
Known Gaps
- `transfer_call` is NOT currently assigned to Registration agents. Frustrated callers cannot be transferred directly to staff.
- `handoff_to_router_v3` is NOT configured in the Registration squad YAML. Registration agents cannot route back to the Router.
Endpointing Configuration¶
Registration uses longer endpointing pauses than other agents to accommodate name spelling:
| Setting | EN Registration | ZH Registration | EN Standard | ZH Standard |
|---|---|---|---|---|
| Wait Seconds | 1.6 | 1.6 | 0.6 | 1.0 |
| On Punctuation | 1.0s | 1.0s | 0.3s | 0.6s |
| On No Punctuation | 2.5s | 2.5s | 0.8s | 1.5s |
| On Number | 1.2s | 1.2s | 0.5s | 0.8s |
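For reference, the table can be captured as a lookup keyed by role and language. The values are copied from the table; the key names (`waitSeconds`, `onPunctuationSeconds`, and so on) are assumed to mirror the platform's endpointing settings and should be checked against the actual squad YAML. The Router, which uses a different STT provider, is not covered by the table and is left out here.

```python
# Values from the endpointing table above. Key names are assumptions.
REGISTRATION_ENDPOINTING = {
    "waitSeconds": 1.6,
    "onPunctuationSeconds": 1.0,
    "onNoPunctuationSeconds": 2.5,
    "onNumberSeconds": 1.2,
}

STANDARD_ENDPOINTING = {
    "en": {
        "waitSeconds": 0.6,
        "onPunctuationSeconds": 0.3,
        "onNoPunctuationSeconds": 0.8,
        "onNumberSeconds": 0.5,
    },
    "zh": {
        "waitSeconds": 1.0,
        "onPunctuationSeconds": 0.6,
        "onNoPunctuationSeconds": 1.5,
        "onNumberSeconds": 0.8,
    },
}


def endpointing_for(role: str, lang: str) -> dict:
    """Return a copy of the endpointing settings for a role and language."""
    if lang not in STANDARD_ENDPOINTING:
        raise ValueError(f"unsupported language track: {lang}")
    if role == "registration":
        return dict(REGISTRATION_ENDPOINTING)  # same values for EN and ZH
    return dict(STANDARD_ENDPOINTING[lang])
```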
Global Behaviors (All Agents)¶
These behaviors apply to all 9 agents in the squad:
Emergency Detection¶
Both EN and ZH keywords trigger an immediate 911 response in the caller's detected language. The agent does not ask clarifying questions, does not look up the patient, and does not mention the clinic. One message, then end call.
EN: "This sounds like a medical emergency. Please hang up and call 911 immediately."
ZH: "这听起来是紧急情况。请您立即挂断电话并拨打911。"
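A simple way to regression-test this behavior is a keyword check that returns the fixed message for the detected language. The sketch below uses only the example keywords quoted on this page (the real prompts list more), and `emergency_response` is a hypothetical helper rather than part of the deployed system.

```python
from typing import Optional

# Example keywords from this page; the prompts contain a longer list.
EMERGENCY_KEYWORDS = ("chest pain", "can't breathe", "胸痛", "喘不过气")

EMERGENCY_MESSAGES = {
    "en": "This sounds like a medical emergency. "
          "Please hang up and call 911 immediately.",
    "zh": "这听起来是紧急情况。请您立即挂断电话并拨打911。",
}


def emergency_response(utterance: str, lang: str) -> Optional[str]:
    """Return the single 911 message if an emergency keyword is present."""
    # Normalize curly apostrophes so "can’t breathe" also matches.
    text = utterance.lower().replace("\u2019", "'")
    if any(keyword in text for keyword in EMERGENCY_KEYWORDS):
        return EMERGENCY_MESSAGES[lang]  # one message, then end the call
    return None
```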
Silent Transfers¶
No agent ever mentions "transferring", "assistant", "system", or any internal terminology. From the caller's perspective, the conversation flows naturally with brief pauses between agents.
Date Awareness¶
All prompts include a {{now | date: ...}} Liquid template for the current date/time. The server additionally clamps any past dates to today.
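The server-side clamp amounts to taking the later of the requested date and today. A minimal sketch, assuming dates are compared at day granularity (`clamp_to_today` is a hypothetical name):

```python
from datetime import date


def clamp_to_today(requested: date, today: date) -> date:
    """Replace any past date with today; leave today and future dates as-is."""
    return max(requested, today)
```

This guards against the LLM producing a stale date, for example one drawn from its training data rather than the templated current date.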
Phone Number Substitution¶
All prompts instruct the LLM to call search_patient_by_phone with phone: "0000000000". The server ignores this value and substitutes the real caller phone from call.customer.number (Telnyx metadata). The LLM never has access to the real phone number.
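On the server side, the substitution could look roughly like the sketch below. The `call.customer.number` path comes from this page; the surrounding payload shape and the `resolve_phone` helper are assumptions.

```python
def resolve_phone(tool_args: dict, call: dict) -> dict:
    """Return tool arguments with the LLM-supplied phone overwritten.

    Whatever the LLM sent as "phone" (normally "0000000000") is ignored
    in favor of the caller number from the call metadata.
    """
    real_number = call["customer"]["number"]
    return {**tool_args, "phone": real_number}
```

Because the placeholder is always overwritten, a hallucinated or mistyped number from the LLM cannot cause a lookup against the wrong patient.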
One Question at a Time¶
Agents never ask two questions in the same turn. Each turn is one thought + one question (or one result + one question).
Technical Language Ban¶
Agents never say "function", "tool", "API", "database", "system", "server", "webhook", or any technical term. From the caller's perspective, they are talking to a receptionist, not a computer.
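A rule like this is easy to check automatically in transcript reviews. The sketch below scans an agent reply for the terms listed above; it is a hypothetical test helper, covers only the seven named words (not "any technical term"), and matches whole words case-insensitively, with an optional plural "s".

```python
import re

BANNED_TERMS = ("function", "tool", "API", "database",
                "system", "server", "webhook")

# Whole-word match so that e.g. "toolbox" does not trip on "tool".
_BANNED_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in BANNED_TERMS) + r")s?\b",
    re.IGNORECASE,
)


def find_banned_terms(reply: str) -> list:
    """Return the banned terms found in an agent reply, lowercased and sorted."""
    return sorted({match.group(1).lower()
                   for match in _BANNED_PATTERN.finditer(reply)})
```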