Agent Prompts & Behaviors

Per-agent behavior details for the v3.0 dual-track bilingual squad

Last Updated: 2026-03-09 (v4.3.0 — SMS consent UX added to Patient-ID, Booking, Modification)


Overview

The v3.0 squad has 5 roles: one bilingual Router plus four specialist roles, each deployed in two language tracks (EN/ZH), for a total of 9 agents. This page documents the behavior, tools, and conversation design for each role.

| Role | Agents | Primary Function |
| --- | --- | --- |
| Router | 1 (bilingual) | Language detection (keyword-based) + routing |
| Patient-ID | 2 (EN, ZH) | Caller identification + intent detection |
| Booking | 2 (EN, ZH) | Find slots + book appointments |
| Modification | 2 (EN, ZH) | Reschedule + cancel + check appointments |
| Registration | 2 (EN, ZH) | New patient registration |

Router (Language Gate)

| Property | Value |
| --- | --- |
| Agent | vitara-router-v3 (4f70e214-...) |
| LLM | GPT-4o, temperature 0.3 |
| maxTokens | 400 (P0 fix -- was 150) |
| STT | AssemblyAI Universal (bilingual) |
| TTS | ElevenLabs eleven_multilingual_v2 |
| Tools | get_clinic_info, transfer_call, log_call_metadata |
| firstMessage | "Hi there, thanks for calling!" (hardcoded in squad YAML, plays before LLM runs) |

What It Does

The Router is the entry point for every call. Its sole job is to detect the caller's language and route to the correct language track.

Behavior

  1. Before LLM runs: Vapi plays the hardcoded firstMessage: "Hi there, thanks for calling!" The Router prompt knows this has already been spoken and does NOT repeat it.
  2. First LLM turn (mandatory): Calls get_clinic_info to get clinic settings. The tool returns customGreeting, businessHours, isOpen, etc. Note: The tool does NOT return a clinicName field; the Router uses customGreeting or generates a greeting from context.
  3. After tool returns: Delivers a warm greeting, e.g., "Welcome to [Clinic]! How can I help you today?"
  4. Language detection (KEYWORD-BASED): Default is ENGLISH. Routes to Chinese ONLY if the caller explicitly says "Mandarin", "Chinese", "speak Chinese", "speak Mandarin", or Chinese text ("中文"). If the caller's words are garbled, the Router asks: "Would you like to continue in English, or Mandarin? 英文还是中文?"
  5. English detected: Brief phrase like "Sure!" or "Of course, happy to help!" then handoff to Patient-ID-EN via handoff_to_patient_id_en
  6. Chinese requested: "好的!" then handoff to Patient-ID-ZH via handoff_to_patient_id_zh
  7. Clinic info shortcut: If caller asks for hours/address, call get_clinic_info and answer directly without routing. After answering, ask "Anything else?" If no, call log_call_metadata with callOutcome = "clinic_info".
  8. Emergency: Detects keywords in both EN and ZH ("chest pain", "can't breathe", "胸痛", "喘不过气", etc.), responds in detected language with 911 instruction, ends call.
  9. Fallback: After 3 unclear attempts, transfer to staff via transfer_call with reason "out_of_scope".
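
The keyword gate in step 4 can be sketched in a few lines. This is only an illustration of the rule: in the squad the decision is made by the Router prompt rather than by code, and the function name `detect_track` and the keyword tuple are assumptions based on the phrases listed above. The clarifying question for garbled input is not modeled.

```python
# Hypothetical sketch of the Router's keyword-based language gate.
# "speak Chinese" and "speak Mandarin" are covered by the shorter keywords.
ZH_KEYWORDS = ("mandarin", "chinese", "中文")


def detect_track(utterance: str) -> str:
    """Return "zh" only on an explicit request for Chinese; default to "en"."""
    text = utterance.lower()
    if any(keyword in text for keyword in ZH_KEYWORDS):
        return "zh"
    return "en"
```

Defaulting to English means a caller who never names a language stays on the EN track, which matches the "Default is ENGLISH" rule.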

P0 Fixes Applied

  • maxTokens raised from 150 to 400: GPT-4o tool-call JSON alone consumes 80-120 tokens, so the 150-token cap caused silent truncation
  • Dynamic greeting: Replaced hardcoded "Hi, this is Vitara Clinic" with get_clinic_info-driven greeting (though firstMessage in squad YAML is still hardcoded)
  • Warm acknowledgment: Replaced rigid "Say EXACTLY 'One moment please'" with natural phrasing ("Sure!", "Of course!")
  • Clinic-agnostic: All "Vitara" references removed from prompt

Patient-ID (EN / ZH)

| Property | Value |
| --- | --- |
| Agents | vitara-patient-id-en-v3 (7d054785-...), vitara-patient-id-zh-v3 (7585c092-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 200 |
| STT | Deepgram nova-2 en / zh |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | search_patient_by_phone, search_patient, get_clinic_info, transfer_call |
| firstMessage | NONE (removed in P1 fix -- was causing 16s silence) |

What It Does

Patient-ID identifies the caller using their phone number, detects their intent from the conversation history, and routes to the appropriate specialist agent.

Behavior

  1. On handoff from Router: The agent says a brief phrase alongside the first tool call -- EN: "One moment while I look you up", ZH: "我帮您查一下". The first turn is not silent: the phrase is spoken together with the tool invocation.
  2. First tool call (MANDATORY): Calls search_patient_by_phone with phone: "0000000000". The LLM sends this dummy value; the server substitutes the real caller phone from call.customer.number (Telnyx metadata).
  3. WAIT for result: The prompt explicitly instructs "WAIT for the actual search_patient_by_phone result before speaking about the patient." This prevents hallucinated patient names.
  4. Found + identity confirmed + intent known: Route immediately to the appropriate agent (no "how can I help")
  5. Found + identity confirmed + intent unknown: "How can I help you today?" Wait for response, detect intent, then route.
  6. Not found: "I'm not finding a file under this phone number. Are you a new patient?" YES -> handoff_to_registration_en/zh. NO -> ask name+DOB, call search_patient.
  7. On behalf of: Supports "calling for my husband/child" flow -- pivots to search_patient with name + DOB
  8. SMS consent disclosure: After identifying the patient, delivers a recording + SMS notice: "This call is recorded for quality and scheduling purposes. We may also send you a text confirmation for any appointments. Let me know if you'd rather not receive texts." If patient declines → notes smsDeclined = true in conversation context (passes smsConsent = false on future tool calls). If patient says nothing or agrees → default consent (smsConsent = true). Source: patient-id-en.md:119-121, patient-id-zh.md:116-117.
  9. Routes to: Booking, Modification, Registration, or answers clinic info directly via get_clinic_info
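
The branching in steps 4 to 6 can be summarized as a small decision function. This is a sketch for orientation only: the real logic lives in the Patient-ID prompt, and the function name `next_step` and the non-handoff action labels (`ask_if_new_patient`, `ask_how_can_i_help`, `answer_via_get_clinic_info`) are hypothetical. The handoff names follow the `handoff_to_<role>_<track>` pattern used on this page.

```python
from typing import Optional

# Intent -> specialist role, as described on this page.
SPECIALIST_BY_INTENT = {
    "BOOK": "booking",
    "RESCHEDULE": "modification",
    "CANCEL": "modification",
    "CHECK": "modification",
    "REGISTER": "registration",
}


def next_step(patient_found: bool, intent: Optional[str], track: str) -> str:
    """Pick the next Patient-ID action; `track` is "en" or "zh"."""
    if not patient_found:
        # Step 6: ask whether the caller is a new patient.
        return "ask_if_new_patient"
    if intent in SPECIALIST_BY_INTENT:
        # Step 4: intent already known, so hand off without asking.
        return f"handoff_to_{SPECIALIST_BY_INTENT[intent]}_{track}"
    if intent == "CLINIC_INFO":
        return "answer_via_get_clinic_info"
    # Step 5: identity confirmed but intent unknown.
    return "ask_how_can_i_help"
```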

P0/P1 Fixes Applied

  • firstMessage REMOVED: Was causing 16 seconds of silence on Patient-ID squad members
  • Steps 1+2 merged: Combined greeting + tool call into a single first-turn action
  • Defensive tool-result instruction: "WAIT for actual tool result before speaking about the patient" -- prevents hallucinated names
  • Clinic-agnostic: All "Vitara" references removed
  • Handoff tool names: Changed from transferAssistant to handoff_to_booking_en, handoff_to_modification_en, etc.

Intent Detection

Patient-ID analyzes the conversation history (including what the caller said to the Router) to detect intent:

| Caller Says (EN) | Caller Says (ZH) | Detected Intent | Routes To |
| --- | --- | --- | --- |
| "book", "appointment", "schedule", "see a doctor", "prescription refill" | "预约", "挂号", "看病", "看医生", "配药" | BOOK | Booking |
| "reschedule", "change my appointment", "move", "different time" | "改时间", "改约", "换个时间" | RESCHEDULE | Modification |
| "cancel", "remove" | "取消", "不去了" | CANCEL | Modification |
| "check my appointment", "when is my appointment" | "查一下预约", "我什么时候的预约" | CHECK | Modification |
| "new patient", "register", "first time" | "新患者", "注册", "第一次来" | REGISTER | Registration |
| "hours", "location", "address" | "营业时间", "地址" | CLINIC_INFO | Answer directly |
| Just "Hi" / no clear intent | Unclear | UNKNOWN | Asks "How can I help?" |
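
For test fixtures or offline analysis of transcripts, the table above can be approximated with a plain keyword matcher. This is a rough sketch and not how the agent works: in production the LLM infers intent from the whole conversation. The function name `detect_intent` and the check order are assumptions; the order matters because some phrases contain others ("remove" contains "move", "check my appointment" contains "appointment").

```python
# Ordered so that more specific phrases win over generic ones.
INTENT_KEYWORDS = [
    ("CANCEL", ["cancel", "remove", "取消", "不去了"]),
    ("RESCHEDULE", ["reschedule", "change my appointment", "move",
                    "different time", "改时间", "改约", "换个时间"]),
    ("CHECK", ["check my appointment", "when is my appointment",
               "查一下预约", "我什么时候的预约"]),
    ("REGISTER", ["new patient", "register", "first time",
                  "新患者", "注册", "第一次来"]),
    ("CLINIC_INFO", ["hours", "location", "address", "营业时间", "地址"]),
    ("BOOK", ["book", "appointment", "schedule", "see a doctor",
              "prescription refill", "预约", "挂号", "看病", "看医生", "配药"]),
]


def detect_intent(utterance: str) -> str:
    """Return the first intent whose keyword appears in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return intent
    return "UNKNOWN"
```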

Routing Phrases

| Intent | EN Phrase | ZH Phrase |
| --- | --- | --- |
| BOOK | "I'll get you set up." | "好的!" |
| RESCHEDULE/CANCEL | "I can help with that." | "好的。" / "好的,我来帮您处理。" |
| CHECK | "Let me pull that up." | "我帮您查一下。" |
| REGISTER | (no phrase, silent handoff) | (no phrase) |

Booking (EN / ZH)

| Property | Value |
| --- | --- |
| Agents | vitara-booking-en-v3 (ac25775b-...), vitara-booking-zh-v3 (6ef04a40-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 200 |
| STT | Deepgram nova-2 en / zh |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | find_earliest_appointment, check_appointments, create_appointment, get_providers, log_call_metadata |

What It Does

Booking finds available appointment slots and books them. It presents one slot at a time and adjusts based on caller preferences.

Behavior

  1. First turn: Says a brief phrase alongside the tool call -- EN: "Let me find you an appointment", ZH: "我帮您查一下可以的时间". Then calls find_earliest_appointment with NO parameters. The tool-level request-start message ("Let me check what's available.") also plays.
  2. Presents: One slot at a time -- EN: "I have [day], [date] at [time] with Dr. [name]. Does that work?" ZH: "最近有[日期] 星期[几] [时间],[医生]医生的号,您看可以吗?"
  3. Rejected: Asks for preference (different day, time, doctor), calls find_earliest_appointment with filters
  4. Different doctor: Calls get_providers to see all available doctors, then find_earliest_appointment with a DIFFERENT providerName
  5. Asks reason: Asks the reason for the visit and maps it to appointmentType (B=general, 2=follow-up, 3=complaint, P=prescription)
  6. Books: Calls create_appointment with demographicId, providerId, startTime, appointmentType, reason, language
  7. Never says "booked" without calling the tool first and receiving a success response
  8. Post-booking: Confirms details (date/time/provider), calls log_call_metadata, reminds about health card
  9. SMS confirmation: Checks smsSent in the create_appointment response. If smsSent = true → adds "You'll get a text confirmation shortly." If smsSent = false or patient declined texts → omits any mention of texts. The smsConsent parameter is passed on create_appointment calls (true unless patient declined during Patient-ID phase). Source: booking-en.md:150-162.
  10. Wrong intent: Redirects to Modification via handoff_to_modification_en/zh or Router via handoff_to_router_v3
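
Steps 5, 6, and 9 can be illustrated with a short sketch of the `create_appointment` arguments and the SMS-aware confirmation. The parameter names come from the list above; everything else is an assumption made for illustration, including the helper names, the value types, the timestamp format, and the dictionary keys used for the reason categories.

```python
# appointmentType codes from step 5; the dictionary keys are illustrative labels.
APPOINTMENT_TYPE_CODES = {
    "general": "B",
    "follow-up": "2",
    "complaint": "3",
    "prescription": "P",
}

SMS_NOTE = "You'll get a text confirmation shortly."


def build_create_appointment_args(demographic_id, provider_id, start_time,
                                  reason_category, reason, language,
                                  sms_declined=False):
    """Assemble the create_appointment arguments named in step 6."""
    return {
        "demographicId": demographic_id,
        "providerId": provider_id,
        "startTime": start_time,
        "appointmentType": APPOINTMENT_TYPE_CODES[reason_category],
        "reason": reason,
        "language": language,
        # Consent defaults to true unless the caller declined texts earlier.
        "smsConsent": not sms_declined,
    }


def confirmation_text(base: str, response: dict) -> str:
    """Mention the text message only when the tool reports smsSent = true."""
    if response.get("smsSent") is True:
        return f"{base} {SMS_NOTE}"
    return base
```

Keying the confirmation on `smsSent` rather than on `smsConsent` keeps the agent from promising a text that was never sent.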

P1/P2 Fixes Applied

  • FILLER PHRASE RULES deleted: Tool-level request-start messages replace LLM filler
  • Handoff tool names: Changed from transferAssistant to handoff_to_X

Known Gap

transfer_call is NOT currently assigned to Booking agents. Frustrated callers cannot be transferred directly to staff from the Booking flow.


Modification (EN / ZH)

| Property | Value |
| --- | --- |
| Agents | vitara-modification-en-v3 (9cd8381d-...), vitara-modification-zh-v3 (e348cd2f-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 200 |
| STT | Deepgram nova-2 en / zh |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | check_appointments, find_earliest_appointment, update_appointment, cancel_appointment, create_appointment, get_providers, log_call_metadata, transfer_call |

What It Does

Modification consolidates reschedule, cancel, and check-appointment flows into a single agent. This reduces squad complexity compared to v2.3.0's separate Reschedule and Cancel agents.

Behavior

  1. First turn: Says a brief phrase alongside the tool call -- EN: "Let me pull up your appointments", ZH: "我帮您查一下您的预约". Then calls check_appointments with demographicId, startDate = today, endDate = 6 months out, findAvailable = false. Tool-level request-start plays: "Let me look that up."
  2. Multiple found: Lists first 3 briefly (date + doctor), asks which one
  3. Reschedule: Calls find_earliest_appointment for new slot, then update_appointment
  4. Cancel: Confirms with patient ("Just to confirm -- cancel your [date] appointment?"), then calls cancel_appointment, offers rebooking
  5. Check only: Reads appointment details, asks if changes needed
  6. Never says "moved" or "cancelled" without calling the tool first and receiving a success response
  7. Post-action: Calls log_call_metadata with outcome
  8. SMS confirmation: Passes smsConsent on update_appointment and cancel_appointment calls. Checks smsSent in response — if true, adds "You'll get a text confirmation shortly." If patient changed their mind about texts during the call, uses their most recent preference. Source: modification-en.md:99-124.
  9. Wrong intent: Routes to Booking via handoff_to_booking_en/zh or Router via handoff_to_router_v3
  10. Error fallback: Can transfer to staff via transfer_call (this agent HAS the tool)
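
The first-turn `check_appointments` call in step 1 uses a window from today to six months out. A minimal sketch of building those arguments follows; the ISO `YYYY-MM-DD` date format, the helper names, and the end-of-month handling are assumptions, since this page does not specify them.

```python
import calendar
from datetime import date


def add_months(day: date, months: int) -> date:
    """Add calendar months, clamping to the last day of a shorter month."""
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def build_check_appointments_args(demographic_id, today: date) -> dict:
    """Arguments for the first-turn lookup: today through six months out."""
    return {
        "demographicId": demographic_id,
        "startDate": today.isoformat(),
        "endDate": add_months(today, 6).isoformat(),
        # The agent is listing existing bookings, not searching for open slots.
        "findAvailable": False,
    }
```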

P1/P2 Fixes Applied

  • FILLER PHRASE RULES deleted: Tool-level request-start messages replace LLM filler
  • Handoff tool names: Changed from transferAssistant to handoff_to_X

Registration (EN / ZH)

| Property | Value |
| --- | --- |
| Agents | vitara-registration-en-v3 (9fcfd00d-...), vitara-registration-zh-v3 (ce50df43-...) |
| LLM | GPT-4o, temperature 0.5 |
| maxTokens | 250 |
| STT | Deepgram nova-2 en / zh (with longer endpointing for spelling) |
| TTS | ElevenLabs (EN) / Azure XiaoxiaoNeural (ZH) |
| Tools | register_new_patient, add_to_waitlist, log_call_metadata |

What It Does

Registration collects patient information and creates a new patient record. It collects 7 fields one at a time in a conversational manner, then confirms before submitting.

Behavior

  1. On handoff: Welcomes the caller and explains the process. Includes recording disclosure (PIPEDA/PIPA compliance). EN: "Welcome! I'll help you register. This takes a few minutes. Just so you know, this call is recorded for quality and scheduling purposes. By continuing, you consent to the recording."
  2. Collects 7 fields one at a time:
    • Full name (first and last)
    • Gender (male, female, other)
    • Date of birth
    • Phone number
    • Address (with city and postal code)
    • Health card (BC PHN, out-of-province, or private)
    • Email (optional -- "or we can skip this")
  3. Spelling: Uses phonetic spelling-alphabet confirmation ("A as in Apple")
  4. Silent during spelling: Does NOT acknowledge individual letters
  5. Health card types: BC (10-digit PHN), OUT_OF_PROVINCE, PRIVATE
  6. Confirms all before calling register_new_patient -- abbreviated read-back (name, DOB, phone, address)
  7. Post-registration: Offers first appointment booking -- routes to Booking via handoff_to_booking_en/zh
  8. Clinic not accepting: Offers waitlist via add_to_waitlist

Registration Still Has FILLER PHRASE RULES

Unlike Booking and Modification, the Registration EN/ZH prompts still contain a FILLER PHRASE RULES section. This instructs the agent to say one brief natural phrase before tool calls and vary the phrasing. This was intentionally retained because the registration flow has more varied tool-call patterns where tool-level request-start alone is insufficient.

Known Gaps

  • transfer_call is NOT currently assigned to Registration agents. Frustrated callers cannot be transferred directly to staff.
  • handoff_to_router_v3 is NOT configured in the Registration squad YAML. Registration agents cannot route back to the Router.

Endpointing Configuration

Registration uses longer endpointing pauses than other agents to accommodate name spelling:

| Setting | EN Registration | ZH Registration | EN Standard | ZH Standard |
| --- | --- | --- | --- | --- |
| Wait Seconds | 1.6 | 1.6 | 0.6 | 1.0 |
| On Punctuation | 1.0s | 1.0s | 0.3s | 0.6s |
| On No Punctuation | 2.5s | 2.5s | 0.8s | 1.5s |
| On Number | 1.2s | 1.2s | 0.5s | 0.8s |

Global Behaviors (All Agents)

These behaviors apply to all 9 agents in the squad:

Emergency Detection

Both EN and ZH keywords trigger an immediate 911 response in the caller's detected language. The agent does not ask clarifying questions, does not look up the patient, and does not mention the clinic. One message, then end call.

EN: "This sounds like a medical emergency. Please hang up and call 911 immediately."

ZH: "这听起来是紧急情况。请您立即挂断电话并拨打911。"
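
A keyword check of this kind might look like the sketch below. It is illustrative only: detection is actually performed by each agent's prompt, the keyword tuple contains just the four examples quoted in the Router section (the full list is not given on this page), and the function name `emergency_reply` is hypothetical.

```python
from typing import Optional

# Partial keyword list: only the examples quoted on this page.
EMERGENCY_KEYWORDS = ("chest pain", "can't breathe", "胸痛", "喘不过气")

EMERGENCY_MESSAGES = {
    "en": "This sounds like a medical emergency. Please hang up and call 911 immediately.",
    "zh": "这听起来是紧急情况。请您立即挂断电话并拨打911。",
}


def emergency_reply(utterance: str, language: str) -> Optional[str]:
    """Return the single 911 message if an emergency keyword is present, else None."""
    text = utterance.lower()
    if any(keyword in text for keyword in EMERGENCY_KEYWORDS):
        return EMERGENCY_MESSAGES[language]
    return None
```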

Silent Transfers

No agent ever mentions "transferring", "assistant", "system", or any internal terminology. From the caller's perspective, the conversation flows naturally with brief pauses between agents.

Date Awareness

All prompts include a {{now | date: ...}} Liquid template expression that supplies the current date/time. The server additionally clamps any past dates to today.
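
The server-side clamp amounts to taking the later of the requested date and today. A minimal sketch, assuming dates have already been parsed into `date` objects (the function name `clamp_to_today` is hypothetical):

```python
from datetime import date


def clamp_to_today(requested: date, today: date) -> date:
    """Return `requested`, or `today` if the requested date is in the past."""
    return max(requested, today)
```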

Phone Number Substitution

All prompts instruct the LLM to call search_patient_by_phone with phone: "0000000000". The server ignores this value and substitutes the real caller phone from call.customer.number (Telnyx metadata). The LLM never has access to the real phone number.
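
On the server side, the substitution could be as simple as the sketch below. The nested `call["customer"]["number"]` lookup mirrors the `call.customer.number` path named above; the function name and the exact shape of the webhook payload are assumptions.

```python
def substitute_caller_phone(tool_args: dict, call: dict) -> dict:
    """Return a copy of the tool arguments with the real caller number filled in."""
    real_number = call["customer"]["number"]
    # The LLM-supplied value (the "0000000000" placeholder) is ignored entirely.
    return {**tool_args, "phone": real_number}
```

Returning a copy leaves the original arguments untouched, which keeps the placeholder visible in any logging of what the LLM actually sent.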

One Question at a Time

Agents never ask two questions in the same turn. Each turn is one thought + one question (or one result + one question).

Technical Language Ban

Agents never say "function", "tool", "API", "database", "system", "server", "webhook", or any technical term. From the caller's perspective, they are talking to a receptionist, not a computer.