
First Clinic Launch Plan — March 30, 2026

4-Week Execution Roadmap

Created: February 18, 2026


Constraints

  • Deadline: March 30, 2026 (hard)
  • Target: Single clinic deployment (English primary, Mandarin secondary)
  • Infrastructure: Current OCI ARM instance — no cloud migration
  • Automation: Claude Code available for code changes and deployments
  • Non-goals: Redis, ECS/Fargate, LiteLLM, LiveKit, horizontal scaling

What Must Be True on March 30

  1. Real patient data is protected (rotated secrets, encrypted credentials, idempotent operations)
  2. English booking/rescheduling/cancellation calls work reliably end-to-end
  3. Mandarin track is either validated or explicitly disabled for launch
  4. One clinic is fully onboarded (all 7 required pre-launch checks pass)
  5. Operational runbook exists (what to do when things break at 2am)
  6. Backup is tested (restore verified, not just dump verified)

Week 1 (Feb 19-25): Security Hardening + Stability

Theme: Make the system safe for real patient data.

Day 1-2: Secret Rotation & Environment Hardening

| Task | Advisory Ref | Detail |
|---|---|---|
| Rotate JWT_SECRET | Security Finding #1 | Generate 64-char random hex: openssl rand -hex 32. Update .env AND ecosystem.config.cjs |
| Rotate JWT_REFRESH_SECRET | Security Finding #1 | Same method; use a separate 64-char random hex |
| Rotate DATABASE_URL password | Security Finding #1 | ALTER USER vitara WITH PASSWORD '...'; then update .env |
| Fix CORS_ORIGIN in .env | Security Finding #1 | Change to https://dev.vitaravox.ca (match ecosystem.config.cjs) |
| Verify OSCAR_SOAP_PASSWORD | Security Finding #1 | Confirm this is the real clinic credential, not a test default |
| Verify ENCRYPTION_KEY is real | Security Finding #6 | Already set (64-char hex). Confirm it encrypts/decrypts correctly |

Validation: Restart PM2, confirm admin dashboard login works, confirm Vapi webhooks authenticate, confirm OSCAR SOAP connects.
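
The ENCRYPTION_KEY check can be scripted as an encrypt/decrypt round trip. The sketch below assumes the key is a 64-character hex string used with AES-256-GCM (the cipher named in the onboarding checks). The helper names (seal, unseal, keyRoundTrips) and the iv/tag/data layout are illustrative assumptions, not the server's actual credential format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical envelope: the real server may store IV and auth tag differently.
interface Sealed { iv: string; tag: string; data: string }

function parseKey(hexKey: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hexKey)) {
    throw new Error("ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hexKey, "hex");
}

function seal(plaintext: string, hexKey: string): Sealed {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", parseKey(hexKey), iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("hex"),
    tag: cipher.getAuthTag().toString("hex"),
    data: data.toString("hex"),
  };
}

function unseal(sealed: Sealed, hexKey: string): string {
  const decipher = createDecipheriv("aes-256-gcm", parseKey(hexKey), Buffer.from(sealed.iv, "hex"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "hex"));
  // final() throws if the key or tag is wrong, so tampering is detected here
  const plain = Buffer.concat([decipher.update(Buffer.from(sealed.data, "hex")), decipher.final()]);
  return plain.toString("utf8");
}

// Returns true only if the key both encrypts and decrypts a probe value.
function keyRoundTrips(hexKey: string): boolean {
  try {
    const probe = "oscar-credential-probe";
    return unseal(seal(probe, hexKey), hexKey) === probe;
  } catch {
    return false;
  }
}
```

Calling keyRoundTrips(process.env.ENCRYPTION_KEY ?? "") after the PM2 restart gives a quick yes/no answer. It does not prove that previously stored OSCAR credentials still decrypt; that needs a read of a real stored value.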

Day 3-4: Idempotency + Webhook Audit Trail

| Task | Advisory Ref | Detail |
|---|---|---|
| Add toolCallId dedup | Security Finding #10 | Create processed_tool_calls table (toolCallId TEXT PK, result JSONB, created_at TIMESTAMPTZ, expires_at). Check before processing, cache result, return cached on retry. 24h TTL with daily cleanup. |
| Add webhook tool-call audit logging | Security Finding #19 | On every tool call: write to audit_logs table with action: 'vapi_tool_call', tool name, clinicId, callId, demographicId (if available), outcome (success/error). Use existing audit infrastructure. |

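-- Migration: add processed_tool_calls (idempotency cache for Vapi tool calls)
CREATE TABLE processed_tool_calls (
  tool_call_id TEXT PRIMARY KEY,                  -- Vapi toolCallId, one row per tool invocation
  result JSONB NOT NULL,                          -- cached response, returned on retry
  clinic_id TEXT,
  tool_name TEXT,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  expires_at TIMESTAMPTZ NOT NULL DEFAULT (NOW() + INTERVAL '24 hours')  -- 24h TTL
);
-- Supports the daily cleanup: DELETE FROM processed_tool_calls WHERE expires_at < NOW();
CREATE INDEX idx_ptc_expires ON processed_tool_calls(expires_at);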

Validation: Make two identical test calls with same toolCallId — second should return cached result. Check audit_logs table for tool call entries.
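
The check, cache, and return-on-retry flow can be sketched in memory before wiring it to PostgreSQL. Here a Map stands in for the processed_tool_calls table, and the names (DedupStore, handleOnce, purgeExpired) are hypothetical. A real implementation would run the same steps as queries.

```typescript
// In-memory stand-in for processed_tool_calls (hypothetical names throughout).
interface CachedResult<T> { result: T; expiresAt: number }

const TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL, as in the plan

class DedupStore<T> {
  private rows = new Map<string, CachedResult<T>>();

  // Runs `run` at most once per toolCallId within the TTL window.
  handleOnce(toolCallId: string, run: () => T, now: number = Date.now()): T {
    const hit = this.rows.get(toolCallId);
    if (hit && hit.expiresAt > now) {
      return hit.result; // retry: return the cached result, do not re-run
    }
    const result = run();
    this.rows.set(toolCallId, { result, expiresAt: now + TTL_MS });
    return result;
  }

  // Daily cleanup: drop rows whose TTL has passed; returns how many were removed.
  purgeExpired(now: number = Date.now()): number {
    let removed = 0;
    this.rows.forEach((row, id) => {
      if (row.expiresAt <= now) {
        this.rows.delete(id);
        removed++;
      }
    });
    return removed;
  }
}
```

This version is single-process and synchronous. Against the database, two concurrent retries could both miss the cache, so an insert that tolerates a primary-key conflict is probably needed there.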

Day 5: PM2 Stability Investigation

| Task | Advisory Ref | Detail |
|---|---|---|
| Investigate 6666 restart count | Infra Finding #1 | pm2 reset vitara-admin-api to zero the counter. Monitor for 24h. If restarts accumulate, check PM2 error logs for crash patterns. |
| Add PM2 restart delay | Infra recommendation | Add restart_delay: 5000 and max_restarts: 50 to ecosystem.config.cjs |
| Deploy from compiled JS | Best practice | Switch from tsx src/index.ts to node dist/index.js in ecosystem.config for faster startup and lower memory |

// ecosystem.config.cjs — updated
module.exports = {
  apps: [{
    name: 'vitara-admin-api',
    script: 'dist/index.js',                    // compiled, not tsx
    cwd: '/home/ubuntu/vitara-platform/admin-dashboard/server',
    env: {
      NODE_ENV: 'production',
      PORT: 3002,
      CORS_ORIGIN: 'https://dev.vitaravox.ca'
    },
    watch: false,
    max_memory_restart: '500M',
    restart_delay: 5000,
    max_restarts: 50,
    kill_timeout: 10000                          // match graceful shutdown timeout
  }]
};

Build + deploy:

cd /home/ubuntu/vitara-platform/admin-dashboard/server
npx tsc
pm2 delete vitara-admin-api
pm2 start ecosystem.config.cjs
pm2 save

Validation: pm2 list shows 0 restarts. Monitor for 48h over weekend.
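
The restart check can also be scripted from pm2 jlist output instead of read by eye. This sketch parses that JSON and reads pm2_env.status and pm2_env.restart_time for one app. The function name and report shape are made up for illustration, and the default threshold of 5 is borrowed from the pre-launch checklist.

```typescript
// Minimal shape of one `pm2 jlist` entry; only the fields used here.
interface Pm2Proc {
  name: string;
  pm2_env: { status: string; restart_time: number };
}

interface StabilityReport {
  found: boolean;
  online: boolean;
  restarts: number;
  stable: boolean;
}

// Hypothetical helper: evaluates one app from the JSON printed by `pm2 jlist`.
function checkStability(jlistJson: string, appName: string, maxRestarts = 5): StabilityReport {
  const procs = JSON.parse(jlistJson) as Pm2Proc[];
  const proc = procs.find((p) => p.name === appName);
  if (!proc) {
    return { found: false, online: false, restarts: 0, stable: false };
  }
  const online = proc.pm2_env.status === "online";
  const restarts = proc.pm2_env.restart_time;
  // Stable means: currently online and fewer restarts than the threshold
  return { found: true, online, restarts, stable: online && restarts < maxRestarts };
}
```

Because restart_time is cumulative, the count is only meaningful for the monitoring window if pm2 reset was run at the start of it.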


Week 2 (Feb 26 - Mar 4): Voice Quality + Mandarin Decision

Theme: Make calls sound professional. Decide Mandarin fate.

Day 1-2: P1 Prompt Fixes (All 5 Items)

| Task | P1 Ref | Detail |
|---|---|---|
| Restore CONVERSATION STYLE to all 8 non-Router prompts | P1 #1 | Add back warm, professional tone guidance. Not filler phrases, just style: "Be warm, concise, and professional. Use natural transitions." |
| Add slot collision check to Booking EN/ZH prompts | P1 #2 | Add instruction: "Before calling create_appointment, confirm the patient doesn't already have an appointment that day by checking the tool result." Server already prevents double-booking, but LLM should warn the patient. |
| Add transfer_call tool to Booking + Registration assistants | P1 #4 | Add transfer-call-d95ed81e to toolIds in booking-en.md, booking-zh.md, registration-en.md, registration-zh.md YAML frontmatter |
| Add handoff_to_router_v3 to Registration EN/ZH in squad YAML | P1 #5 | Add handoff tool to Registration squad members in vitaravox-v3.yml so patients can escape registration flow |
| Warm SOAP clients on PM2 startup | P1 #3 | Add warmSoapClients() call in index.ts after server.listen() to pre-fetch WSDL for Schedule, Demographic, Provider services |

SOAP warmup implementation:

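// In index.ts, after server.listen().
// EmrAdapterFactory and logger are the server's existing modules.
interface WarmableAdapter {
  warmClients(): Promise<void>;
}

function isWarmable(adapter: unknown): adapter is WarmableAdapter {
  return typeof (adapter as WarmableAdapter | null)?.warmClients === 'function';
}

async function warmSoapClients(): Promise<void> {
  // Warm the adapter for the launch clinic (clinicId from config)
  const clinicId = process.env.LAUNCH_CLINIC_ID;
  if (!clinicId) {
    logger.warn('LAUNCH_CLINIC_ID not set; skipping SOAP warmup');
    return;
  }
  try {
    logger.info({ clinicId }, 'Warming SOAP clients for launch clinic...');
    const adapter = await EmrAdapterFactory.getInstance().getAdapter(clinicId);
    if (isWarmable(adapter)) {
      // Pre-fetch WSDL for the Schedule, Demographic, and Provider services
      await adapter.warmClients();
      logger.info({ clinicId }, 'SOAP clients warmed successfully');
    } else {
      logger.warn({ clinicId }, 'Adapter has no warmClients(); nothing warmed');
    }
  } catch (err) {
    // Non-fatal: the first call of the day just pays the WSDL fetch cost
    logger.warn({ err }, 'SOAP client warmup failed (non-fatal)');
  }
}

// Fire and forget so startup is not blocked
void warmSoapClients();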

Push prompts to Vapi:

cd /home/ubuntu/vitara-platform/vapi-gitops
npm run push:dev

Validation: Make 3 test calls (booking, reschedule, registration). Verify natural conversation tone. Verify first call of day doesn't have long delay.

Day 3-4: Mandarin Testing Sprint

This is a focused 2-day test to make a go/no-go decision.

Test matrix (8 scenarios):

| # | Scenario | Pass Criteria |
|---|---|---|
| 1 | Call, say "中文" ("Chinese") to trigger ZH track | Router detects and hands off to Patient-ID-ZH within 3s |
| 2 | Chinese caller identifies as existing patient | Phone lookup succeeds, name confirmed in Chinese |
| 3 | Chinese caller books appointment | Slot found, time communicated in Chinese, booking confirmed |
| 4 | Chinese caller reschedules | Existing appointment listed, new slot found, reschedule confirmed |
| 5 | Chinese caller registers as new patient | Name, DOB, phone collected in Chinese, registered |
| 6 | Chinese caller says English name (e.g., "John") | TTS doesn't mangle the English name |
| 7 | Chinese caller triggers emergency keywords (胸痛, "chest pain") | 911 redirect fires |
| 8 | Chinese caller requests human (转人工, "transfer to a human") | transfer_call fires |

Scoring:

  • 7-8 pass → Mandarin launches with English
  • 5-6 pass → Mandarin launches with documented limitations
  • <5 pass → Mandarin disabled for launch. Router prompt updated: "We currently support English only. Mandarin support coming soon."
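
The scoring rule reduces to two thresholds over the 8 scenarios. A small sketch, with a hypothetical function name and decision labels, makes the boundaries explicit so nobody argues on March 4 about what a score of 5 or 7 means.

```typescript
type MandarinDecision = "launch" | "launch-with-limitations" | "disabled";

const TOTAL_SCENARIOS = 8;

// Maps the number of passed scenarios (out of 8) to the launch decision.
function decideMandarin(passed: number): MandarinDecision {
  if (!Number.isInteger(passed) || passed < 0 || passed > TOTAL_SCENARIOS) {
    throw new RangeError(`passed must be an integer between 0 and ${TOTAL_SCENARIOS}`);
  }
  if (passed >= 7) return "launch";                  // 7-8 pass
  if (passed >= 5) return "launch-with-limitations"; // 5-6 pass
  return "disabled";                                 // fewer than 5 pass
}
```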

How to disable Mandarin if needed:

# In router-v3.md prompt, replace language detection section with:
## LANGUAGE
Currently English only. If the caller speaks Mandarin or requests Chinese:
Say: "I'm sorry, we currently support English only.
      For Mandarin assistance, please call the clinic directly at [clinic phone]."
Do NOT route to Chinese track agents.

This is a business decision, not a technical failure. A buggy Mandarin experience is worse than no Mandarin. Better to launch English-only and add Mandarin in a v3.1 patch.

Day 5: Fix Whatever Mandarin Testing Reveals

Reserve this day for fixing issues found in Mandarin testing. Common issues from memory:

  • GPT-4o space-separated Chinese characters → monitor, may need prompt instruction "Never add spaces between Chinese characters"
  • Azure TTS pronunciation of English names in Chinese context → may need phonetic hints
  • Deepgram nova-2 ZH transcription accuracy → if poor, consider switching to AssemblyAI Universal for ZH track

Week 3 (Mar 5-11): Clinic Onboarding + Operational Readiness

Theme: Configure the actual clinic. Build the safety net.

Day 1-2: Clinic Onboarding

Complete all 7 required pre-launch checks for the target clinic:

| Check | What's Needed | Who Provides It |
|---|---|---|
| 1. Clinic info | Name, phone, address, timezone | Clinic admin |
| 2. Business hours | Mon-Fri hours, closed days | Clinic admin |
| 3. Providers | Provider names + OSCAR provider IDs | Clinic admin + OSCAR |
| 4. EMR connection | OSCAR SOAP URL + credentials, verified connectivity | VitaraVox team |
| 5. Vapi phone | Assign Telnyx number, configure in Vapi squad | VitaraVox team |
| 6. Privacy officer | Name + email (PIPEDA requirement) | Clinic admin |
| 7. Encrypted credentials | Verify AES-256-GCM encryption works for stored OSCAR creds | Automated check |

Run onboarding validation:

curl -s https://api-dev.vitaravox.ca/api/admin/clinics/{clinicId}/onboarding \
  -H "Authorization: Bearer {token}" | jq '.data.checks'

All 7 required checks must show passed: true.
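
For a scripted gate, the response can be reduced to a pass/fail answer plus the names of failing checks. The sketch below assumes .data.checks is an array of objects with name, required, and passed fields. Only passed appears in this plan; the rest of the shape is a guess and should be adjusted to the real API response.

```typescript
// Assumed shape of one entry in .data.checks; the real API may differ.
interface OnboardingCheck {
  name: string;
  required: boolean;
  passed: boolean;
}

// Returns the names of required checks that have not passed.
function failingRequiredChecks(checks: OnboardingCheck[]): string[] {
  return checks.filter((c) => c.required && !c.passed).map((c) => c.name);
}

// Ready only if exactly the expected number of required checks exist and all passed.
function readyForLaunch(checks: OnboardingCheck[], expectedRequired = 7): boolean {
  const required = checks.filter((c) => c.required);
  return required.length === expectedRequired && failingRequiredChecks(checks).length === 0;
}
```

Counting the required checks guards against a response that silently omits one, which would otherwise look like a pass.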

Day 3: Backup Verification

| Task | Detail |
|---|---|
| Test backup script | Run bash /home/ubuntu/vitara-platform/scripts/backup-db.sh manually |
| Test restore | Create test database, pg_restore from latest backup, verify data integrity |
| Verify cron | crontab -l shows daily 2:00 AM backup |
| Test off-site copy | scp or rsync latest backup to a second location (even another directory is better than nothing) |
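
One simple way to verify data integrity after the restore test is to compare per-table row counts between the source and the restored database. The sketch below assumes the counts have already been collected (for example with a count query per table at dump time). The type and function names are illustrative.

```typescript
type RowCounts = Record<string, number>;

interface Mismatch {
  table: string;
  source: number | null;   // null means the table is missing on that side
  restored: number | null;
}

// Compares per-table row counts from the source database and the restored copy.
function diffRowCounts(source: RowCounts, restored: RowCounts): Mismatch[] {
  const tables = Object.keys({ ...source, ...restored }).sort();
  const mismatches: Mismatch[] = [];
  for (const table of tables) {
    const s = source[table] ?? null;
    const r = restored[table] ?? null;
    if (s !== r) {
      mismatches.push({ table, source: s, restored: r });
    }
  }
  return mismatches;
}
```

An empty result is necessary but not sufficient: matching counts do not prove matching contents, so a spot check of a few known rows is still worth doing.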

Day 4: Monitoring Setup

| Task | Detail |
|---|---|
| Uptime Kuma health check | Verify GET /health is monitored, alerts fire on failure |
| PM2 error monitoring | Set up pm2 monit or a simple cron that checks pm2 jlist for stopped status |
| Slack/email alerts | Configure Uptime Kuma to notify on downtime (Slack webhook or email) |
| OSCAR connectivity alert | Health endpoint already checks OSCAR; verify it reports degraded when OSCAR is unreachable |

Simple PM2 watchdog (cron every 5 min):

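#!/bin/bash
# /home/ubuntu/vitara-platform/scripts/pm2-watchdog.sh
APP="vitara-admin-api"
WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

# Select the app by name rather than by position in the process list
STATUS=$(pm2 jlist | jq -r --arg app "$APP" '.[] | select(.name == $app) | .pm2_env.status')

if [ "$STATUS" != "online" ]; then
  # Slack incoming webhooks expect a JSON body with a "text" field
  PAYLOAD=$(jq -n --arg text "ALERT: $APP is ${STATUS:-not running}" '{text: $text}')
  curl -s -X POST -H 'Content-Type: application/json' -d "$PAYLOAD" "$WEBHOOK_URL"
  pm2 restart "$APP"
fi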

Day 5: Operational Runbook

Create a single-page runbook for the on-call person (you, for now):

| Scenario | Action |
|---|---|
| Server unreachable | SSH to OCI, check pm2 status, restart if needed: pm2 restart vitara-admin-api |
| OSCAR SOAP timeout | Check /health endpoint. If OSCAR is down, there is nothing to do on our side; the circuit breaker protects the server. Notify clinic. |
| Vapi webhook errors | Check pm2 logs vitara-admin-api --lines 50. Look for auth failures or 500s. |
| Database connection refused | sudo systemctl status postgresql. If down: sudo systemctl start postgresql |
| SSL certificate expired | sudo certbot renew && sudo nginx -s reload |
| Need to see PHI for debugging | POST /api/admin/debug {"enabled": true} (auto-expires in 4 hours) |
| PM2 keeps restarting | pm2 logs vitara-admin-api --err --lines 100 to find crash cause. May need to roll back last deploy. |
| Patient booked wrong slot | Check audit_logs + OSCAR directly. Manual fix in OSCAR admin UI. |
| Need to disable voice agent | Set clinic status to inactive in admin dashboard. Calls go to transfer number. |

Week 4 (Mar 12-18): Staging Calls + Buffer

Theme: Simulate real usage. Fix what breaks. Keep buffer for surprises.

Day 1-2: Staging Call Marathon

Run 20+ end-to-end calls simulating real patients:

| Call Type | Count | Variations |
|---|---|---|
| New patient booking | 5 | Morning slot, afternoon slot, specific doctor, any doctor, next available |
| Existing patient booking | 3 | Phone lookup success, phone lookup fail → name search |
| Reschedule | 3 | Pick from list, change doctor, change week |
| Cancel | 2 | With reason, without reason |
| Registration (new patient) | 3 | Full flow with health card, without health card, add to waitlist |
| Edge cases | 4 | Emergency keywords, request human, caller hangs up mid-flow, no available slots |

For each call, verify:

  • [ ] Patient identified correctly (or fallback to name search works)
  • [ ] Appointment booked/modified in OSCAR (check OSCAR admin UI)
  • [ ] CallLog written to database with correct metadata
  • [ ] No duplicate bookings (check OSCAR schedule view)
  • [ ] Conversation sounded natural (not robotic, no dead air >3s)
  • [ ] Call ended cleanly (log_call_metadata fired)
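
The duplicate-booking check can be automated over an export of the day's appointments rather than done by eye in the schedule view. The sketch below uses a hypothetical appointment shape, not OSCAR's actual schema, and treats two entries as duplicates when patient, provider, and start time all match.

```typescript
// Hypothetical appointment shape for illustration; not OSCAR's actual schema.
interface Appointment {
  demographicId: string;
  providerId: string;
  startsAt: string; // ISO-like timestamp, compared as an exact string
}

// Groups appointments by patient + provider + start time
// and returns only the groups that contain more than one entry.
function findDuplicateBookings(appts: Appointment[]): Appointment[][] {
  const groups = new Map<string, Appointment[]>();
  for (const a of appts) {
    const key = [a.demographicId, a.providerId, a.startsAt].join("|");
    const group = groups.get(key) ?? [];
    group.push(a);
    groups.set(key, group);
  }
  return Array.from(groups.values()).filter((g) => g.length > 1);
}
```

Running this after each batch of staging calls should return an empty array. Any group it returns points at a retry that slipped past the toolCallId dedup.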

Day 3: Fix Issues from Staging Calls

Reserve this entire day for fixing whatever the staging marathon reveals. Common patterns:

  • Prompt tweaks (LLM says the wrong thing in edge cases)
  • Timing issues (dead air during tool calls → add request-response-delayed messages)
  • OSCAR data issues (provider IDs don't match, schedule not configured)

Day 4-5: Buffer

Do not schedule work here. This is your safety net for:

  • Issues discovered during staging that take longer than a day
  • Clinic admin delays in providing information
  • OSCAR configuration issues on the clinic side
  • Last-minute Mandarin fixes if it was conditionally included

If nothing goes wrong (unlikely), use this time for:

  • Writing a "What's New" email to the clinic staff
  • Updating the changelog
  • Setting up a post-launch check-in schedule with the clinic


Go-Live: March 30 Week (Mar 19-30)

Pre-Launch Checklist (Mar 19)

Security:
  [  ] JWT_SECRET is not a dev default
  [  ] DATABASE_URL uses strong password
  [  ] ENCRYPTION_KEY encrypts/decrypts correctly
  [  ] VAPI_WEBHOOK_SECRET is set and enforced
  [  ] toolCallId idempotency is active
  [  ] Webhook tool calls are audited

Voice Quality:
  [  ] All 9 prompts have CONVERSATION STYLE sections
  [  ] transfer_call available on all agents
  [  ] handoff_to_router_v3 on Registration agents
  [  ] SOAP clients warm on startup
  [  ] Mandarin decision made and implemented

Clinic Configuration:
  [  ] All 7 onboarding checks pass
  [  ] OSCAR SOAP connection verified
  [  ] Clinic hours + holidays configured
  [  ] Providers mapped to OSCAR IDs
  [  ] Privacy officer documented
  [  ] Vapi phone number assigned and tested

Operations:
  [  ] PM2 restart count stable (<5 in 48h)
  [  ] Backup tested with successful restore
  [  ] Uptime Kuma monitoring active
  [  ] Alert notifications working (Slack/email)
  [  ] Runbook written and accessible
  [  ] Data retention job running (3:00 AM daily)

Soft Launch (Mar 24-28)

  • Day 1: Enable for clinic staff only (internal testing with real OSCAR data)
  • Day 2-3: Enable for first 10% of callers (route subset of calls to Vapi number)
  • Day 4-5: Monitor call logs, fix issues, expand to 50%
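
The plan does not say how the 10% and 50% subsets are chosen. One possible approach, if the routing layer can run custom logic, is deterministic bucketing on the caller's number so that the same caller always gets the same experience. This sketch hashes the number with FNV-1a and maps it to a bucket from 0 to 99. The function names are illustrative, and whether the telephony provider supports this kind of rule is an open assumption.

```typescript
// FNV-1a 32-bit hash over UTF-16 code units (equivalent to bytes for phone digits).
// Deterministic, so a given caller always lands in the same bucket.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// True if this caller falls inside the rollout percentage (0-100).
function inRollout(callerNumber: string, percent: number): boolean {
  if (percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return fnv1a(callerNumber) % 100 < percent;
}
```

Because the bucket is fixed per caller, raising the percentage from 10 to 50 only adds callers. Everyone already routed to the voice agent stays on it.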

Full Launch (Mar 30)

  • Route all calls to Vapi number
  • Monitor first 4 hours actively (watch PM2 logs + call logs in real time)
  • Have OSCAR admin UI open to verify appointments are landing correctly
  • Keep clinic's original phone line as instant rollback (just revert the phone routing)

Mandarin Decision Matrix

Make the call by end of Week 2 (March 4).

                 ┌──────────────────────────────┐
                 │   Mandarin Testing Results   │
                 │     (8 scenarios tested)     │
                 └──────────────┬───────────────┘
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
      7-8 pass              5-6 pass               <5 pass
          │                     │                     │
          ▼                     ▼                     ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│ LAUNCH WITH       │ │ LAUNCH WITH       │ │ ENGLISH ONLY      │
│ FULL ZH           │ │ DOCUMENTED        │ │                   │
│                   │ │ LIMITATIONS       │ │ Disable ZH in     │
│ No changes        │ │                   │ │ Router prompt     │
│ needed            │ │ Add warning to    │ │                   │
│                   │ │ clinic: "Mandarin │ │ "Mandarin support │
│                   │ │ may have accent   │ │  coming soon"     │
│                   │ │ recognition       │ │                   │
│                   │ │ limitations"      │ │ Add to v3.1       │
│                   │ │                   │ │ roadmap           │
└───────────────────┘ └───────────────────┘ └───────────────────┘

Advisory Items Mapped to This Plan

From Security Analysis (30 findings)

| Finding | Severity | Week | Action |
|---|---|---|---|
| #1 Hardcoded JWT defaults | CRITICAL | Week 1 | Rotate all secrets |
| #2 Dev mode auth skipped | CRITICAL | Week 1 | Verify production mode is enforced |
| #6 Encryption key not enforced | CRITICAL | Week 1 | Verify key works |
| #10 Missing idempotency | MEDIUM | Week 1 | Add toolCallId dedup table |
| #19 Missing webhook audit | MEDIUM | Week 1 | Add tool-call audit logging |
| #3 VAPI_API_KEY in webhook | HIGH | Deferred | Low risk for single clinic |
| #4 metadata.clinicId unvalidated | HIGH | Deferred | Single clinic = low risk |
| #7 Token management gaps | HIGH | Deferred | Acceptable for pilot |
| #12 Rate limiting bypasses | MEDIUM | Deferred | Single clinic = low traffic |

From Infrastructure Advisory

| Item | Week | Action |
|---|---|---|
| 6666 PM2 restarts | Week 1 | Reset counter, add restart limits, switch to compiled JS |
| No centralized logging | Deferred | PM2 logs + Uptime Kuma sufficient for 1 clinic |
| Single server (no HA) | Deferred | Acceptable risk for pilot with monitoring |
| No disaster recovery plan | Week 3 | Verify backup + create runbook |
| Dev passwords in production | Week 1 | Rotate all |

From P1 Fix List

| Item | Week | Action |
|---|---|---|
| Restore CONVERSATION STYLE | Week 2 | Add to all 8 non-Router prompts |
| Slot collision in prompts | Week 2 | Add instruction to Booking + Modification |
| SOAP warmup on startup | Week 2 | Add warmSoapClients() to index.ts |
| transfer_call on Booking + Registration | Week 2 | Add to toolIds + squad YAML |
| handoff_to_router_v3 on Registration | Week 2 | Add to squad YAML |

Explicitly Deferred (Post-Launch)

| Item | Why Deferred |
|---|---|
| Redis (distributed state) | Single instance handles 1 clinic |
| ECS Fargate (auto-scaling) | OCI ARM is sufficient for pilot volume |
| LiteLLM (LLM proxy) | GPT-4o hardwired is fine for 1 clinic |
| LiveKit (voice pipeline) | Vapi works, don't touch before launch |
| Observability (Datadog/Grafana) | PM2 logs + health checks are enough for pilot |
| Multi-clinic timezone support | Single clinic = one timezone |
| Canadian data residency (Azure OpenAI) | Acceptable risk for pilot with BAA |
| WAF / DDoS protection | Low traffic, low risk for pilot |

Risk Register

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| OSCAR SOAP connection fails on launch day | Medium | High | Test daily during Week 3-4. Have clinic's direct line as fallback. |
| PM2 crash loop returns | Low | High | Compiled JS + restart limits. Watchdog cron restarts and alerts. |
| LLM hallucinates appointment details | Low | High | Server-side validation catches wrong dates/times. Slot collision check prevents double-booking. |
| Mandarin calls garbled | Medium | Medium | Mandarin go/no-go decision by Mar 4. Easy to disable in Router prompt. |
| Clinic staff unfamiliar with system | Medium | Medium | Pre-launch training call. Written runbook for common questions. |
| Patient data breach | Very Low | Very High | Rotated secrets, encrypted creds, HMAC auth, audit trail, PHI redaction. |

Post-Launch Roadmap (April+)

After successful pilot, sequence back to the enterprise stack plan:

| Month | Phase | Trigger |
|---|---|---|
| April | Phase 1: Redis | Preparing for second clinic |
| May | Phase 2: RDS + Observability | Before third clinic |
| June | Phase 3: ECS Fargate | Before scaling past 5 clinics |
| July | Phase 4: LiteLLM | When per-clinic cost tracking needed |
| Q3-Q4 | Phase 5: LiveKit | When 5-language support or Vapi costs demand it |