Pipecat Migration Analysis: Voice Platform Upgrade
Enterprise-critical migration analysis: Vapi.ai → Pipecat (open-source) + AWS ECS Fargate
| Attribute | Value |
|---|---|
| Status | PRE-PLANNING — Analysis complete, awaiting Phase A/B/C gate decisions |
| Date | February 23, 2026 |
| Version | 1.1 (Expert-reviewed & Amended) |
| Risk Level | CRITICAL — production voice platform swap affecting all patient calls |
| Classification | Enterprise Architecture — CTO Review |
| Review | Expert-reviewed by CTO Architect + Senior Voice AI Engineer |
| Source | vitara-platform/docs/PIPECAT-MIGRATION-PLAN.md |
Golden Rule
The current Vapi v3.0 system must remain operational through every phase. No exceptions.
Executive Summary
The Question
Should VitaraVox migrate from the current Vapi-managed voice platform (OCI ARM + Express.js) to a Pipecat + AWS ECS Fargate architecture for the production voice-enabled EMR system?
The Answer
Yes, with conditions. The migration is strategically sound but must be gated carefully. The current Vapi architecture has served well for rapid prototyping (v2.3→v3.0 in ~2 weeks) but introduces vendor dependency, compliance gaps, and cost scaling concerns that become untenable at enterprise scale.
Key Numbers
| Metric | Current (Vapi + OCI) | Target (Pipecat + AWS) |
|---|---|---|
| Platform cost/min | $0.05 (Vapi) + provider | $0.028 (infra) + provider |
| HIPAA add-on | $1,000/month (Vapi) | $500/month (Daily) or $0 (self-hosted) |
| Data residency | US (Vapi servers) | ca-central-1 (Montreal) |
| Vendor lock-in | High (proprietary API) | None (open-source BSD-2) |
| Turn detection control | Black box (endpointing ms) | Full control (Silero + Smart Turn + custom) |
| Self-host option | Limited | Full parity |
| Engineering effort | Low (already built) | 20-22 weeks migration |
v1.1 Amendment (Post Expert Review)
Revised Estimates
The original plan underestimated effort by ~54%. Key corrections:
- Total effort: 804 hours (not 520) / $120,600 (not $78,000) / 20-22 weeks (not 16)
- LLM: Azure OpenAI Canada Central (not Bedrock) — GPT-4o in Canada, automatic BAA
- Bedrock role: Optional — for Claude fallback only, NOT primary LLM
- STT/TTS dual-track switching: +40h — Pipecat Flows not designed for mid-call pipeline reconfig
- Call recording: +32h — no built-in recording in self-hosted Pipecat (needed for PHIPA)
- End-of-call reporting: +28h — must build transcript/summary/cost tracking from scratch
- Vapi rollback window: 60 days (not 30)
Recommended Approach: Gated Migration
| Phase | Timeline | Cost | What |
|---|---|---|---|
| A: HIPAA Fix | This week | $0 one-time (+ $1,000/month Vapi HIPAA add-on) | Enable Vapi HIPAA mode + execute BAAs; production-ready in 2 days |
| B: POC | 2 weeks | ~$6K | Single-flow Pipecat POC to validate architecture |
| C: Full Migration | 20 weeks | ~$120K | Full migration — ONLY if Phase B succeeds |
| D: Hardening | 2 weeks | Included | Hardening, Vapi decommission after 60-day parallel run |
1. Current State Assessment
What's Built (Complete Inventory)
Voice Agent Layer (Vapi v3.0)
| Component | Count | Details |
|---|---|---|
| Squad | 1 | Dual-track EN/ZH, Squad ID: 13fdfd19... |
| Assistants | 9 | Router + Patient-ID (EN/ZH) + Booking (EN/ZH) + Modification (EN/ZH) + Registration (EN/ZH) |
| Tools | 14 | Each with request-start messages, YAML configs |
| Prompts | 9 | Gold-master prompts with clinic-agnostic design |
| GitOps | Full | vapi-gitops with slug-based references, state management, env separation |
| Phone | 1 | +1 236-305-7446 (Telnyx BYO) |
Server Layer (Express.js on OCI ARM)
| Component | Details |
|---|---|
| Framework | Express.js 4.18 + TypeScript 5 |
| ORM | Prisma 5.22 (PostgreSQL) |
| Adapters | IEmrAdapter → OscarSoapAdapter (1,212 lines) + OscarBridgeAdapter |
| Booking Engine | True availability (slots - appointments - filters), advisory locks |
| Auth | JWT (access + refresh), bcrypt, RBAC |
| Security | Helmet, CORS, rate limiting, HMAC webhook auth |
| Logging | Pino structured JSON, PHI redaction |
| Encryption | AES-256-GCM for stored credentials |
| Circuit Breakers | Opossum (4s timeout, 50% threshold, 30s reset) |
| Audit | PIPEDA 4.1.4 compliant audit logging |
Database Schema (PostgreSQL)
| Table | Purpose |
|---|---|
| clinics | Multi-tenant clinic profiles with Vapi phone mapping |
| clinic_config | EMR credentials (encrypted), booking settings, voice config |
| clinic_providers | Provider roster synced from OSCAR |
| clinic_hours / clinic_holidays | Operating hours and closures |
| call_logs | Voice call records with transcript, cost, outcome |
| audit_logs | PIPEDA compliance trail |
| users | JWT-authenticated admin users |
| waitlist | New patient waitlist |
| onboarding_progress | 9-step setup checklist |
What's NOT Built (Known Gaps)
- No BAA executed with Vapi or sub-processors
- No patient consent disclosure in voice agent
- HIPAA mode not enabled on Vapi
- No Canadian data residency
- No automated E2E voice testing
- No CI/CD pipeline
- No monitoring/alerting infrastructure
2. Target Architecture
High-Level Architecture
┌─────────────────────────────────────────────────┐
│ AWS ca-central-1 (Montreal) │
│ │
│ ┌──────────────┐ ┌────────────────────────┐ │
Phone Call ──────► │ │ Telnyx/Twilio │───►│ ALB (WSS/HTTPS) │ │
(PSTN/SIP) │ │ (Telephony) │ │ + WAF + ACM SSL │ │
│ └──────────────┘ └─────────┬──────────────┘ │
│ │ │
│ ┌────────────▼────────────┐ │
│ │ ECS Fargate Cluster │ │
│ │ ┌────────────────────┐ │ │
│ │ │ Pipecat Agent │ │ │
│ │ │ (Python Container) │ │ │
│ │ │ │ │ │
│ │ │ ┌─────┐ ┌────────┐ │ │ │
│ │ │ │ STT │ │ Smart │ │ │ │
│ │ │ │ │ │ Turn │ │ │ │
│ │ │ └──┬──┘ └────────┘ │ │ │
│ │ │ ▼ │ │ │
│ │ │ ┌─────┐ ┌────────┐ │ │ │
│ │ │ │ LLM │ │Pipecat │ │ │ │
│ │ │ │ │ │ Flows │ │ │ │
│ │ │ └──┬──┘ └────────┘ │ │ │
│ │ │ ▼ │ │ │
│ │ │ ┌─────┐ │ │ │
│ │ │ │ TTS │ │ │ │
│ │ │ └─────┘ │ │ │
│ │ └────────────────────┘ │ │
│ └────────────┬────────────┘ │
│ │ │
│ ┌──────────────────┼───────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────┐ │
│ │ API Server │ │ PostgreSQL │ │ S3 │ │
│ │ (Fargate) │ │ (RDS) │ │Audit │ │
│ │ │ │ │ │Logs │ │
│ │ OSCAR SOAP ──┼──┼──► OSCAR EMR │ │ │ │
│ │ Booking Eng │ │ │ │ │ │
│ │ Admin API │ │ │ │ │ │
│ └──────────────┘ └──────────────┘ └──────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ CloudWatch │ │ Secrets Mgr │ │
│ │ + Alarms │ │ + KMS │ │
│ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────┘
External AI Services (with BAAs):
├── Azure OpenAI (Canada Central) — LLM (GPT-4o)
├── Azure Speech (Canada Central) — STT/TTS for Chinese
├── Deepgram (US, BAA) — STT for English
└── ElevenLabs Enterprise (US, BAA) — TTS for English
Key Architecture Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Compute | ECS Fargate (NOT Lambda) | Voice agents need persistent WebSocket (5-15 min sessions). Lambda's 15-min limit and cold starts are unsuitable. |
| Region | ca-central-1 (Montreal) | Canadian healthcare data residency. All required AWS services available. |
| Agent Framework | Pipecat with Pipecat Flows | Code-defined conversation graphs replace Vapi squad YAML. Full turn detection control. |
| LLM | Azure OpenAI (Canada Central) | GPT-4o in Canada. Same model as current Vapi — zero prompt rewriting. Fallback: Claude via Bedrock. |
| Telephony | Telnyx (existing) or Twilio | Keep existing phone number. Both support SIP to ECS Fargate. |
| Database | RDS PostgreSQL (ca-central-1) | Same Prisma schema, zero migration. Multi-AZ for HA. |
| IaC | CDK (TypeScript) | Full infrastructure as code. Replaces manual server setup. |
Why NOT Lambda?
Lambda has a 15-minute hard timeout, and voice sessions of 5-15 minutes run right up against it, so a long call would be cut off mid-conversation. Lambda is also stateless (no persistent WebSocket), and cold starts add 1-5s latency. ECS Fargate provides persistent containers with ALB sticky sessions for the duration of each call.
Why Azure OpenAI, NOT Bedrock?
Azure OpenAI Canada Central has GPT-4o — the same model our 9 gold-master prompts are tuned for. Bedrock does not have GPT-4o. Using Bedrock would require rewriting all prompts for Claude or Nova, adding weeks of effort. Bedrock is valuable only as a Claude fallback.
3. CTO-Level Comparative Analysis
Strategic Dimensions
| Dimension | Vapi (Current) | Pipecat + AWS (Target) | Winner |
|---|---|---|---|
| Time to Market | Fastest (declarative API) | 20-22 weeks migration | Vapi |
| Total Cost of Ownership | $0.05/min + $1K/mo HIPAA + providers | $0.028/min + $0 HIPAA (self-hosted) + providers | Pipecat |
| Vendor Lock-in | High (proprietary squad API) | None (BSD-2-Clause, self-hostable) | Pipecat |
| Compliance Control | Trust Vapi's HIPAA mode (black box) | Full control: own code, own logs, own encryption | Pipecat |
| Data Residency | US only (no Canada region) | ca-central-1 Montreal | Pipecat |
| Turn Detection | Configurable endpointing (ms) | Silero VAD + Smart Turn (prosody analysis) | Pipecat |
| Scalability | Vapi manages (opaque) | ECS auto-scaling with reserved capacity | Pipecat |
| Debugging | Limited (HIPAA mode disables logs) | Full pipeline visibility (OpenTelemetry) | Pipecat |
| Testing/Eval | Manual testing only | aiewf-eval + Coval + Hamming + custom | Pipecat |
| Engineering Complexity | Low (API calls + webhooks) | High (Python + Docker + AWS + CDK) | Vapi |
| Operational Burden | Low (Vapi manages) | Medium (ECS, RDS, ALB, monitoring) | Vapi |
Break-Even Analysis
At 10,000 minutes/month (moderate clinic):
- Vapi: $500 platform + $1,000 HIPAA = $1,500/month (before providers)
- Pipecat Cloud: $280 infra + $500 HIPAA = $780/month
- Pipecat Self-hosted on AWS: ~$400 infra + $0 HIPAA = $400/month
At 100,000 minutes/month (10 clinics):
- Vapi: $5,000 platform + $1,000 HIPAA = $6,000/month
- Pipecat Self-hosted: ~$3,000 infra = $3,000/month
- Annual savings: $36,000
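The platform figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch that models only the two options with a published per-minute rate (Vapi and Pipecat Cloud); the self-hosted figures are rough infrastructure estimates and are not derived from a rate. Provider costs are excluded, as in the bullets above, and the function name is illustrative.

```python
# Rates from the Key Numbers table; AI provider costs are excluded on both sides.
VAPI_RATE_PER_MIN = 0.05
VAPI_HIPAA_MONTHLY = 1_000
PIPECAT_CLOUD_RATE_PER_MIN = 0.028
PIPECAT_CLOUD_HIPAA_MONTHLY = 500


def monthly_platform_cost(minutes: int, rate_per_min: float, hipaa_fee: float) -> float:
    """Usage-based platform fee plus the flat HIPAA add-on, in USD per month."""
    return round(minutes * rate_per_min + hipaa_fee, 2)


vapi_10k = monthly_platform_cost(10_000, VAPI_RATE_PER_MIN, VAPI_HIPAA_MONTHLY)
cloud_10k = monthly_platform_cost(
    10_000, PIPECAT_CLOUD_RATE_PER_MIN, PIPECAT_CLOUD_HIPAA_MONTHLY
)
vapi_100k = monthly_platform_cost(100_000, VAPI_RATE_PER_MIN, VAPI_HIPAA_MONTHLY)

print(vapi_10k, cloud_10k, vapi_100k)
```

Because the HIPAA add-on is a flat fee, its share of the bill shrinks as volume grows, which is why the per-minute rate dominates the comparison at 100K minutes.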
Technology Risk Assessment
| Risk Factor | Vapi | Pipecat |
|---|---|---|
| Company viability | VC-funded startup, competitive market | Daily has 10+ years in real-time video, profitable |
| Technology maturity | 2+ years in production | GA January 2026 (6 weeks old) |
| Community | Closed-source, vendor support | 10,000+ GitHub stars, 1,000+ beta teams |
| Exit strategy | Expensive migration if Vapi fails/pivots | Self-host with zero code changes |
4. Component Migration Map
Voice Agent Components
| Current (Vapi) | Target (Pipecat) | Approach | Effort |
|---|---|---|---|
| Router Assistant | Pipecat Flows Router Node | Python entry node with language detection | M |
| Patient-ID EN/ZH | Pipecat Flows Nodes | Two language-specific nodes with tools | M |
| Booking EN/ZH | Pipecat Flows Nodes | Convert booking logic to Python handlers | M |
| Modification EN/ZH | Pipecat Flows Nodes | Convert modification logic to Python | M |
| Registration EN/ZH | Pipecat Flows Nodes | Convert registration to Python | M |
| Squad YAML (GitOps) | Flows graph config (Python) | Redefine conversation flow as code | L |
| 14 Vapi tool definitions | Pipecat function handlers | Map tools to Python HTTP wrappers | L |
| request-start messages | say() in pre-action hooks | Filler phrases in pipeline | S→M |
| assistantOverrides | Context management | APPEND / RESET / RESET_WITH_SUMMARY | S |
| Vapi phone number | Telnyx SIP → Pipecat | Configure SIP dial-in to ECS Fargate | M |
Effort Key: S = 1-2 days, M = 3-5 days, L = 1-2 weeks
Server Components
| Current | Target | Approach | Effort |
|---|---|---|---|
| Express.js webhook handler | Pipecat handlers + API server | Split: tools → Python, admin API → keep Node.js | L |
| OscarSoapAdapter (1,212 lines) | Keep as-is (Node.js) | Deploy as separate Fargate service | S |
| BookingEngine (669 lines) | Extract to shared service | REST API on Fargate, called by Pipecat + Admin | M |
| Prisma ORM + PostgreSQL | Same → RDS PostgreSQL | pg_dump → RDS restore. Zero code changes. | S |
| PM2 process management | ECS task definitions | Fargate task + service definitions via CDK | M |
| .env file | AWS Secrets Manager | Migrate all secrets | S |
Critical: Do NOT Rewrite OSCAR Adapters in Python
The OscarSoapAdapter is 1,212 lines of battle-tested WS-Security SOAP code with dozens of OSCAR-specific quirks already solved (JAXB Calendar deserialization, hasTimeStamp: false, positional arg0/arg1 params). Python's zeep library behaves differently from node-soap. Keep the Node.js API server exactly as-is and call from Pipecat via HTTP (~2ms within VPC).
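To make the "thin HTTP wrapper" idea concrete, here is a minimal sketch of a Python-side tool call that forwards its arguments to the existing Node.js API server. The base URL, the `/tools/<name>` route layout, and the `find_slots` tool name are hypothetical placeholders, not taken from the codebase. The 4-second timeout mirrors the existing circuit breaker setting. The sketch uses the blocking standard library client for brevity; an agent running in Pipecat's async pipeline would more likely use an async HTTP client.

```python
import json
import urllib.request

# Hypothetical internal address of the Node.js API server inside the VPC.
API_BASE_URL = "http://api.internal:3000"


def build_tool_request(tool_name: str, arguments: dict) -> urllib.request.Request:
    """Build the POST request that forwards a tool call to the Node.js API."""
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE_URL}/tools/{tool_name}",  # hypothetical route layout
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(tool_name: str, arguments: dict, timeout_s: float = 4.0) -> dict:
    """Send the tool call and return the decoded JSON response."""
    request = build_tool_request(tool_name, arguments)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)


# Building a request does not touch the network, so it is easy to unit-test.
request = build_tool_request("find_slots", {"providerId": "123"})
print(request.get_method(), request.full_url)
```

Keeping the wrapper this small is the point: all OSCAR-specific behavior stays behind the HTTP boundary in Node.js.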
Context Management Strategy (Pipecat Flows)
| Transition | Strategy | Rationale |
|---|---|---|
| Router → Patient-ID | RESET | Fresh start, Router context is minimal |
| Patient-ID → Booking/Mod/Reg | RESET_WITH_SUMMARY | Preserve patient data, discard identification dialogue |
| Booking ↔ Modification | APPEND | User switching intent mid-flow, full context needed |
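The transition rules in this table can be expressed as a small lookup. The sketch below defines its own `ContextStrategy` enum as a local stand-in so that it runs without any dependency; the node names are illustrative, and wiring the result into actual Pipecat Flows node configuration is left out.

```python
from enum import Enum


class ContextStrategy(Enum):
    """Local stand-in for the three strategies named in the table."""

    APPEND = "append"
    RESET = "reset"
    RESET_WITH_SUMMARY = "reset_with_summary"


# Illustrative node names; language suffixes are omitted for brevity.
TASK_NODES = {"booking", "modification", "registration"}


def strategy_for(source: str, target: str) -> ContextStrategy:
    """Pick the context strategy for a transition between two nodes."""
    if source == "router" and target == "patient_id":
        return ContextStrategy.RESET
    if source == "patient_id" and target in TASK_NODES:
        return ContextStrategy.RESET_WITH_SUMMARY
    if {source, target} == {"booking", "modification"}:
        return ContextStrategy.APPEND
    raise ValueError(f"no strategy defined for {source} -> {target}")


print(strategy_for("patient_id", "booking").name)
```

Raising on an unlisted transition is a deliberate choice in this sketch: a transition the table does not cover should fail loudly in testing rather than silently inherit a default.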
5. Deployment Architecture (AWS ca-central-1)
Resource Map
VPC (10.0.0.0/16) — ca-central-1
├── Public Subnets (10.0.1.0/24, 10.0.2.0/24)
│ ├── ALB (Internet-facing, WSS + HTTPS)
│ ├── NAT Gateway (for private subnet outbound)
│ └── WAF (rate limiting, geo-blocking)
│
├── Private Subnets (10.0.3.0/24, 10.0.4.0/24)
│ ├── ECS Cluster
│ │ ├── Pipecat Agent Service (0.5 vCPU, 1GB, min 2 / max 20)
│ │ └── API Server Service (0.5 vCPU, 1GB, min 2 / max 10)
│ │
│ ├── RDS PostgreSQL (Multi-AZ, db.t3.medium, KMS encrypted)
│ └── ElastiCache Redis (optional, session state)
│
├── Secrets Manager (OSCAR creds, JWT, API keys)
├── S3 (audit logs 6yr → Glacier, call recordings, backups)
├── CloudWatch (log groups, metrics, alarms, dashboards)
├── EventBridge (data retention, health checks)
└── Lambda (cleanup, archival, health probes)
CI/CD Pipeline
GitHub Push → GitHub Actions
├── [Test] vitest + pytest
├── [Build] Docker multi-stage → ECR
├── [Deploy-Staging] ECS update → Coval smoke test (5 scenarios)
└── [Deploy-Prod] ECS update — manual approval → canary 10%→50%→100%
6. Compliance & Security Architecture
BAA Chain (Target)
Clinic (Covered Entity)
│
├── VitaraVox (Business Associate) ← BAA with clinic
│ ├── AWS (Subprocessor) ← AWS BAA
│ ├── Azure (Subprocessor) ← Microsoft DPA (automatic BAA)
│ ├── Deepgram (Subprocessor) ← Deepgram BAA
│ ├── ElevenLabs (Subprocessor) ← Enterprise BAA
│ └── Telnyx (Subprocessor) ← Telnyx BAA
│
└── OSCAR EMR ← Existing clinical relationship
Security Controls Matrix
| Control | Implementation | HIPAA | PHIPA | PIPA |
|---|---|---|---|---|
| Encryption in transit | TLS 1.3 (ALB), DTLS-SRTP (WebRTC) | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Encryption at rest | KMS (RDS, S3, Secrets Manager) | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Access control | IAM roles, JWT, RBAC | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Audit logging | CloudWatch → S3 (6yr retention) | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| PHI redaction | Pino redaction logic preserved | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Data residency | All resources in ca-central-1 | N/A | :white_check_mark: | :white_check_mark: |
| Consent disclosure | Voice disclosure at call start | N/A | :white_check_mark: | :white_check_mark: |
| Breach detection | CloudTrail + GuardDuty | :white_check_mark: | :white_check_mark: | :white_check_mark: |
Consent Flow
1. Call arrives → Pipecat answers
2. Router node plays consent disclosure:
EN: "Welcome to [clinic]. This call is assisted by AI for scheduling.
Your information is kept private and secure. Continue?"
ZH: "欢迎致电[诊所]。本次通话由AI辅助进行预约服务。
您的信息将被保密和安全处理。请问您是否愿意继续?"
3. "No" → Transfer to staff (SIP transfer)
4. "Yes" → Proceed to Patient-ID flow
5. Consent flag stored in call metadata
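Steps 3 to 5 of the consent flow reduce to recording the caller's answer and choosing the next step. The following is a minimal sketch under the assumption that the yes/no interpretation of the caller's speech has already happened upstream; `CallMetadata`, `handle_consent`, and the step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallMetadata:
    """Per-call record; consent_given stays None until the caller answers."""

    language: str
    consent_given: Optional[bool] = None


def handle_consent(metadata: CallMetadata, caller_agreed: bool) -> str:
    """Store the consent flag and return the name of the next step."""
    metadata.consent_given = caller_agreed
    return "patient_id" if caller_agreed else "transfer_to_staff"


call = CallMetadata(language="en")
print(handle_consent(call, caller_agreed=True), call.consent_given)
```

Storing the flag before branching means a declined call still leaves a consent record in the metadata, which matters for the audit trail.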
7. Cost Analysis
Monthly Cost Comparison (10 Clinics, ~50K minutes/month)
| Line Item | Current (Vapi + OCI) | Target (Pipecat + AWS) |
|---|---|---|
| Platform/Infra | ||
| Vapi platform fee | $2,500 | $0 |
| Vapi HIPAA add-on | $1,000 | $0 |
| OCI ARM instance | ~$50 | $0 |
| ECS Fargate (agents + API) | $0 | ~$600 |
| RDS PostgreSQL | $0 | ~$150 |
| ALB + NAT + S3 + CloudWatch | $0 | ~$125 |
| Subtotal Infra | $3,550 | $875 |
| AI Providers (at cost) | ||
| LLM (GPT-4o) | ~$1,500 | ~$1,500 |
| STT (Deepgram/Azure) | ~$500 | ~$500 |
| TTS (ElevenLabs/Azure) | ~$400 | ~$400 |
| Subtotal Providers | $2,400 | $2,400 |
| TOTAL | $5,950/month | $3,275/month |
| Annual | $71,400 | $39,300 |
| Annual Savings | — | $32,100 (45%) |
ROI Calculation
- Migration cost: $120,600 (one-time, revised estimate)
- Annual savings: $32,100 (at 50K min/month)
- Payback period: ~45 months at 50K min/month
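- At 100K min/month: payback period of ~22 months. Note that the $36,000 annual savings in the Break-Even Analysis would instead imply ~40 months ($120,600 ÷ $36,000 × 12); the two sections assume different per-minute infrastructure costs.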
- Savings improve dramatically with scale
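The payback figure at 50K minutes/month follows directly from the two numbers above. A minimal sketch of the calculation, using a simple undiscounted payback (no interest or inflation adjustment, which is an assumption of this sketch):

```python
def payback_months(migration_cost: float, annual_savings: float) -> float:
    """Months until cumulative savings cover the one-time migration cost."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return migration_cost / annual_savings * 12


# Inputs from this section: $120,600 one-time cost, $32,100 saved per year.
months_at_50k = payback_months(120_600, 32_100)
print(f"{months_at_50k:.1f} months")
```

The same function can be rerun with any other savings estimate to see how sensitive the payback period is to call volume.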
8. Migration Phases & Effort
Phase Summary (Revised v1.1)
| Phase | Weeks | Hours | Cost ($150/hr) | Deliverable |
|---|---|---|---|---|
| 0: Foundation | 2 | 80 | $12,000 | AWS infra, DB migrated, CI/CD skeleton |
| 1: Core Agent | 6 | 264 | $39,600 | 9 Pipecat nodes + 14 tools + telephony |
| 2: Platform & Testing | 4 | 160 | $24,000 | Multi-tenant, monitoring, 76 test cases |
| 3: Shadow & Cutover | 4 | 80 | $12,000 | Parallel run, A/B testing, production switch |
| 4: Hardening | 2 | 40 | $6,000 | DR testing, security audit, docs |
| Buffer | 2 | 180 | $27,000 | Expert-review gap coverage |
| TOTAL | 20-22 | 804 | $120,600 | |
Expert-Identified Effort Gaps
| Gap | Hours | Why Missed |
|---|---|---|
| STT/TTS dual-track switching (EN↔ZH) | +40h | Pipecat Flows not designed for mid-call pipeline reconfig |
| Call recording (custom FrameProcessor) | +32h | Vapi provides recordingUrl; self-hosted does not |
| End-of-call reporting (transcript/summary/cost) | +28h | Must build from scratch without Vapi |
| Telephony integration (Telnyx WS + call transfer) | +24h | Self-hosted Telnyx WS poorly documented |
| Python/Pipecat ramp-up | +40h | Async pipeline model differs from Express.js |
| Filler phrase timing differences | +16h | Vapi plays during tool exec; Pipecat plays before |
9. Automation Strategy
What CAN Be Automated
| Component | Method | Effort Saved |
|---|---|---|
| Infrastructure provisioning | CDK (TypeScript) | 90% |
| Database migration | pg_dump → RDS restore | 100% |
| Secret migration | Script: .env → Secrets Manager | 100% |
| Docker builds + deployments | GitHub Actions | 100% |
| SSL certificates | ACM auto-renewal | 100% |
| Auto-scaling | ECS service auto-scaling | 100% |
| Audit log archival | EventBridge → Lambda → S3 | 100% |
| Voice agent testing | Coval/Hamming scenarios | 80% |
What CANNOT Be Automated
| Component | Why |
|---|---|
| Prompt engineering | Creative, iterative — same 9 prompts need re-tuning for Pipecat |
| BAA execution | Legal process with each vendor |
| Turn detection tuning | Empirical optimization with real calls (2 weeks) |
| OSCAR SOAP quirks | Per-instance variations require manual adaptation |
| Consent language review | Lawyer review of disclosure text |
| Privacy Impact Assessment | Regulatory requirement, external consultant |
10. Testing & Quality Controls
Voice Quality Targets
| Metric | Target | Alert Threshold |
|---|---|---|
| Voice-to-voice latency | < 1500ms P50 | > 2000ms P50 |
| First response time | < 2000ms P95 | > 3000ms P95 |
| Tool call round-trip | < 4000ms P95 | > 5000ms P95 |
| Call success rate | > 90% | < 85% |
| Booking completion rate | > 80% | < 70% |
| Language detection accuracy | > 95% | < 90% |
| Turn detection accuracy | > 90% | < 85% |
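The latency rows in this table are percentile thresholds, so an alert check needs a percentile over a window of samples. Below is a minimal sketch using the nearest-rank method; the metric keys and the choice of nearest-rank are assumptions of this example, and a production setup would more likely rely on CloudWatch percentile statistics.

```python
import math

# Metric -> (percentile, alert threshold in ms), from the table above.
ALERT_THRESHOLDS_MS = {
    "voice_to_voice": (50, 2000),
    "first_response": (95, 3000),
    "tool_round_trip": (95, 5000),
}


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def breached(metric: str, samples_ms: list) -> bool:
    """True when the metric's percentile exceeds its alert threshold."""
    pct, limit_ms = ALERT_THRESHOLDS_MS[metric]
    return percentile(samples_ms, pct) > limit_ms


latencies_ms = [900, 1200, 1400, 1600, 2600]
print(percentile(latencies_ms, 50), breached("voice_to_voice", latencies_ms))
```

Note that a single slow call (2600 ms here) does not trip the P50 alert, which is the intended behavior of a median-based threshold.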
Test Matrix
| Test Type | Tool | Automation | Frequency |
|---|---|---|---|
| Unit (Python) | pytest | Full | Every push |
| Unit (Node.js) | vitest | Full | Every push |
| Voice E2E | Coval | Full | Every deploy |
| Load | k6/Locust | Semi-auto | Weekly |
| Security | OWASP ZAP | Full | Weekly |
| Compliance | Custom checklist | Manual | Monthly |
| Regression | Coval (76 cases) | Full | Every deploy |
| Chaos | AWS FIS | Semi-auto | Monthly |
11. Devil's Advocate Review
"Why not just fix Vapi's compliance gaps?"
Valid. Vapi can be made HIPAA-compliant: enable HIPAA mode ($1K/month), execute BAAs, add consent disclosure, implement server-side audit logging. Cost: ~$15K + 2 weeks. Limitation: no Canadian data residency.
Verdict: Right approach if Canadian data residency is not required.
"Pipecat Cloud is only 6 weeks old. Ready for healthcare?"
Legitimate concern. Daily (parent company) has 10+ years in real-time infrastructure. 1,000+ beta teams over 9 months. Self-hosting eliminates Pipecat Cloud dependency entirely — you use the open-source framework on your own AWS.
"The team knows Node.js, not Python."
True. Hybrid approach mitigates: Pipecat agents in Python (thin HTTP wrappers), API server stays Node.js (all business logic). The tool handlers are wrappers — complexity stays in Node.js.
"ECS Fargate is more expensive than OCI ARM."
True at current scale ($600/mo vs $50/mo). But OCI ARM is a single point of failure with no auto-scaling, no multi-AZ, no native health checks. The $550/month premium buys HA and managed infrastructure.
"Why not Pipecat Cloud instead of self-hosting?"
Pipecat Cloud has no Canadian region (only us-west, us-east, eu-central, ap-south). For Canadian healthcare: self-hosted on AWS ca-central-1 is required.
12. Risk Register
Top Risks by Severity
| ID | Risk | Prob | Impact | Severity | Mitigation |
|---|---|---|---|---|---|
| C1 | BAA execution delays | High | High | Critical | Start BAA process immediately. Parallel-track all vendors. |
| C5 | PHI leak in logs/metrics | Medium | Critical | Critical | Automated log scanning in CI/CD. Redaction verified every deploy. |
| T1 | Pipecat framework bug | Medium | High | High | Self-host allows code inspection. Pin stable version. Keep Vapi fallback. |
| T3 | Turn detection regression | Medium | High | High | A/B test configs. 2-week tuning budget. Fallback to transcription-based. |
| T4 | Telephony integration fails | Low | Critical | High | Telnyx documented SIP. Twilio as fallback. Test in staging. |
| F1 | Migration takes longer | High | Medium | High | 20% buffer. Phase gates with go/no-go. |
| R2 | Scope creep | High | Medium | High | Strict feature freeze during migration. |
| O2 | Provider API outage | Medium | High | High | Azure fallback for all providers. Circuit breaker auto-switches. |
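The automated log scanning named as the mitigation for risk C5 could start as a check that sensitive fields in structured JSON logs only ever carry the redaction marker. This is a sketch, not the project's scanner: the field names in `SENSITIVE_KEYS` are hypothetical, and the marker is assumed to be Pino's default censor string.

```python
import json

SENSITIVE_KEYS = {"phone", "healthCardNumber", "dateOfBirth"}  # hypothetical names
REDACTED = "[Redacted]"  # assumed to match the Pino redaction censor in use


def find_leaks(value, path=""):
    """Return dotted paths of sensitive keys whose value is not redacted."""
    leaks = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            if key in SENSITIVE_KEYS and child != REDACTED:
                leaks.append(child_path)
            else:
                leaks.extend(find_leaks(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            leaks.extend(find_leaks(child, f"{path}[{index}]"))
    return leaks


def scan_log_lines(lines):
    """Scan JSON log lines; return (line_number, path) for each leak found."""
    findings = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # non-JSON lines are out of scope for this sketch
        findings.extend((number, leak) for leak in find_leaks(record))
    return findings


sample = [
    '{"msg": "lookup", "patient": {"phone": "[Redacted]"}}',
    '{"msg": "lookup", "patient": {"phone": "604-555-0100"}}',
]
print(scan_log_lines(sample))
```

A key-based check like this cannot catch PHI embedded in free-text messages, so it complements rather than replaces a review of what gets logged.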
13. Decision Framework
Go / No-Go Criteria
Proceed with migration if ALL true:
- Canadian data residency is required for target clinics
- Expected scale exceeds 50K minutes/month within 12 months
- Budget of $120-145K is available
- At least one engineer can commit full-time for 22 weeks
- BAA process can begin immediately
Stay on Vapi if ANY true:
- No Canadian data residency requirement
- Scale stays under 20K minutes/month
- Budget constrained below $40K
- No Python engineering capability
- Production deadline is less than 8 weeks away
Phase Gate Decisions
| Gate | Criteria | Decision |
|---|---|---|
| Phase 0 → 1 | Infrastructure deployed, DB migrated, CI/CD working | Go/No-Go |
| Phase 1 → 2 | Single booking flow works end-to-end on Pipecat | Go/No-Go |
| Phase 2 → 3 | All 76 test cases passing, multi-tenant working | Go/No-Go |
| Phase 3 → 4 | Shadow testing shows ≥ parity with Vapi on all metrics | Go/No-Go |
Immediate Actions (Regardless of Migration Decision)
| Action | Timeline | Cost |
|---|---|---|
| Execute BAA with Vapi | This week | $1,000/month |
| Enable Vapi HIPAA mode | This week | Included |
| Add consent disclosure to Router | This week | 2 hours |
| Verify sub-processor BAAs | This week | $0 |
| Sign AWS BAA | This week | $0 |
| Begin Pipecat POC (parallel) | Next week | 40 hours (~$6K) |
Appendices
A. Pipecat Flows Conversation Graph
[START]
│
▼
[Consent Node] ──── "No" ──── [Transfer to Staff]
│
"Yes"
│
▼
[Router Node]
│ detect language + intent
│
├── EN:book/reschedule ──► [Patient-ID-EN] ──► [Booking-EN]
├── ZH:book/reschedule ──► [Patient-ID-ZH] ──► [Booking-ZH]
├── EN:cancel ──► [Patient-ID-EN] ──► [Modification-EN]
├── ZH:cancel ──► [Patient-ID-ZH] ──► [Modification-ZH]
├── EN:register ──► [Registration-EN]
├── ZH:register ──► [Registration-ZH]
└── unknown ──► [Transfer to Staff]
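The routing branches in the graph above amount to a lookup from (language, intent) to an ordered list of nodes. A minimal sketch, with node names invented for illustration and the consent step assumed to have already passed:

```python
LANGUAGES = ("en", "zh")

# Intent -> node sequence template, mirroring the branches in the graph.
INTENT_PATHS = {
    "book": ["patient_id_{lang}", "booking_{lang}"],
    "reschedule": ["patient_id_{lang}", "booking_{lang}"],
    "cancel": ["patient_id_{lang}", "modification_{lang}"],
    "register": ["registration_{lang}"],
}
FALLBACK_PATH = ["transfer_to_staff"]


def route(language: str, intent: str) -> list:
    """Return the ordered node names for a detected language and intent."""
    if language not in LANGUAGES or intent not in INTENT_PATHS:
        return list(FALLBACK_PATH)
    return [node.format(lang=language) for node in INTENT_PATHS[intent]]


print(route("zh", "cancel"))
```

Anything the Router cannot classify, including an unsupported language, falls through to the staff transfer, matching the `unknown` branch.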
B. Provider Fallback Chain
Primary Fallback 1 Fallback 2
───────────────────── ─────────────────── ──────────────────
STT EN: Deepgram Azure Speech (CA) Google STT (CA)
STT ZH: Azure Speech AssemblyAI Universal Deepgram zh
LLM: Azure OpenAI Claude via Bedrock GPT-4o (US)
TTS EN: ElevenLabs Cartesia Azure TTS
TTS ZH: Azure TTS ElevenLabs Multi Cartesia
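Each row of the chain is an ordered list to try until one provider succeeds. The sketch below shows that control flow with stub functions standing in for real provider clients; the names `call_with_fallback` and `AllProvidersFailed` are hypothetical, and a production version would add per-provider timeouts and circuit breaker state.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def call_with_fallback(
    providers: Sequence[Tuple[str, Callable[[str], str]]], payload: str
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, result) of the first success."""
    errors = []
    for name, provider in providers:
        try:
            return name, provider(payload)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def failing_provider(_payload: str) -> str:
    """Stub that simulates a provider outage."""
    raise TimeoutError("simulated outage")


def working_provider(payload: str) -> str:
    """Stub that simulates a healthy provider."""
    return payload.upper()


chain = [("deepgram", failing_provider), ("azure_speech", working_provider)]
used, result = call_with_fallback(chain, "hello")
print(used, result)
```

Returning the provider name alongside the result makes it possible to log which fallback level served each request.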
C. Azure OpenAI vs AWS Bedrock
| Dimension | Azure OpenAI (Canada Central) | AWS Bedrock (ca-central-1) |
|---|---|---|
| GPT-4o availability | YES | NO |
| Canadian data residency | YES | YES |
| BAA | Automatic (Microsoft DPA) | Automatic (AWS BAA) |
| STT/TTS in same region | YES (Azure Speech Canada) | NO |
| Single vendor for LLM+STT+TTS | YES | NO |
| Current model compatibility | Same model as Vapi | Would need prompt rewrite |
Verdict: Azure OpenAI for primary LLM. Bedrock for Claude fallback only.
D. Expert Review Critical Findings
- Effort underestimated by 54%: 520h → 804h / $78K → $120.6K
- Context management strategy needed per transition: RESET, RESET_WITH_SUMMARY, APPEND
- EN/ZH STT/TTS switching is non-trivial: Custom ServiceSwitcher pattern needed
- Call recording must be Day 1: PHIPA compliance requires audit trail
- Filler phrase timing differs: Need custom processor between LLM and function executor
- Do NOT rewrite OSCAR adapters: 1,212 lines of battle-tested code, keep Node.js
- POC-first approach: Validate with $6K before committing $120K
Analysis generated February 23, 2026 by Claude Opus 4.6 based on comprehensive review of VitaraVox v3.0 codebase, Pipecat Cloud documentation, AWS ca-central-1 services, and HIPAA/PHIPA/PIPA compliance requirements. Expert-reviewed by CTO Architect and Senior Voice AI Engineer agents.