YANA Voice Chatroom Platform

At a Glance

Realtime media: Twilio Voice

Async workers: Celery + SQS

Frontend: React + Amplify

Backend split: Web + Voice

The Problem

The goal was to create a room-based communication platform that felt immediate and social while remaining deployable on commodity cloud infrastructure. Real-time voice, text messaging, and asynchronous background work all had to coexist without overloading the synchronous web path.

Cloud Architecture

Web Service (EC2 / Flask)

Edge

The web service is a stateless Flask application running on EC2. It handles room creation and lifecycle, user authentication, text message delivery via Socket.IO, and room state reads. Anything that would block the HTTP response — voice setup, push notifications — is handed off to SQS as an async task and acknowledged immediately. Multiple instances run behind the ALB; no in-process state is held between requests.

Why This Component

Statelessness is the property that makes horizontal scaling straightforward. When room membership and Socket.IO session state live in ElastiCache rather than instance memory, any web instance can handle any request. The SQS offload pattern means the web tier never slows down waiting for Twilio API calls — those latencies (50–200ms each) would otherwise directly inflate p99 response times.
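The idea can be illustrated with a minimal sketch. It assumes a shared store standing in for ElastiCache; `SharedPresenceStore` and `WebInstance` are hypothetical names, and a plain in-memory dict replaces Redis so the example runs without any services.

```python
from __future__ import annotations

from collections import defaultdict


class SharedPresenceStore:
    """In-memory stand-in for ElastiCache (assumption: the real store is Redis)."""

    def __init__(self) -> None:
        self._rooms: dict[str, set[str]] = defaultdict(set)

    def add_member(self, room_id: str, user_id: str) -> None:
        self._rooms[room_id].add(user_id)

    def members(self, room_id: str) -> set[str]:
        # Return a copy so callers cannot mutate shared state by accident.
        return set(self._rooms.get(room_id, set()))


class WebInstance:
    """Hypothetical web instance: holds a store handle, no per-room state."""

    def __init__(self, name: str, store: SharedPresenceStore) -> None:
        self.name = name
        self.store = store

    def join_room(self, room_id: str, user_id: str) -> None:
        self.store.add_member(room_id, user_id)

    def list_members(self, room_id: str) -> list[str]:
        return sorted(self.store.members(room_id))


store = SharedPresenceStore()
instance_a = WebInstance("web-a", store)
instance_b = WebInstance("web-b", store)

# The joins land on different instances; either one can serve the read.
instance_a.join_room("room-1", "alice")
instance_b.join_room("room-1", "bob")
print(instance_b.list_members("room-1"))  # ['alice', 'bob']
```

Because neither instance keeps membership in its own memory, the load balancer is free to route each request to whichever instance is available.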

Async by default for third-party calls

Every call to an external API (Twilio, SES, push notification services) is a latency and reliability dependency you are importing into your synchronous request path. Pushing these onto SQS converts a blocking dependency into a queued task: your HTTP response time is bounded by your own code, and a Twilio degradation event does not cascade into elevated error rates on the web tier.
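The offload can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: `queue.Queue` stands in for SQS, `handle_join_voice` for the Flask route, and `slow_twilio_call` for the Twilio request. All three names are hypothetical, and the 202 status code is an assumption about how the acknowledgement might be signalled.

```python
from __future__ import annotations

import queue
import time
import uuid

task_queue: queue.Queue = queue.Queue()  # stand-in for SQS


def slow_twilio_call(room_id: str) -> str:
    """Hypothetical third-party call; the sleep imitates network latency."""
    time.sleep(0.2)
    return f"token-for-{room_id}"


def handle_join_voice(room_id: str, user_id: str) -> tuple[dict, int]:
    """Enqueue the voice setup and acknowledge without calling Twilio."""
    task = {"task_id": str(uuid.uuid4()), "room_id": room_id, "user_id": user_id}
    task_queue.put(task)
    # 202 Accepted (assumed): the work is queued, not finished.
    return {"task_id": task["task_id"], "status": "queued"}, 202


def run_one_task() -> dict:
    """Worker side: the slow call happens here, off the request path."""
    task = task_queue.get()
    token = slow_twilio_call(task["room_id"])
    task_queue.task_done()
    return {"user_id": task["user_id"], "token": token}


start = time.perf_counter()
body, status = handle_join_voice("room-1", "alice")
request_seconds = time.perf_counter() - start  # excludes the 0.2 s call

result = run_one_task()
print(status, body["status"], result["token"])
```

The request handler's duration depends only on the enqueue, so a slow or failing third-party call affects the worker rather than the HTTP response.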

Trade-offs

Statelessness makes scaling easy but requires ElastiCache to be highly available: if Redis goes down, web instances cannot resolve Socket.IO session routing or room membership, so presence data becomes unavailable or stale. The SQS offload adds end-to-end latency for voice setup (the user waits for a Celery worker to pick up the task) but keeps synchronous Twilio failures from surfacing as HTTP 5xx.

Alternatives

Keeping Socket.IO state in-process would require sticky sessions for WebSocket connections, which is a common trap: it makes scaling hard and hides the statefulness problem rather than solving it. Moving to a fully async model (every action queued, never a synchronous response with side effects) would be cleaner but significantly more complex to build and debug.

Sync vs. Async: Two Paths Through the System

The web service handles text messaging synchronously — authenticate, persist, read presence, broadcast, return. Voice setup follows an asynchronous path: the HTTP response returns immediately after enqueueing an SQS task, and the Celery worker delivers the Twilio access token to the client over the existing Socket.IO connection once it's ready. This separation keeps the synchronous path fast and prevents third-party API latency from blocking every room join.

Synchronous: Text messaging

1. HTTP request arrives: Flask handles auth, validates input
2. Persist to DocumentDB: Room state, message history
3. Read presence from ElastiCache: Room membership, Socket.IO session routing
4. Broadcast via Socket.IO: Real-time event to room members
5. Return 200 to caller: Synchronous path complete
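The five synchronous steps can be sketched in one function. This is a simplified illustration rather than the service's actual handler: the lists and dicts are in-memory stand-ins for DocumentDB, ElastiCache, and Socket.IO, and `send_text_message`, `valid_tokens`, and `outbox` are hypothetical names.

```python
from __future__ import annotations

# In-memory stand-ins (assumptions): the source uses DocumentDB,
# ElastiCache, and Socket.IO for these three roles.
message_store: list[dict] = []
presence: dict[str, set[str]] = {"room-1": {"alice", "bob"}}
outbox: dict[str, list[dict]] = {}
valid_tokens = {"secret-alice": "alice"}


def send_text_message(auth_token: str, room_id: str, text: str) -> tuple[dict, int]:
    # 1. Authenticate and validate input.
    user_id = valid_tokens.get(auth_token)
    if user_id is None:
        return {"error": "unauthorized"}, 401
    if not text.strip():
        return {"error": "empty message"}, 400

    # 2. Persist the message (DocumentDB in the source).
    message = {"room_id": room_id, "from": user_id, "text": text}
    message_store.append(message)

    # 3. Read room membership (ElastiCache in the source).
    members = presence.get(room_id, set())

    # 4. Broadcast to every member (Socket.IO in the source).
    for member in members:
        outbox.setdefault(member, []).append(message)

    # 5. Return only after all of the above has finished.
    return {"delivered_to": len(members)}, 200


print(send_text_message("secret-alice", "room-1", "hello"))
```

Every step runs inside the request, which is acceptable here because each one touches only infrastructure the platform controls.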

Asynchronous: Voice setup

1. Enqueue task to SQS: Celery task message serialized and sent
2. HTTP responds immediately: Caller does not wait for task completion
3. Celery worker picks up task: At-least-once delivery from SQS
4. Call Voice Service: Generate Twilio access token, create TwiML room
5. Deliver result via Socket.IO: Token pushed to client over existing WS connection
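The worker side of this path can be sketched as below, again with in-memory stand-ins: `queue.Queue` for SQS and a dict for Socket.IO pushes. `issue_voice_token` is a hypothetical placeholder for the Voice Service call. The duplicate check is an assumption added for illustration, since at-least-once delivery means a worker may see the same task twice; the source does not say how the platform handles redelivery.

```python
from __future__ import annotations

import queue

task_queue: queue.Queue = queue.Queue()     # stand-in for SQS
socket_outbox: dict[str, list[dict]] = {}   # stand-in for Socket.IO pushes
processed_task_ids: set[str] = set()        # assumed dedupe record


def issue_voice_token(room_id: str, user_id: str) -> str:
    """Hypothetical Voice Service call; a real one would contact Twilio."""
    return f"voice-token:{room_id}:{user_id}"


def process_task(task: dict) -> bool:
    """Handle one task; return False if it was a duplicate delivery."""
    if task["task_id"] in processed_task_ids:
        return False  # at-least-once delivery can redeliver a task
    token = issue_voice_token(task["room_id"], task["user_id"])
    # Push the result to the client over its existing connection.
    socket_outbox.setdefault(task["user_id"], []).append(
        {"event": "voice_token", "token": token}
    )
    processed_task_ids.add(task["task_id"])
    return True


def drain_queue() -> int:
    """Process everything currently queued; return the number of pushes made."""
    pushes = 0
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            return pushes
        if process_task(task):
            pushes += 1


task = {"task_id": "t-1", "room_id": "room-1", "user_id": "alice"}
task_queue.put(task)
task_queue.put(task)  # simulate a redelivery of the same message
pushes_made = drain_queue()
print(pushes_made, socket_outbox["alice"])
```

In this sketch the client receives exactly one token even though the task was delivered twice.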

Request Flows

Two flows illustrate how synchronous and asynchronous paths coexist. Joining a voice room traces the full async path, from the initial HTTP join through SQS task dispatch, Celery worker execution, Twilio token generation, and Socket.IO callback delivery. Sending a text message takes the fast synchronous path through DocumentDB and Socket.IO broadcast.

Key Decisions

1. Separated web and voice services so scaling and failure characteristics for chat versus voice traffic could evolve independently.
2. Used Celery workers with SQS to offload asynchronous work instead of forcing all event handling through synchronous app servers.
3. Put backend services in private subnets behind load balancers to keep the public attack surface narrow.
4. Combined managed AWS primitives with application-level real-time tooling to keep infrastructure understandable for a small team.

Outcomes

Delivered a functioning room-based platform for real-time voice and text communication during the COVID era.
Created a modular cloud architecture that could scale the web tier, voice tier, and worker tier independently.
Improved responsiveness by moving call setup and background processing into asynchronous worker flows.

Lessons Learned

1. Real-time products become distributed systems very quickly: queues, workers, and failure isolation matter even at modest scale.
2. Voice features are operationally different from standard web requests, so isolating them pays off.
3. Security groups and private networking are easy to skip in prototypes, but they become essential as soon as you expose user communication.
4. A small amount of asynchronous architecture goes a long way when the alternative is blocking on third-party APIs.

Cloud architecture diagram components: React Client (Amplify + CloudFront), Load Balancer (VPC), Web Service (EC2 / Flask), Voice Service (EC2 / Flask + Twilio), Celery Worker Fleet (SQS)