Conversation Capture: streaming audio in React Native.
This is the capture pipeline behind our clinical scribe. It records a live patient encounter on a clinician’s phone and streams it to the cloud reliably enough to draft a note, even on hospital Wi-Fi that drops. Here’s how we built low-latency capture, chunked buffering and resumable upload in React Native.
Listening to the world, in real time
A clinical encounter doesn’t pause for the software. Our clinical scribe has to capture a live conversation on a clinician’s phone and stream it to the cloud continuously: low latency, lossless, resilient to the dead zones every hospital has. Here’s the capture-and-streaming pipeline we built in React Native.
Most audio libraries handle local playback or recording. Continuous, low-latency streaming adds another layer of complexity: buffering, threading, and native-bridge synchronization.
What conversation capture means
Conversation capture means passively recording audio from the environment and transmitting it to another endpoint, whether a remote server, a local analyzer, or a visualization module. Unlike simple recording, streaming is a continuous flow: small packets are captured, encoded, transmitted and consumed in near real time.
Why this is hard in React Native
React Native isn’t designed for continuous audio streams out of the box. Four problems dominated the work.
Real-time capture
Getting microphone data continuously, on both Android and iOS.
Low-latency transport
Sending small audio chunks fast enough for a real-time experience.
Thread synchronization
The JS bridge introduces timing drift if threading is left unmanaged.
Platform inconsistency
iOS background-audio permissions and Android buffer behavior each need their own handling.
The streaming architecture
Rather than a WebSocket, we transmit through Azure’s buffered store: individual chunks are written by index over REST, and the upload is committed when the final chunk arrives. The stream exposes two operations, updateChunk and commitChunk; the client-side UploadStream interface shown later in this post exposes the corresponding upsertChunk and commitChunks methods.
updateChunk
Writes an individual chunk at a specific index. The call is idempotent, so retrying a chunk replaces the earlier copy instead of duplicating it.
commitChunk
Writes final size to the audio header and flags the final chunk of the file.
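To make the two operations concrete, here is a minimal in-memory sketch rather than the service's actual API. The `ChunkStore` class is hypothetical, and it assumes chunk 0 begins with a canonical 44-byte PCM WAV header; the real store sits behind REST calls.

```typescript
// Hypothetical in-memory stand-in for the buffered store (illustrative names).
const WAV_HEADER_BYTES = 44; // assumes the canonical PCM WAV header layout

class ChunkStore {
  private chunks = new Map<number, Uint8Array>();
  private committed = false;

  // Writing the same index twice replaces the earlier copy, so retries are safe.
  updateChunk(index: number, data: Uint8Array): void {
    if (this.committed) throw new Error("stream already committed");
    this.chunks.set(index, data.slice());
  }

  // Concatenate chunks in index order, then patch the two size fields in the header.
  commitChunk(): Uint8Array {
    const indices = Array.from(this.chunks.keys()).sort((a, b) => a - b);
    indices.forEach((index, position) => {
      if (index !== position) throw new Error(`missing chunk ${position}`);
    });
    const total = indices.reduce((n, i) => n + this.chunks.get(i)!.byteLength, 0);
    if (total < WAV_HEADER_BYTES) throw new Error("no WAV header present");

    const file = new Uint8Array(total);
    let offset = 0;
    for (const i of indices) {
      const chunk = this.chunks.get(i)!;
      file.set(chunk, offset);
      offset += chunk.byteLength;
    }

    const view = new DataView(file.buffer);
    view.setUint32(4, total - 8, true); // RIFF chunk size = file size - 8
    view.setUint32(40, total - WAV_HEADER_BYTES, true); // data sub-chunk size
    this.committed = true;
    return file;
  }
}
```

Because the header sizes are only known once recording stops, patching them at commit time lets every earlier chunk upload without waiting for the final length.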
From a continuous stream to discrete chunks
The UploadBuffer manages the real-time streaming of data into uploadable chunks, turning a continuous source (audio, video, sensor data) into manageable units the queue can ship.
Stream-to-chunk transformation
Bounds memory during long recordings, creates network-optimal chunks, and enables granular progress reporting and partial recovery.
Async write management
A sequential write queue guarantees ordering, never blocks the UI, isolates failures, and handles backpressure.
Dynamic file management
Append-only writes with precise offset tracking, and graceful recovery of files left from previous sessions.
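The sequential write queue can be approximated with a promise chain. This is a sketch under assumptions: `SerialQueue` and `enqueue` are hypothetical names, not the internals of our write queue, and backpressure handling is left out.

```typescript
// Hypothetical serial task queue: each task starts only after the previous one settles.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(task);
    // Swallow the outcome on the chain itself so one failed write
    // does not block the writes queued behind it.
    this.tail = run.then(
      () => undefined,
      () => undefined,
    );
    return run; // the caller still sees the task's own result or error
  }
}
```

The caller never awaits the queue from the UI path; it enqueues and returns, which is what keeps slow disk I/O off the interaction thread.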
The core write method defers file initialization until the first write, calculates chunk boundaries atomically, and queues every file operation to preserve order:
```typescript
public write(base64Data: string): void {
  // Count decoded bytes up front so chunk accounting reflects every write() call.
  const data = Buffer.from(base64Data, "base64");
  this.currentChunkSize += data.byteLength;

  // Every file operation goes through the write queue to preserve ordering.
  this._writeQueue.queue(Math.random().toString(), async () => {
    // Lazy initialization: resume from an existing file's size, or start at 0.
    if (this.currentFileOffset === undefined) {
      if (await FS.exists(this._uploadQueue.filePath)) {
        const { size } = await FS.stat(this._uploadQueue.filePath);
        this.currentFileOffset = size;
      } else {
        this.currentFileOffset = 0;
      }
    }

    await FS.appendFile(this._uploadQueue.filePath, base64Data, "base64");
    // Once a full chunk has accumulated, hand its byte range to the upload queue.
    if (this.currentChunkSize >= this._chunkSize) {
      const offset = this.currentFileOffset;
      this.currentFileOffset += this._chunkSize;
      this.currentChunkSize -= this._chunkSize;
      this._uploadQueue.queue(offset, this._chunkSize);
    }
  });
}
```

Lazy file initialization
File operations are deferred until the first write, avoiding unnecessary I/O and handling existing files gracefully.
Atomic chunk processing
Byte counts and offsets are updated atomically with chunk creation, then handed off to the upload queue.
Asynchronous coordination
All file operations are queued, preserving strict write ordering while never blocking on slow I/O.
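The boundary arithmetic can be looked at separately from file I/O. The sketch below is an illustration under assumptions, not the UploadBuffer source: `ChunkPlanner` and `ChunkRange` are hypothetical names. It uses a loop rather than a single check, so a write larger than one chunk emits every full chunk it completes, and it adds a `flush` step for the trailing partial chunk.

```typescript
interface ChunkRange {
  offset: number;
  length: number;
}

// Hypothetical helper that turns a stream of write sizes into fixed-size byte ranges.
class ChunkPlanner {
  private pending = 0; // bytes written but not yet assigned to a chunk
  private nextOffset: number;
  private chunkSize: number;

  constructor(chunkSize: number, startOffset = 0) {
    if (chunkSize <= 0) throw new Error("chunkSize must be positive");
    this.chunkSize = chunkSize;
    this.nextOffset = startOffset;
  }

  // Record newly appended bytes and return every full chunk now available.
  write(byteLength: number): ChunkRange[] {
    this.pending += byteLength;
    const ready: ChunkRange[] = [];
    while (this.pending >= this.chunkSize) {
      ready.push({ offset: this.nextOffset, length: this.chunkSize });
      this.nextOffset += this.chunkSize;
      this.pending -= this.chunkSize;
    }
    return ready;
  }

  // Emit the trailing partial chunk, if any, when recording stops.
  flush(): ChunkRange | undefined {
    if (this.pending === 0) return undefined;
    const last = { offset: this.nextOffset, length: this.pending };
    this.nextOffset += this.pending;
    this.pending = 0;
    return last;
  }
}
```

Keeping this logic free of I/O makes it straightforward to unit test the offsets that the upload queue will later read from disk.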
Chunked, resumable, persistent uploads
Mobile uploads face unreliable networks, app backgrounding, and memory constraints. The UploadQueue is a chunked, resumable and persistent upload system built for exactly those conditions.
Chunked strategy
Files break into small chunks: memory-efficient, individually retryable, and progress-trackable.
Persistent state
Upload state survives app sessions: pending chunks, completion metadata, and error state.
Network-aware
Automatic pause when offline, resume on reconnect, and in-flight request cancellation.
The queue uploads through a small stream interface:

```typescript
export interface UploadStream<T> {
  upsertChunk(data: string | Uint8Array, chunkIndex: number): Promise<void>;
  commitChunks(completionState: T): Promise<void>;
}
```

Creating a queue wires up persistent storage and re-queues any chunks still pending:

```typescript
public static async create<T extends UploadStream<any>>(
  filePath: string,
  userPartition: string,
  targetStream: T,
) {
  const bufferedStorage = await BufferStorage.create<string, QueuedUploadStat>({
    storage: new AsyncStorage<QueuedUploadStat>(`${userPartition}-queuedUploads`),
    debounceMs: 200,
  });
  const queue = new UploadQueue<T>(filePath, bufferedStorage, targetStream);
  await queue.queuePendingChunks();
  return queue;
}
```

Queuing a chunk assigns the next index, persists the updated state, and triggers uploading:

```typescript
public queue(offset: number, length: number): void {
  const stat = this.stat();
  const index = this.lastIndex === undefined ? 0 : this.lastIndex + 1;
  stat.chunks.push({ offset, length, index });
  this.storage.set(this.storageKey, stat);
  this.lastIndex = index;
  this.queueUpload();
}
```

Each upload step checks connectivity and pause state before taking the next chunk:

```typescript
private async uploadNext(): Promise<void> {
  if (!(await this.networkConsumer.isOnline(true)) || this._paused) {
    this.stopUploads();
    return;
  }

  const stat = this.stat();
  const chunk = stat.chunks.shift();
  if (!chunk) {
    this.stopUploads();
    return;
  }

  // read chunk from file, upsert to the stream, retry on failure
}
```

Connectivity changes resume or cancel uploads:

```typescript
private handleOnlineChange(isOnline: boolean): void {
  if (isOnline) {
    this.queueUpload();
  } else {
    this.uploadQueue.cancelAll();
    this.stopUploads();
  }
}
```

- Failed chunks are returned to the front of the queue for immediate retry.
- Persistent error flags prevent infinite retry loops; backoff can be layered on.
- Sequential indexing guarantees chunks never upload out of order.
- A single active upload at a time prevents resource contention; reads stream on demand.
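Since backoff can be layered on top of the retry behavior, here is one hedged way to do it. `PendingChunk`, `backoffDelayMs` and `settle` are hypothetical names, and the delay and attempt constants are illustrative rather than values from our pipeline.

```typescript
interface PendingChunk {
  index: number;
  offset: number;
  length: number;
  attempts: number;
}

// Exponential backoff with a cap: baseMs * 2^attempts, never above maxMs.
function backoffDelayMs(attempts: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempts));
}

interface SettleResult {
  queue: PendingChunk[]; // remaining work, in upload order
  retryInMs?: number; // set when the chunk should be retried after a delay
  halted: boolean; // true once the chunk has exhausted its attempts
}

// Apply the outcome of one upload attempt. Success drops the chunk; failure
// puts it back at the front so later chunks never overtake it.
function settle(
  rest: PendingChunk[],
  chunk: PendingChunk,
  ok: boolean,
  maxAttempts = 5,
): SettleResult {
  if (ok) return { queue: rest, halted: false };
  const retried = { ...chunk, attempts: chunk.attempts + 1 };
  const queue = [retried, ...rest];
  if (retried.attempts >= maxAttempts) return { queue, halted: true };
  return { queue, retryInMs: backoffDelayMs(retried.attempts), halted: false };
}
```

Halting instead of skipping a chunk that keeps failing preserves the ordering guarantee; the persisted error flag can then surface the problem to the user.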
```typescript
// Create an upload queue for the audio file
const uploadQueue = await UploadQueue.create(
  audioFilePath,
  userPartition,
  new DialogueWav(apolloClients, mediaReference, audioFilePath),
);

uploadQueue.onProgressChanged((progress) => updateProgressBar(progress));
uploadQueue.onUploadingComplete(async () => {
  await processCompletedUpload();
  await cleanupTempFiles();
});

// Queue file chunks
const fileSize = await getFileSize(audioFilePath);
const chunkSize = 64 * 1024; // 64KB chunks
for (let offset = 0; offset < fileSize; offset += chunkSize) {
  const length = Math.min(chunkSize, fileSize - offset);
  uploadQueue.queue(offset, length);
}

// Finalize
uploadQueue.close({ dialogueId, duration: recordingDuration });
```

Recording with react-native-audio-record
The capture layer uses react-native-audio-record for real-time recording, tuned for speech and wired directly into the upload pipeline.
```typescript
export const recorderOptions: RecorderOptions = {
  sampleRate: 16000, // optimal for speech processing
  channels: 1, // mono recording for efficiency
  bitsPerSample: 16, // standard quality for voice
  audioSource: 6, // platform-optimized microphone source
};
```

Automatic safety limits: an 84-minute maximum with a 5-minute warning. Pause and resume preserve session state; silent gaps are detected automatically.
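These settings fix the data rate, which in turn determines how much audio each upload chunk holds. The arithmetic below is a sketch: `PcmFormat` and `bytesPerSecond` are hypothetical helpers, the 64 KB chunk size is taken from the usage example earlier, and the WAV header is ignored.

```typescript
interface PcmFormat {
  sampleRate: number;
  channels: number;
  bitsPerSample: number;
}

// Uncompressed PCM data rate in bytes per second.
function bytesPerSecond(format: PcmFormat): number {
  return format.sampleRate * format.channels * (format.bitsPerSample / 8);
}

const speechFormat: PcmFormat = { sampleRate: 16000, channels: 1, bitsPerSample: 16 };
const uploadChunkBytes = 64 * 1024;

const rate = bytesPerSecond(speechFormat); // 32000 bytes per second
const secondsPerChunk = uploadChunkBytes / rate; // 2.048 seconds of audio per chunk
const maxRecordingBytes = rate * 84 * 60; // 161280000 bytes at the 84-minute limit
const maxChunks = Math.ceil(maxRecordingBytes / uploadChunkBytes); // 2461 chunks
```

At roughly two seconds of audio per chunk, a dropped connection puts only a couple of seconds of upload work at risk of being retried.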
Audio flows directly from the native recording layer into the upload buffer, enabling real-time cloud streaming with no local accumulation:
```typescript
AudioRecord.on("data", (...props) => {
  uploadBuffer.write(...props); // stream to the upload system
  throttledUpdateAudioChunk(...props); // update UI visualization (100ms)
});
```

- Throttled UI updates (100ms) prevent Android UI blocking while the waveform renders.
- A streaming architecture avoids accumulating audio data in memory.
- Keep-awake integration during active recording; cross-platform permission handling with graceful fallback.
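The throttled waveform update can be sketched as two small pieces: a level meter and a throttle. Both are assumptions for illustration (`rmsLevel` and `throttle` are hypothetical, and our actual visualization may compute something different); the throttle shown is leading-edge and simply drops calls that arrive too soon.

```typescript
// Root-mean-square level of 16-bit PCM samples, normalized to the range 0..1.
function rmsLevel(samples: Int16Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, value) => {
    const normalized = value / 32768;
    return acc + normalized * normalized;
  }, 0);
  return Math.sqrt(sumSquares / samples.length);
}

// Leading-edge throttle: run `fn` at most once per `intervalMs`.
// `now` is injectable so the behavior can be tested without real timers.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: A) => void {
  let lastRun = -Infinity;
  return (...args: A) => {
    const current = now();
    if (current - lastRun >= intervalMs) {
      lastRun = current;
      fn(...args);
    }
  };
}
```

Dropping intermediate levels is acceptable for a waveform because the UI only needs a recent value, whereas the upload path must see every chunk.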
iOS and Android pull in different directions
The same pipeline meets two very different runtimes. Designing for both from the start avoids surprises in production.
iOS
- Requires UIBackgroundModes: ['audio'] for background recording.
- More consistent audio frame delivery.
- Fast file writes → minimal timing drift.
Android
- Requires the RECORD_AUDIO permission.
- Throttle UI updates (waveform ~every 100ms).
- Avoid heavy JS work inside the data callback.
Lessons from timing-sensitive audio
- Throttle the JS bridge: send buffer chunks (20-40 ms) instead of per-sample calls.
- Binary WebSocket frames outperform base64 for high throughput.
- Keep UI updates separate: waveform rendering uses throttled state or off-thread animation.
- Use native modules for playback: JS decoding isn’t fast enough for 44.1kHz PCM.
- A backpressure strategy (dropping late chunks) keeps real-time latency bounded under load.
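The drop-late-chunks strategy from the last bullet can be sketched as a bounded queue that evicts the oldest item when full. `BoundedQueue` is a hypothetical name, and this policy only suits live consumers such as a visualizer or monitor; a lossless upload path should never drop data.

```typescript
// Hypothetical drop-oldest queue: latency stays bounded because the backlog cannot grow.
class BoundedQueue<T> {
  private items: T[] = [];
  private droppedCount = 0;
  private capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  // Add an item, evicting the oldest one if the queue is already full.
  push(item: T): void {
    if (this.items.length === this.capacity) {
      this.items.shift();
      this.droppedCount += 1;
    }
    this.items.push(item);
  }

  // Take the oldest remaining item, or undefined when empty.
  shift(): T | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }

  get dropped(): number {
    return this.droppedCount;
  }
}
```

Tracking the dropped count is useful in practice: a steadily rising number signals that the consumer cannot keep up and the capacity or chunk duration needs revisiting.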
Toward a smarter clinical audio layer
Once the pipeline works, it becomes a foundation for clinical audio intelligence: the same architecture that powers real-time note drafting, assistive prompts and richer decision support.
Noise classification
Integrate models that label sounds and environments in real time.
Pattern triggers
Fire actions on sound intensity or recognized patterns.
Full-duplex audio
Use WebRTC for talkback and two-way features.
Cloud intelligence
Stream to speech recognition or anomaly detection services.
Real-time, built entirely in React Native
Building real-time clinical audio streaming pushed the limits of what React Native can do with real-time data. It took native audio APIs, resilient transport, and efficient visualization. But the result was a fully functional real-time capture layer, built entirely in React Native. Whether you’re building a clinical scribe, an AI sound analyzer, or a remote monitor, these same principles can power your architecture.