SOAP notes in 30 seconds: how Larinova generates clinical documentation

Every medical student learns the SOAP format. Subjective: what the patient tells you. Objective: what you observe and measure. Assessment: your clinical interpretation. Plan: what you're going to do about it. It's the universal structure of clinical documentation. But writing SOAP notes by hand after seeing 40 patients is a big part of why doctors burn out. Larinova generates them automatically from the consultation audio. Here's exactly how.
Step 1: Real-time voice capture
When a doctor taps Record in the Larinova app, audio capture begins immediately on-device. We stream audio chunks to Sarvam AI's Saaras speech-to-text engine in real time. There is no batch upload at the end - transcription happens continuously, which means the doctor sees words appearing on screen as they speak.
The real-time streaming approach matters for two reasons. First, it gives the doctor confidence that the system is actually listening and understanding. Second, it allows our NLP pipeline to begin clinical extraction before the consultation ends, which is how we hit the 30-second target for note generation after the conversation finishes.
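The chunk-and-stream loop described above can be sketched roughly as follows. This is a minimal illustration, not Larinova's implementation: the sample rate, chunk duration, and the `transcribe_chunk` callable are all assumptions standing in for the real capture settings and the real speech-to-text call, whose API is not shown here.

```python
from typing import Callable, Iterable, Iterator, List

# Assumed capture settings; the real values are not stated in this post.
SAMPLE_RATE_HZ = 16_000   # 16 kHz mono
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CHUNK_MS = 500            # half-second chunks


def chunk_audio(pcm: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[bytes]:
    """Yield fixed-duration slices of raw PCM audio (the last may be shorter)."""
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def stream_transcript(
    chunks: Iterable[bytes],
    transcribe_chunk: Callable[[bytes], str],  # hypothetical stand-in for the STT call
) -> Iterator[str]:
    """Yield the running transcript after each chunk, so the UI can update live."""
    pieces: List[str] = []
    for chunk in chunks:
        text = transcribe_chunk(chunk)
        if text:
            pieces.append(text)
        yield " ".join(pieces)
```

Because the transcript grows chunk by chunk, downstream steps can start working on the partial text while the consultation is still in progress instead of waiting for a single upload at the end.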
Step 2: Code-mixed speech-to-text
Sarvam's Saaras model handles the core transcription challenge: a doctor speaking Tamil with English medical terms embedded throughout. The model doesn't try to detect language boundaries and switch models - it processes the mixed input natively. Code-switching in Indian medical conversations happens at the word level, not the sentence level. A single clause might contain Tamil verbs, an English drug name, a Tamil quantity expression, and an English diagnosis. Saaras outputs a unified transcript that preserves the meaning regardless of which language each word belongs to.
Step 3: Clinical entity extraction
Once we have the raw transcript, our NLP layer identifies and tags clinical entities: symptoms, vitals mentioned verbally, medications, dosages, frequencies, diagnoses, procedures, and follow-up instructions. This is not simple keyword matching. When a patient says "two weeks-a headache irukku" (I've had a headache for two weeks), the system needs to understand that "two weeks" is a duration, "headache" is a symptom, and the Tamil grammatical frame tells us this is the patient's self-reported complaint - which belongs in the Subjective section.
Note
Our entity extraction model is fine-tuned on anonymized Indian clinical transcripts. It recognizes Indian drug brands (Dolo-650, Pantop-D, Glycomet), regional symptom descriptions, and the specific way Indian doctors communicate clinical findings verbally.
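To make the output of this step concrete, here is a hedged sketch of what a tagged entity might look like for the example utterance. The `Entity` fields, the label names, and the phrase lookup in `tag` are all hypothetical; the lookup is only a toy stand-in for the fine-tuned model, which, as noted above, does not work by keyword matching.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entity:
    text: str      # surface form as it appears in the transcript
    label: str     # hypothetical label set, e.g. "SYMPTOM", "DURATION"
    start: int     # character offset where the entity begins
    end: int       # character offset just past the entity
    speaker: str   # "patient" or "doctor"; used later to pick a SOAP section


def tag(utterance: str, speaker: str, lexicon: Dict[str, str]) -> List[Entity]:
    """Toy tagger: find known phrases and return them in transcript order.

    A real system would replace this lookup with a trained model.
    """
    lowered = utterance.lower()
    found: List[Entity] = []
    for phrase, label in lexicon.items():
        idx = lowered.find(phrase.lower())
        if idx != -1:
            end = idx + len(phrase)
            found.append(Entity(utterance[idx:end], label, idx, end, speaker))
    return sorted(found, key=lambda e: e.start)


entities = tag(
    "two weeks-a headache irukku",
    speaker="patient",
    lexicon={"headache": "SYMPTOM", "two weeks": "DURATION"},
)
```

Here `entities` holds a `DURATION` entity for "two weeks" followed by a `SYMPTOM` entity for "headache", both attributed to the patient, which is the information the next step needs to file them under Subjective.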
Step 4: SOAP structure mapping
With clinical entities tagged, the system maps each piece of information to the correct SOAP section. Patient-reported symptoms, history, and complaints go to Subjective. Vitals, examination findings, and test results go to Objective. The doctor's diagnostic statements and clinical reasoning go to Assessment. Prescriptions, lifestyle advice, referrals, and follow-up schedules go to Plan.
The mapping isn't purely rule-based. Consultations are messy - a doctor might mention a medication while discussing the assessment, then circle back to a symptom they forgot. Our model handles non-linear conversations by building a complete clinical picture first, then organizing it into the SOAP structure.
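The "collect everything first, then organize" idea can be illustrated with a deliberately simplified sketch. The label names and the `SECTION_BY_LABEL` table are assumptions made for this example; the actual mapping is model-driven rather than a fixed lookup, so treat this as the baseline behavior only.

```python
from typing import Dict, List, Tuple

# Hypothetical label-to-section table mirroring the categories listed above.
SECTION_BY_LABEL: Dict[str, str] = {
    "SYMPTOM": "Subjective",
    "HISTORY": "Subjective",
    "VITAL": "Objective",
    "EXAM_FINDING": "Objective",
    "TEST_RESULT": "Objective",
    "DIAGNOSIS": "Assessment",
    "MEDICATION": "Plan",
    "ADVICE": "Plan",
    "REFERRAL": "Plan",
    "FOLLOW_UP": "Plan",
}
SOAP_ORDER = ["Subjective", "Objective", "Assessment", "Plan"]


def build_soap(entities: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (label, text) pairs into SOAP sections, whatever order they were spoken in."""
    note: Dict[str, List[str]] = {section: [] for section in SOAP_ORDER}
    for label, text in entities:
        section = SECTION_BY_LABEL.get(label)
        if section is None:
            continue  # unknown labels are skipped in this sketch
        if text not in note[section]:
            note[section].append(text)  # avoid repeating a finding mentioned twice
    return note


# Entities in the order they came up: medication first, symptom last.
note = build_soap([
    ("MEDICATION", "Dolo-650"),
    ("DIAGNOSIS", "tension headache"),
    ("VITAL", "BP 120/80"),
    ("SYMPTOM", "headache for two weeks"),
])
```

Even though the symptom was mentioned last, it ends up under Subjective at the top of the note, because the sections are filled only after the whole conversation has been seen.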
Step 5: Prescription generation
Alongside the SOAP note, Larinova extracts the prescription as a separate structured document. Each medication gets its own entry with the drug name, dosage, frequency, duration, and any special instructions (before food, after food, with water). The system cross-references each entry against a database of Indian pharmaceutical products to normalize drug names and flag potential issues. For example, if a doctor prescribes Combiflam, which contains ibuprofen, but the patient mentioned an ibuprofen allergy, the system flags the conflict. This isn't a diagnostic tool and we don't override the doctor's judgment. It's a safety net that catches obvious conflicts during the documentation step.
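A structured prescription entry and the allergy cross-check might look something like the sketch below. The `PrescriptionItem` fields follow the list above, but the class name, the `BRAND_INGREDIENTS` table, and `allergy_conflicts` are hypothetical; the two-entry table is only a tiny stand-in for a full pharmaceutical database.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set

# Illustrative stand-in for the product database: brand name -> active ingredients.
BRAND_INGREDIENTS: Dict[str, Set[str]] = {
    "combiflam": {"ibuprofen", "paracetamol"},
    "dolo-650": {"paracetamol"},
}


@dataclass
class PrescriptionItem:
    drug: str
    dosage: str
    frequency: str
    duration: str
    instructions: str = ""  # e.g. "after food"


def allergy_conflicts(items: Iterable[PrescriptionItem], allergies: Iterable[str]) -> List[str]:
    """Return one warning per prescribed ingredient that matches a reported allergy."""
    known = {a.lower() for a in allergies}
    warnings: List[str] = []
    for item in items:
        # Brands missing from the table produce no warning in this sketch.
        ingredients = BRAND_INGREDIENTS.get(item.drug.lower(), set())
        for hit in sorted(ingredients & known):
            warnings.append(
                f"{item.drug} contains {hit}, which the patient reported an allergy to"
            )
    return warnings


items = [
    PrescriptionItem("Combiflam", "1 tablet", "twice daily", "3 days", "after food"),
    PrescriptionItem("Dolo-650", "1 tablet", "as needed", "3 days"),
]
warnings = allergy_conflicts(items, allergies=["Ibuprofen"])
```

The check only produces warnings and never alters the prescription, which matches the role described above: the conflict is surfaced for the doctor to resolve.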
Why 30 seconds and not instant?
The transcription happens in real time, but the SOAP structuring and prescription extraction run as a final pass once the recording stops. This takes about 20-30 seconds depending on consultation length. We could push partial results faster, but we found that doctors prefer to see the complete, structured note rather than watching it assemble in real time. It feels more trustworthy. You stop the recording, and within about 30 seconds the full SOAP note and prescription are ready for review.
The review-first principle
Every SOAP note and prescription Larinova generates is a draft. It appears in an editable interface where the doctor can modify any section before finalizing. We surface the note with clear section labels and inline editing - tap any paragraph to change it. Our internal testing shows that 85% of generated notes need zero edits for routine consultations. But the other 15% is where mistakes hide. Clinical documentation must be accurate. The doctor always has the final word, and the interface is designed to make review and editing as fast as possible.
The goal is not to replace the doctor's clinical judgment. It's to eliminate the mechanical labor of typing what they already said out loud.

