ML Engineer - Product
Remote
Adalat AI is building an end-to-end justice tech stack that automates manual and clerical pain points in courtrooms, giving judges back time to focus on what matters most: decision-making and delivering justice. Our solutions - from AI-powered transcription in Indian languages to case-flow management and document navigation - are now deployed across 9 states, covering nearly 20% of India’s judiciary. Backed by leading technology companies and funders, and incubated at MIT and Oxford, Adalat AI is working to eliminate judicial delays and expand access to timely justice. Founded by a team with backgrounds in law, technology, and economics from Harvard, Oxford, MIT, and IIIT Hyderabad, we are scaling rapidly across India and the Global South.
Role Overview
Adalat AI's ML models don't run in a research lab — they run in courtrooms, for judges who need them to work. A model that achieves strong benchmark numbers is not the same as a model that earns a judge's trust during a seven-hour sitting. The gap between those two things is what you will close.
In this role, you'll own the product-facing ML systems that make our research-grade models useful in practice. That means retrieval pipelines, search agents, legal reasoning systems, and feedback infrastructure — built to work reliably across noisy, multilingual, multi-document workflows. You'll come in with a specialisation in either Text or Speech, or as a generalist who can work across both. Either works. What the role really asks is that you care about whether the system works for real users, and that you have the engineering depth to find out.
Text-side: retrieval and search agents for multimodal, multi-document, and multilingual legal workflows; legal reasoning over court documents and case history; translation and summarisation for judges and stenographers.
Speech-side: production transcription pipelines, diarization, and voice-to-action — turning spoken courtroom interactions into structured, actionable outputs.
Key Responsibilities
1. Build and own product-facing ML systems
Build retrieval and search agents that work across multilingual, multi-document legal corpora.
Develop legal reasoning pipelines that help judges and lawyers navigate case history, orders, and filings.
Build production speech systems: transcription, diarization, and voice-to-action workflows for live courtroom use.
Own translation and summarisation pipelines for legal proceedings across Indian languages.
2. Build and maintain evaluation infrastructure
Design eval harnesses that go beyond standard metrics and measure what actually matters in a courtroom context.
Build feedback pipelines that surface real-world model failures back to the research team.
Own the question: "Does this model work for the people using it?" — and have the infrastructure to answer it.
3. Drive domain adaptation
Translate raw courtroom data — audio, transcripts, documents — into training signal for model improvement.
Identify and close the gap between how models behave in development and how they perform in the field.
Work with the Data Operations team to design annotation pipelines that generate targeted supervision.
4. Bridge research and product
Take what researchers build and make it shippable; translate production failure modes back into concrete research directions.
Surface distribution shifts and edge cases from the field before they become incidents.
Be the person the product team trusts to know what the ML stack can do, and the person researchers trust to know what the product actually needs.
Qualifications
Must have
2–6 years of experience building ML systems in production.
Strong Python and comfort with the modern ML stack (PyTorch, HuggingFace, and relevant tooling).
Hands-on experience with at least one of: LLM fine-tuning and deployment, ASR/speech pipelines, RAG and retrieval systems, or NLP information extraction.
Experience designing and running model evaluations — deciding what to measure, not just running benchmarks.
Comfort shipping and maintaining production code, not just prototypes.
Strong plus
Deep specialisation in Speech (ASR, diarization, multilingual audio) or Text (LLMs, retrieval, legal NLP).
Experience with Indic languages.
Prior work in legal, healthcare, government, or civic tech.
Publications at ML venues.
What You Will Achieve in a Year
In your first year, you'll have ML systems running in live courtrooms that judges and lawyers actually rely on — transcription that holds up against noise, dialect variability, and eight-hour sessions, or legal search and reasoning tools that navigate complex documents reliably enough for a judge to trust before a hearing. You'll have built the eval infrastructure that tells you those systems are working in the field, not just on benchmarks.
You'll have closed at least one feedback loop: real courtroom failures feeding back into measurably better models. The researchers will trust you to land their work in the product. The product team will trust your judgment on what's ready. Between the two, you'll own what it means for ML to actually work at Adalat.
Benefits and Perks
WFH with flexible work hours
Unlimited PTO and generous vacation policy
Autonomy and ownership
Learning and development resources
Smart, humble, and friendly peers
Maternity and paternity leave
Contacts within the Harvard / MIT / Oxford ecosystem