Data Ops Lead – Legal Data Intelligence
Remote
Development
About the Company
Adalat AI builds the AI infrastructure that powers India’s courts and, ultimately, judicial systems across the Global South. Our speech-to-text and reasoning models already run in ~3,000 courtrooms. High-quality, court-admissible legal data is the fuel for those models; the LDI vertical is how we produce it at scale.
Role Overview
You are the operational backbone of LDI. Partnering with lawyer-domain pods and ML model owners, you’ll translate machine-learning requirements into clear, bite-sized data tasks—then design the annotation platforms, pipelines, and QA loops that keep those tasks flowing smoothly, securely, and at scale. Your north star is simplicity: lawyers focus on legal judgment; you make the tooling disappear.
Key Responsibilities
Bridge ML ↔ Legal
Gather dataset specs from model owners (class balance, label granularity, latency, etc.).
Convert them into plain-language briefs, schemas, and checklists that lawyers can execute.
Own the Annotation Stack
Select, configure, and maintain user-friendly labeling tools; build plug-ins for legal-specific metadata (ratio vs obiter, precedent weight, outcome polarity).
Version guidelines; run onboarding and refreshers for annotators.
Pipeline & ETL Automation
Write and maintain scripts (Python/SQL) to ingest, deduplicate, segment, and track every statute, judgment, and pleading.
Ensure full lineage and court admissibility from raw text to final dataset.
Quality & Compliance
Implement dashboards for inter-annotator agreement, drift, and privacy checks; publish fortnightly QA scorecards.
Work with the Director-LDI to enforce licensing and statutory constraints.
Continuous Simplification
Identify bottlenecks, introduce templates or auto-suggestions, and ruthlessly de-scope tasks that add little model value.
Drive active-learning or weak-supervision pilots when they tangibly cut lawyer effort.
Qualifications
5+ years in data operations, annotation-platform management, or ML data engineering, ideally in a regulated domain (legal, healthcare, finance).
Strong hands-on ability with SQL, Python, and workflow/orchestration tools (Airflow, Prefect, etc.).
Proven record of translating abstract stakeholder requirements into concrete, repeatable data tasks.
Demonstrated success driving ≥95% inter-annotator agreement and meeting aggressive release timelines.
Clear communicator who can brief senior lawyers and ML engineers in equal measure.
Nice-to-Have Extras
Experience with legal-text corpora, e-discovery, or multilingual datasets.
Familiarity with LLM fine-tuning or embedding workflows (no need to train models yourself).
Exposure to privacy/bias audits under ISO 27701, GDPR, or upcoming Indian DPDP rules.
Competency in one Indian language beyond English/Hindi for multilingual QA.
Benefits and Perks
WFH with flexible work hours.
Unlimited PTO.
Contacts within the Harvard/MIT/Oxford ecosystem.
Autonomy and ownership.
Smart, humble, and friendly peers.
Generous vacation.
Maternity and paternity leave.
Learning & Development resources.
Join Our Team
Send your resume and a short note on a data-ops problem you’ve solved to careers@adalat.ai with subject line “Data Ops Lead – LDI | .”