
Job Overview
Location
San Francisco
Job Type
Full-time
Category
Software Engineering
Date Posted
June 13, 2026
Full Job Description
📋 Description
- Build and scale the systems that power model and product evaluations across Harvey’s AI platform
- Run intake, triage, and prioritization for the evaluation request queue, directing capacity to the highest-value coverage gaps
- Embed evaluation workflows and readiness checkpoints directly into the product development lifecycle
- Create and maintain the single source of truth for evaluation status, results, historical data, and launch readiness
- Translate expert-designed evaluation methodologies into scalable, repeatable operational processes
- Manage human data providers and establish an internal contract-attorney pipeline to ensure evaluation quality meets legal standards
- Collaborate with Engineering and AI Research teams to improve evaluation tooling, automation, and data dashboards
- Drive evaluation readiness for major product and model launches across multiple geographies and legal jurisdictions
- Document and operationalize evaluation governance frameworks as complexity and scale increase
- Define and implement processes to ensure model accuracy, reliability, and trust at a global scale
- Work closely with Applied Legal Researchers, Product, Engineering, and AI Research to operationalize evaluation as a first-class product capability
- Diagnose and resolve issues in evaluation pipelines, including writing documentation, troubleshooting data flows, and refining metrics
- Apply an ROI-focused mindset to prioritize evaluation efforts and allocate resources efficiently
- Translate technical evaluation nuances for diverse stakeholders including legal experts, product managers, and engineers
- Maintain high standards of clarity, rigor, and reproducibility in all evaluation systems and documentation
- Navigate ambiguity and bring structure to rapidly evolving evaluation requirements in a scaling AI company
- Ensure compliance with jurisdiction-specific legal standards in evaluation design and execution
- Support global expansion by adapting evaluation frameworks to regional legal and regulatory contexts
🎯 Requirements
- 4–7+ years in technical program management, product operations, research operations, or evaluation/benchmarking roles
- Experience working with ML/AI evaluations, benchmarking frameworks, or scientific workflows
- Comfort with statistical methodologies and with SQL, Python, or similar tools for interpreting evaluation data (either natively or with AI tool support)
- Strong business acumen with an ability to apply an ROI-focused mindset to scaling
- Ability to work deeply with legal experts and operationalize complex evaluation methodologies
- Strong cross-functional coordination skills across Product, Engineering, Research, and data providers/vendors
🏖️ Benefits
- Opportunity to build the evaluation infrastructure of a global AI company at a critical inflection point
- Work alongside world-class investors and a team committed to excellence, decisiveness, and simplicity
- High ownership role with unmatched personal, professional, and financial growth potential
- Collaborative environment focused on mission-driven work and pushing boundaries in legal AI
About Harvey AI Inc.
Harvey AI Inc. provides a generative artificial-intelligence platform engineered specifically for the legal profession. The software integrates with law-firm workflows to automate contract drafting, review, due diligence, and regulatory research, producing lawyer-quality language grounded in up-to-date statutes and precedents. Harvey combines large language models trained on legal corpora with secure, private-cloud deployment and firm-specific fine-tuning to maintain confidentiality and compliance. Clients range from global law firms to in-house legal departments seeking efficiency gains without compromising accuracy or security. The company was founded in 2022 and is headquartered in San Francisco, California.