AI Evaluation Engineer

Distyl Inc.

Job Overview

Location

San Francisco

Job Type

Full-time

Full Job Description

📋 Description

• As an AI Evaluation Engineer at Distyl Inc., you will design and implement evaluation systems that drive AI system improvement through Evaluation-Driven Development, ensuring AI behavior is measured, trusted, and iterated upon in production environments for mission-critical customer applications.
• Your day-to-day responsibilities include building and maintaining golden test cases and regression suites in Python, developing offline and online evaluation pipelines, defining quality metrics aligned with business objectives, calibrating LLM-based graders against human judgment, and collaborating with cross-functional teams including Forward Deployed AI Engineers and domain experts to guide system design and deployment.
• Distyl is an applied AI technology company that partners with global enterprises in telecom, healthcare, insurance, manufacturing, and other industries to rearchitect critical operations using frontier AI. Backed by top-tier investors, the company reports a 100% production deployment success rate and operates profitably by tightly coupling measurement with AI development.
• In this role, you will deepen your expertise in AI system evaluation, gain hands-on experience with LLM-based grading and prompt iteration, influence real-world AI deployments at scale, and grow as a systems-oriented engineer who bridges measurement, model behavior, and production reliability.

Skills & Technologies

Python

Work Arrangement

Hybrid

Salary

$150k-250k

About Distyl Inc.

Distyl is an applied AI technology company that works with global enterprises in sectors such as telecom, healthcare, insurance, and manufacturing, using frontier AI to rearchitect their critical operations.
