AI Evaluation Engineer

Job Overview

Location

San Francisco

Job Type

Full-time

Category

Software Engineering

Date Posted

April 24, 2026

Full Job Description

  • As an AI Evaluation Engineer at Distyl Inc., you will design and implement the evaluation systems that drive AI system improvement through Evaluation-Driven Development. Your work ensures that AI behavior in mission-critical customer applications is measured, trusted, and iterated on in production.
  • Day to day, you will build and maintain golden test cases and regression suites in Python, develop offline and online evaluation pipelines, define quality metrics aligned with business objectives, and calibrate LLM-based graders against human judgment. You will also collaborate with cross-functional teams, including Forward Deployed AI Engineers and domain experts, to guide system design and deployment.
  • Distyl is an applied AI technology company that partners with global enterprises in telecom, healthcare, insurance, manufacturing, and other industries to rearchitect critical operations using frontier AI. The company is backed by top-tier investors. By tightly coupling measurement with AI development, it reports a 100% production deployment success rate and operates profitably.
  • In this role, you will deepen your expertise in AI system evaluation, gain hands-on experience with LLM-based grading and prompt iteration, influence real-world AI deployments at scale, and grow as a systems-oriented engineer who bridges measurement, model behavior, and production reliability.

Skills & Technologies

Python

Work Arrangement

Hybrid

Salary

$150k-250k

About Distyl Inc.

Distyl is a cloud-native platform designed to simplify and accelerate the development and deployment of machine learning (ML) models. It provides a unified environment for data preparation, model training, versioning, and deployment, enabling data scientists and ML engineers to move from experimentation to production faster. The platform offers features such as automated data pipelines, managed training infrastructure, and scalable model serving. Distyl aims to reduce the complexity and operational overhead associated with MLOps, allowing organizations to focus on building and deploying impactful ML solutions. It supports various ML frameworks and integrates with existing cloud infrastructure.
