AI Engineer, Quality (Evals)

Job Overview

Location: San Francisco, CA or Remote (USA)

Job Type: Full-time

Category: Data Science

Date Posted: April 22, 2026

Full Job Description

📋 Description

  • As an AI Engineer, Quality (Evals) at Fieldguide, you will own the evaluation infrastructure that ensures AI agents perform reliably at enterprise scale. You will build unified platforms, automated pipelines, and production feedback loops so that new models can be evaluated against critical workflows within hours.
  • You will design and build observability systems, integrate with LangSmith and LangGraph, translate customer problems into agent behaviors, and orchestrate LLMs, tools, retrieval systems, and logic into reliable agent experiences. The role sits at the intersection of ML engineering, observability, and quality assurance.
  • Fieldguide is a remote-first, Series B-backed AI startup headquartered in San Francisco. It builds software for audit and assurance professionals in cybersecurity, privacy, and financial audit, and over 50 of the top 100 accounting and consulting firms trust it to power mission-critical work.
  • You will gain ownership of large product areas, define evaluation standards for the engineering organization, and advocate for evaluation-driven development. This high-visibility, production-focused work directly affects leadership’s ability to communicate AI quality to boards and customers.

About Fieldguide Inc.

Fieldguide builds AI workflow automation software for advisory and assurance firms. It combines generative AI, large language models, and no-code orchestration to digitize risk assessments, evidence collection, report writing, and continuous monitoring across SOC 2, PCI-DSS, HIPAA, HITRUST, ISO 27001, and other compliance frameworks. The cloud platform unifies client portals, project management, collaboration, and analytics so that CPA, cybersecurity, and risk-consulting teams can reduce manual tasks, accelerate audits, and scale advisory services while maintaining security and quality standards for mid-market to enterprise clients.

