Member of Technical Staff - Evals

Job Overview

Location

United States

Job Type

Full-time

Category

Software Engineering

Date Posted

May 12, 2026

Full Job Description

📋 Description

  • As a Member of Technical Staff - Evals at P1 AI Inc., you will play a critical role in ensuring the reliability, performance, and continuous improvement of Archie, the company’s AI engineer designed for engineering AGI. Your work will directly impact the validation of Archie’s capabilities against real-world engineering benchmarks, helping to ensure it learns and retains the skills needed to perform complex engineering tasks across industrial domains.
  • Day to day, you will implement and operate systems for organizing, transforming, running, grading, and reporting on eval benchmarks; design and execute processes for developing and QA’ing evals with input from engineering experts and industrial partners; ensure evals integrate effectively within CI/CD pipelines for continuous benchmarking; create methods to detect AI-specific quality issues like hallucinations, stochasticity, and regressions; and serve as a technical leader in standardizing automated testing practices across the technology stack.
  • You will join a small, high-performing team of top talent in deep learning, model-based engineering, and industrial applications, working closely with founding team members from OpenAI, DeepMind, and other leaders in AI and engineering. The company is mission-driven, backed by $23M in seed funding from Radical Ventures, and focused on deploying Archie across engineering teams in industrial companies worldwide.
  • In this role, you will deepen your expertise in AI evaluation, test system design, and CI/CD integration while contributing to a pioneering effort in engineering AGI. You’ll gain experience collaborating with multidisciplinary stakeholders, shaping evaluation frameworks for next-gen AI systems, and operating in a fast-paced startup environment that values ownership, intellectual excellence, and shipping discipline.

🎯 Requirements

  • Experience constructing comprehensive test suites for software and/or AI systems, including coordinating the contributions of others
  • Experience designing metrics to evaluate systems and visualize their performance, including differences across successive generations
  • Proficiency in Python programming, including complex modules, and in modern software development tools and practices (Git, CI/CD, etc.)
  • Good communication skills with a variety of stakeholders (AI researchers, domain experts, application developers)
  • Ability to thrive in a fast-paced, dynamic startup environment

🏖️ Benefits

  • Healthcare, dental, and vision insurance
  • 401(k) with employer matching
  • Unlimited PTO
  • Significant equity component as part of compensation
  • Opportunity to work remotely (US or Canada) with periodic in-person co-working sessions in San Mateo

Skills & Technologies

Python
Go
Git
Senior
Remote
$170k-200k

About P1 AI Inc.

P1 AI is a software company that builds an AI-powered platform for enterprise revenue teams. Its product integrates data ingestion, predictive analytics, and workflow automation to help sales and marketing organizations prioritize leads, forecast revenue, and personalize outreach at scale. The system continuously learns from CRM, email, and engagement data to surface next-best actions and optimize pipeline performance for B2B companies.
