
Job Overview
Location
United States
Job Type
Full-time
Category
Software Engineering
Date Posted
May 12, 2026
Full Job Description
📋 Description
- As a Member of Technical Staff - Evals at P1 AI Inc., you will play a critical role in ensuring the reliability, performance, and continuous improvement of Archie, the company’s AI engineer designed for engineering AGI. Your work will directly impact the validation of Archie’s capabilities against real-world engineering benchmarks, helping to ensure it learns and retains the skills needed to perform complex engineering tasks across industrial domains.
- Day to day, you will implement and operate systems for organizing, transforming, running, grading, and reporting on eval benchmarks; design and execute processes for developing and QA’ing evals with input from engineering experts and industrial partners; ensure evals integrate effectively within CI/CD pipelines for continuous benchmarking; create methods to detect AI-specific quality issues like hallucinations, stochasticity, and regressions; and serve as a technical leader in standardizing automated testing practices across the technology stack.
- You will join a small, high-performing team of top talent in deep learning, model-based engineering, and industrial applications, working closely with founding team members from OpenAI, DeepMind, and other leaders in AI and engineering. The company is mission-driven, backed by $23M in seed funding from Radical Ventures, and focused on deploying Archie across engineering teams in industrial companies worldwide.
- In this role, you will deepen your expertise in AI evaluation, test system design, and CI/CD integration while contributing to a pioneering effort in engineering AGI. You’ll gain experience collaborating with multidisciplinary stakeholders, shaping evaluation frameworks for next-gen AI systems, and operating in a fast-paced startup environment that values ownership, intellectual excellence, and shipping discipline.
🎯 Requirements
- Experience constructing comprehensive test suites for software and/or AI systems, including coordinating the contributions of others
- Experience designing metrics to evaluate systems and visualize their performance, including differences across successive generations
- Proficiency in Python programming, including complex modules, and in modern software development tools and practices (Git, CI/CD, etc.)
- Good communication skills with a variety of stakeholders (AI researchers, domain experts, application developers)
- Ability to thrive in a fast-paced, dynamic startup environment
🏖️ Benefits
- Healthcare, dental, and vision insurance
- 401k with employer matching
- Unlimited PTO
- Significant equity component as part of compensation
- Opportunity to work remotely (US or Canada) with periodic in-person co-working sessions in San Mateo
About P1 AI Inc.
P1 AI is a software company that builds an AI-powered platform for enterprise revenue teams. Its product integrates data ingestion, predictive analytics, and workflow automation to help sales and marketing organizations prioritize leads, forecast revenue, and personalize outreach at scale. The system continuously learns from CRM, email, and engagement data to surface next-best actions and optimize pipeline performance for B2B companies.
