Member of Technical Staff - Data Quality Engineer (Post-training)

ReflectionAI Inc.

Job Overview

Location: Remote

Job Type: Full-time

Full Job Description

📋 Description

• Join ReflectionAI Inc. as a Member of Technical Staff - Data Quality Engineer (Post-training) and help shape the future of open superintelligence. Our mission is to democratize access to advanced AI by developing open-weight models for a wide range of users, from individuals and agents to enterprises and nation-states. Our team includes AI researchers and company builders who are alumni of DeepMind, OpenAI, Google Brain, Meta, Character.AI, and Anthropic.
• Data is a cornerstone of AI progress. Many significant advances in recent years have come not from novel architectures but from better data. As a member of our Data Team, your primary objective will be to uphold a very high standard for the data used to train and evaluate our models: its quality, its reliability, and its effect on downstream model performance. You will help define how our models perform in critical capabilities such as agentic tool use, long-horizon reasoning, and safety alignment.
• You will collaborate closely with our post-training research teams, turning abstract notions of 'good data' into concrete, quantifiable standards that can be applied across large data campaigns. We are looking for engineers who combine strong engineering fundamentals with deep curiosity about data quality and how it influences model behavior. In this role you will contribute directly to the intelligence and safety of foundation models.
• You will own upstream data quality for LLM post-training and evaluation. This involves analyzing expert-developed datasets and operationalizing quality standards for reasoning, alignment, and agentic use cases. You will partner closely with our research and post-training teams to translate their high-level requirements into precise, measurable quality signals, and you will give external data vendors constructive, actionable feedback so that their deliveries meet our quality bar.
• A key part of the role is designing, validating, and scaling automated quality assurance (QA) methods, including LLM-as-a-Judge, to assess data quality reliably across large data campaigns. You will build robust, reusable QA pipelines that consistently deliver high-quality data to our post-training teams for model training and evaluation. You will also monitor and report on data quality trends to drive iterative improvements in our quality standards, processes, and acceptance criteria.
• We are looking for people who are both technically strong and analytically sharp. You should be adept at spotting subtle failure modes, inconsistencies, and other issues that compromise data quality. A solid understanding of how data quality affects supervised fine-tuning (SFT), reinforcement learning (RL) training, and evaluation is crucial, as is the ability to translate abstract quality concerns into concrete signals, decisions, and clear feedback.
• The ideal candidate has experience designing and validating automated quality checks, using techniques that range from rule-based systems and statistical methods to model-assisted approaches such as LLM-as-a-Judge. You should be comfortable working autonomously, owning problems from inception to resolution, and collaborating with researchers, engineers, and operations partners. This role suits someone who thrives in a dynamic, research-driven environment and cares about the foundations of AI development.
• You will be central to ensuring that our models are trained on the best possible data, directly affecting their ability to reason, act, and align with human values. This is an opportunity to contribute to a mission-driven company at the frontier of AI research and development, and to make a lasting impact on the field.

🎯 Requirements

• Proficiency in Python and experience building ML/LLM workflows, including debugging and writing scalable code.
• Experience designing and validating automated quality checks, including rule-based systems, statistical methods, or model-assisted approaches (e.g., LLM-as-a-Judge).
• Strong engineering fundamentals with experience building data pipelines, QA systems, or evaluation workflows for post-training data and agentic environments.
• Solid understanding of how data quality impacts training (SFT and RL) and evaluation, with the ability to translate quality concerns into concrete signals, decisions, and feedback.

🏖️ Benefits

• Top-tier compensation, including salary and equity designed to attract and retain global talent.
• Comprehensive health and wellness benefits: medical, dental, vision, life, and disability insurance.
• Generous paid time off and a focus on work-life balance.
• Fully paid parental leave for all new parents, with financial support for family planning.

Skills & Technologies

Python

Senior

Remote

About ReflectionAI Inc.

ReflectionAI builds autonomous AI agents for enterprise process automation. The platform lets organizations create, deploy, and manage software agents that observe workflows, make decisions, and act across internal systems. Using reinforcement learning and large language models, agents learn from human guidance and adapt to changing environments. Customers use the technology for customer support triage, IT operations, compliance monitoring, and sales process automation, reducing repetitive manual tasks. The company offers cloud-hosted and on-premise deployments, role-based access controls, audit trails, and integrations with common business applications including Salesforce, ServiceNow, Jira, and Slack.
