Research Engineer - Environments, Data and Post-Training

Mercor Inc.

Job Overview

Location

San Francisco

Job Type

Full-time

Full Job Description

📋 Description

• Mercor is at the forefront of defining the future of work by providing essential human intelligence to advance Artificial Intelligence development. We partner with leading AI labs and enterprises, leveraging a vast network of over 30,000 experts who collectively earn more than $2 million daily. Our mission is to create a new category of work where human expertise directly powers AI advancement, requiring a dedicated, ambitious, and fast-paced team. You will collaborate with researchers, operators, and AI companies shaping the systems that are redefining society. As a profitable Series C company valued at $10 billion, we foster an in-person work environment five days a week at our San Francisco headquarters, driving innovation and impact.
• As a Research Engineer at Mercor, you will operate at the critical intersection of engineering and applied AI research, making significant contributions to post-training, Reinforcement Learning from Human Feedback (RLHF), synthetic data generation, and large-scale evaluation workflows. Your work will directly influence the training of frontier language models, enabling them to master complex tasks such as tool use, agentic behavior, and real-world reasoning within production environments. You will be instrumental in shaping reward mechanisms, conducting rigorous post-training experiments, and developing scalable systems designed to enhance model performance. Furthermore, you will play a key role in designing and evaluating datasets, architecting scalable data augmentation pipelines, and creating sophisticated rubrics and evaluators that push the boundaries of what Large Language Models (LLMs) can learn and achieve.
• Day to day, you will work on post-training and RLHF pipelines, analyzing how different datasets, reward structures, and training strategies influence model performance. This involves designing and running reward-shaping experiments and implementing algorithmic improvements, such as Proximal Policy Optimization (PPO) variants like GRPO, or preference-based methods like Direct Preference Optimization (DPO), to improve LLM capabilities in tool use, agentic behavior, and real-world reasoning. A significant part of the role is quantifying data usability, assessing data quality, and measuring the performance uplift achieved on key benchmarks.
• You will be responsible for building and maintaining robust data generation and augmentation pipelines, ensuring they can scale effectively to meet the demands of continuous training. This includes developing and refining rubrics, evaluators, and scoring frameworks that serve as guiding principles for training and evaluation decisions, ensuring consistency and rigor. Additionally, you will build and operate sophisticated LLM evaluation systems, benchmarks, and metrics at a large scale, providing comprehensive assessments of model capabilities. Collaboration is key; you will work closely with AI researchers, applied AI teams, and the experts who generate the training data, fostering a synergistic environment. You will operate within a fast-paced, experimental research setting characterized by rapid iteration cycles and a high degree of ownership, contributing to groundbreaking AI advancements.
• The team you will join is at the heart of Mercor's mission, working on cutting-edge AI development. You will be part of a dynamic group of individuals passionate about pushing the boundaries of what AI can do. This role offers an unparalleled opportunity to gain deep, hands-on experience with the latest advancements in LLM training, evaluation, and deployment. You will develop a sophisticated understanding of model behavior, experimental design, and data quality assessment, skills that are highly sought after in the rapidly evolving AI landscape.
• In this role, you can expect to significantly contribute to the development of next-generation AI models, directly impacting their capabilities and real-world applicability. You will gain invaluable experience in a high-growth, well-funded startup environment, working on challenging problems with a direct line of sight to impact. The opportunity to publish research at top-tier conferences, if applicable, and to build a portfolio of impactful work in the AI domain is substantial. This position is ideal for someone who thrives in a challenging, experimental, and collaborative setting, eager to make a tangible difference in the field of artificial intelligence.

🎯 Requirements

• Strong applied research background, with a focus on post-training and/or model evaluation.
• Strong coding proficiency and hands-on experience working with machine learning models.
• Strong understanding of data structures, algorithms, backend systems, and core engineering fundamentals.
• Familiarity with APIs, SQL/NoSQL databases, and cloud platforms.
• Ability to reason deeply about model behavior, experimental results, and data quality.
• Excitement to work in person in San Francisco, five days a week (with optional remote Saturdays), and thrive in a high-intensity, high-ownership environment.

🏖️ Benefits

• Generous equity grant vested over 4 years
• A $10K housing bonus (if you live within 0.5 miles of our office)
• A $1.5K monthly stipend for meals
• Free Equinox membership
• Health insurance

About Mercor Inc.

Mercor Inc. operates an AI-driven hiring marketplace that matches companies with global engineering, finance, and operations talent. Its platform uses machine-learning models to screen résumés, conduct automated interviews, and rank candidates, enabling employers to hire pre-vetted contractors or full-time staff within days. Founded in 2023 by former MIT researchers, the San Francisco-based company serves startups and Fortune 500 clients, handling sourcing, payroll, and compliance across 25 countries. Revenue comes from placement fees and ongoing markups on contractor rates, while talent can access remote roles in software development, data science, sales, and product management.
