Member of Technical Staff - Alignment Lead

Job Overview

Location: Remote
Job Type: Full-time
Category: Machine Learning Engineer
Date Posted: February 17, 2026

Full Job Description

📋 Description

  • As a Member of Technical Staff - Alignment Lead at ReflectionAI Inc., you will be at the forefront of developing open superintelligence, driving the critical alignment stack to ensure our advanced AI models are not only powerful but also safe, factual, and aligned with human intent. This is a unique opportunity to shape the future of AI by contributing to a mission-driven company that believes in making cutting-edge intelligence accessible to everyone.
  • You will own and lead the entire alignment process, encompassing instruction tuning, Reinforcement Learning from Human Feedback (RLHF), and Reinforcement Learning from AI Feedback (RLAIF). Your expertise will be crucial in guiding our models toward high factual accuracy and robust instruction-following capabilities, ensuring they perform as intended across a wide range of tasks.
  • A core aspect of this role involves pioneering research into next-generation reward models and optimization objectives. You will design and implement novel approaches that significantly enhance our models' performance based on human preferences, pushing the boundaries of what's possible in AI alignment.
  • Curating and generating high-quality training data is paramount. You will be responsible for sourcing and developing datasets that address complex reasoning challenges and behavioral gaps in our models. This includes designing and implementing synthetic data pipelines to fill these critical areas, ensuring comprehensive model development.
  • You will optimize large-scale Reinforcement Learning (RL) pipelines, focusing on stability and efficiency. This optimization is key to enabling rapid iteration cycles, allowing us to quickly test and implement model improvements and innovations.
  • Close collaboration is essential. You will work hand-in-hand with our pre-training and evaluation teams to establish tight feedback loops. This integrated approach ensures that insights from alignment research are effectively translated into generalizable gains across our foundational models.
  • This role demands a deep technical understanding of alignment methodologies such as Proximal Policy Optimization (PPO), Direct Preference Optimization (DPO), and rejection sampling. You will apply this knowledge to large-scale models, navigating the complexities of distributed systems and large codebases.
  • You will be instrumental in improving model behavior through innovative data strategies, sophisticated reward modeling, and advanced RL techniques. Your contributions will directly impact the quality and reliability of our AI systems.
  • We are looking for individuals who can demonstrate a track record of owning ambitious research or engineering agendas that have resulted in measurable model improvements. Your ability to drive projects from conception to successful implementation will be highly valued.
  • The ideal candidate thrives in a fast-paced, high-agency startup environment where a bias toward action is essential. You should be passionate about advancing the frontier of artificial intelligence and contributing to a company that is building from the ground up.
  • Joining ReflectionAI means being part of a small, talent-dense team dedicated to building open foundational models. You will have a significant influence on our company's direction and the evolution of open AI.
  • We are committed to enabling you to do the most impactful work of your career, supported by comprehensive benefits and a culture that values your well-being and that of your loved ones.

🎯 Requirements

  • Graduate degree (MS or PhD) in Computer Science, Machine Learning, or a related quantitative discipline.
  • Deep technical command of alignment methodologies (e.g., PPO, DPO, rejection sampling) and proven experience scaling these techniques to large language models.
  • Strong software engineering skills, with demonstrated experience diving into complex ML codebases and distributed systems.
  • Prior experience in improving model behavior through data curation, reward modeling, or reinforcement learning techniques.
  • Evidence of owning ambitious research or engineering projects that resulted in measurable model improvements.

🏖️ Benefits

  • Top-tier compensation, including salary and equity designed to attract and retain global talent.
  • Comprehensive health and wellness benefits: medical, dental, vision, life, and disability insurance.
  • Fully paid parental leave for all new parents, including support for adoptive and surrogate journeys, and financial assistance for family planning.
  • Generous paid time off (PTO) and relocation support.
  • Opportunities for team connection through daily company-provided lunches and dinners, regular off-sites, and team celebrations.

Skills & Technologies

Senior
Remote
Degree Required

About ReflectionAI Inc.

ReflectionAI builds autonomous AI agents for enterprise process automation. The platform lets organizations create, deploy, and manage software agents that observe workflows, make decisions, and act across internal systems. Using reinforcement learning and large language models, agents learn from human guidance and adapt to changing environments. Customers use the technology for customer support triage, IT operations, compliance monitoring, and sales process automation, reducing repetitive manual tasks. The company offers cloud-hosted and on-premise deployments, role-based access controls, audit trails, and integrations with common business applications including Salesforce, ServiceNow, Jira, and Slack.
