This job has expired

This position was posted on March 18, 2026 and is likely no longer accepting applications. We've kept it here for historical reference.

Post-Training Applied Researcher

BaseTen Inc.

Job Overview

Location

San Francisco

Job Type

Full-time

Full Job Description

📋 Description

• The Post-Training Applied Researcher role is central to BaseTen Inc.’s goal of powering mission-critical AI inference for leading AI-driven organizations. You will directly shape how open-source models are refined and optimized to outperform proprietary models on specialized, real-world tasks, so that research translates into product impact for customers like Cursor, Notion, and OpenEvidence.
• Your work will bridge AI research and production deployment: the post-training techniques you develop will ship to millions of end users.
• You will design and execute end-to-end post-training pipelines including SFT, GRPO, DPO, and RLVR, with deep involvement in reward function engineering and synthetic data generation to align model behavior with complex, domain-specific customer needs.
• You will build custom training environments and evaluation harnesses for multi-turn agent workflows involving tool use, sandboxed execution, and agentic reasoning. These will be tailored to high-stakes domains such as healthcare, legal, and code generation, so that models perform reliably in real production settings.
• You will collaborate closely with customers to interpret production usage patterns, convert them into training signals, and design reward loops that account for distribution shift, enabling continuous model improvement grounded in real-world data.
• You will run, monitor, and analyze large-scale training experiments, diagnosing subtle failure modes like reward hacking, importance sampling drift, and advantage estimation instabilities, using rigorous statistical and empirical methods to ensure stable and effective learning.
• You will contribute to scientific advancement by publishing findings at top-tier ML conferences (NeurIPS, ICML, ICLR) and enhancing Baseten’s open-source training libraries, establishing thought leadership in RL for LLMs and alignment.
• You will work across the full ML lifecycle, from dataset construction and curriculum design through training, evaluation, and deployment, gaining end-to-end experience in closing the training-inference loop with production feedback.
• You will join a rapidly growing, well-funded startup backed by top-tier investors (BOND, IVP, Spark Capital, Greylock, Conviction) and work alongside elite AI engineers and researchers committed to pushing the frontier of accessible, high-performance AI infrastructure.
• You will develop deep expertise in RL techniques for LLMs, including group advantage computation, clipped objectives, and KL penalty design, and build judgment about when to apply rigorous scientific methods and when to iterate rapidly.

🎯 Requirements

• Hands-on experience training LLMs with reinforcement learning, including a practical understanding of GRPO or PPO beyond recipe-level execution, specifically group advantage computation, clipped objectives, and KL penalty design
• Strong intuition for reward engineering: the ability to distinguish rewards that drive effective learning from those that lead to reward hacking or exploitation at scale
• Experience building multi-turn agent environments with tool use, extending beyond single-turn QA to include sandboxed execution, tool chaining, and agentic workflows
• Comfort working across the full ML pipeline: dataset construction, training, evaluation, and deployment, with a preference for candidates who have closed a training–inference loop using production data
• Experience with production ML systems, including monitoring, debugging, and iterating on models in live environments

🏖️ Benefits

• Competitive compensation package including meaningful equity stake in a fast-growing Series E startup
• 100% coverage of medical, dental, and vision insurance for employees and their dependents
• Generous PTO policy featuring a company-wide Winter Break (offices closed from Christmas Eve to New Year’s Day)
• Paid parental leave to support work-life balance during major life events
• Company-facilitated 401(k) retirement plan with administrative support
• Exposure to diverse ML startups through customer collaborations, offering learning, networking, and insight into real-world AI applications

Skills & Technologies

Apache Spark

Onsite

About BaseTen Inc.

BaseTen provides a serverless, GPU-accelerated platform that lets machine-learning teams deploy, scale and monitor custom models behind autoscaling inference endpoints. The service abstracts infrastructure management, supports PyTorch, TensorFlow and Hugging Face artifacts, and offers built-in observability, A/B testing and fine-tuning. Customers integrate via REST or GraphQL APIs and pay only for compute used. Founded in 2019 and headquartered in San Francisco, BaseTen targets data scientists and product teams seeking production-grade ML serving without Kubernetes complexity.
