
Job Overview
Location: San Francisco
Job Type: Full-time
Category: Software Engineering
Date Posted: May 6, 2026
Full Job Description
📋 Description
- As a Distributed LLM Inference Engineer at Anyscale, you will help build systems and optimizations that push the performance of large-scale AI inference. The goal is to let developers scale machine learning applications from laptops to clusters without needing distributed systems expertise.
- You will iterate quickly with product teams to ship end-to-end batch and online inference solutions for open-source Ray users and Anyscale customers. The work includes integrating Ray Data with LLM engines for low-cost, high-scale ML inference, collaborating with the open-source community on tools like vLLM and TensorRT-LLM, and implementing state-of-the-art techniques from research and open source.
- Anyscale is commercializing Ray, a widely adopted open-source framework used by companies like OpenAI, Uber, Spotify, Instacart, and Cruise to power scalable AI applications. The company is backed by Andreessen Horowitz, NEA, and Addition, with over $250M in funding.
- In this role, you will deepen your expertise in distributed systems, ML inference at scale, GPU optimization, and open-source collaboration, while contributing to AI infrastructure that enables real-world AI deployment.
🎯 Requirements
- Familiarity with running ML inference at large scale with high throughput and low latency
- Familiarity with deep learning and deep learning frameworks (e.g., PyTorch)
- Solid understanding of distributed systems and ML inference challenges
🏖️ Benefits
- Stock Options
- Healthcare plans with premiums covered by Anyscale at 99% for employees and dependents
- 401k Retirement Plan
- Education & Wellbeing Stipend
- Paid Parental Leave
- Fertility Benefits
- Paid Time Off
- Commute reimbursement
- 100% of in-office meals covered
About Anyscale Inc.
Anyscale Inc. builds the Ray open-source distributed computing framework and offers a managed platform that lets data scientists and engineers scale machine-learning workloads from laptop to cloud without rewriting code. The company provides serverless infrastructure, observability, and cluster automation so teams can train, tune, and serve models faster. Founded in 2019 by the creators of Ray at UC Berkeley, Anyscale serves Fortune 500 enterprises and AI startups, enabling them to reduce cost and complexity while accelerating production deployment of large-scale AI applications.