Senior Machine Learning Engineer, ML Training Platform

Job Overview

Location

Remote - United States

Job Type

Full-time

Category

Software Engineering

Date Posted

May 14, 2026

Full Job Description

📋 Description

  • As a Senior Machine Learning Engineer on Reddit's ML Training Platform team, you will architect, implement, and maintain foundational ML infrastructure that powers recommendations, content discovery, and user quantification, directly impacting Growth, Ads, Feeds, and Core ML teams.
  • You will lead the building, testing, and maintenance of ML training infrastructure, and design and optimize large-scale ML workflows. You will evolve the MLE experience through self-service GPU environments and on-demand training, write custom Kubernetes Controllers and Operators for Jupyter workspaces and ML jobs, and work with compute teams to ensure efficient GPU access. You will also improve developer experience by reducing friction in the Idea-to-Prototype loop through user research and standardized environments.
  • The Machine Learning Platform team at Reddit owns the infrastructure powering recommendations, content discovery, and quantification, serving over 126 million daily active unique visitors across 100,000+ active communities, with a mission to bring community and belonging to everyone in the world.
  • You will deepen your expertise in Kubernetes operators, GPU orchestration, distributed training frameworks (Ray, Kubernetes), and cloud-based ML platforms (AWS, GCP, Vertex AI, SageMaker), while advocating for platform users and shaping the ML development lifecycle through scalable, reliable, and performant systems.

🎯 Requirements

  • 5+ years of software engineering experience with a focus on Platform Engineering, ML Infrastructure, or Backend Systems
  • Deep Kubernetes expertise that goes beyond basic pod deployment, including CRDs, Controllers, and the Operator pattern
  • Proficiency in Python for the ML ecosystem and in Go for Kubernetes controllers and infrastructure tooling
  • Hands-on experience with CUDA environments and with GPU virtualization/containerization within Kubernetes
  • Familiarity with managed ML offerings (Vertex AI, SageMaker) and building custom ML components in AWS and/or GCP
  • Experience with distributed training frameworks including Ray and Kubernetes

🏖️ Benefits

  • Comprehensive Healthcare Benefits and Income Replacement Programs
  • 401k Match
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Flexible Vacation & Reddit Global Days off
  • Generous paid Parental Leave
  • Paid Volunteer time off

Skills & Technologies

Python
AWS
GCP
Docker
Kubernetes
Senior
Remote

About Reddit Inc.

Reddit is a social media platform where users submit, vote, and comment on content organized into topic-based communities called subreddits. Founded in 2005, it offers forums for news, hobbies, advice, and discussion, enabling real-time conversations and content ranking through upvotes and downvotes. With millions of daily active users globally, Reddit hosts diverse communities moderated by volunteers, supports multimedia posts, and provides advertising and premium membership options. The platform emphasizes user anonymity, community governance, and crowdsourced information, making it a hub for niche interests, viral content, and public discourse. Reddit went public in 2024 and is headquartered in San Francisco.