Engineering Manager, Cloud Platform

Job Overview

Location

San Francisco

Job Type

Full-time

Category

Engineering Manager

Date Posted

May 19, 2026

Full Job Description

📋 Description

  • Manage a team of cloud platform engineers responsible for building scalable, reliable, and efficient infrastructure to support AI model deployment and inference at production scale.
  • Recruit, hire, and grow high-performing cloud platform engineers through direct oversight, regular 1:1s, career development planning, and performance feedback.
  • Set clear performance expectations and foster a culture of ownership, accountability, and operational excellence across the team.
  • Lead day-to-day technical decisions through design reviews, code reviews, and architectural discussions, ensuring alignment with the infrastructure roadmap and business priorities.
  • Translate the infrastructure roadmap into clear team priorities and milestones, then hold the team accountable for execution and delivery.
  • Establish and enforce standards for reliability, performance, and operational excellence, ensuring teams own projects end-to-end from specification to production.
  • Exercise and encourage sound judgment on tooling and architectural tradeoffs, prioritizing simplicity and avoiding unnecessary complexity.
  • Partner with product and engineering leadership to align infrastructure initiatives with company-wide goals, clearly communicating team capacity, progress, and technical constraints.
  • Serve as the primary escalation point for major production incidents, driving rapid resolution, facilitating post-mortems, and ensuring lessons are integrated into infrastructure improvements.
  • Maintain close alignment with users of Baseten’s platform, including AI companies like Cursor, Notion, and Abridge, to understand operational challenges and translate user feedback into infrastructure enhancements.
  • Promote a culture of continuous improvement by encouraging learning from incidents, adopting best practices, and iterating on processes and tools.
  • Engage credibly in technical discussions and code reviews, leveraging hands-on experience with Kubernetes, infrastructure-as-code, and CI/CD systems to guide team decisions.
  • Ensure the team is connected to the mission of enabling AI companies to operationalize machine learning models efficiently and reliably.
  • Support the development of on-call and incident response protocols, ensuring sustainable engineering practices and minimizing burnout.
  • Represent the Cloud Platform team’s technical needs and progress to executive leadership with clarity and authority.
  • Drive adoption of observability, monitoring, and automation tools to reduce toil and increase system resilience.
  • Coach engineers to develop their own technical judgment and ownership, empowering them to lead projects independently while maintaining team cohesion.
  • Balance short-term operational needs with long-term infrastructure strategy, ensuring scalable systems evolve in line with the company’s rapid growth.
  • Maintain strong written and verbal communication skills to influence cross-functional teams without direct authority and ensure alignment across engineering and product organizations.
  • Stay connected to industry trends in cloud infrastructure and AI deployment to inform team direction and technology choices.
  • Ensure infrastructure work directly supports the needs of ML startups using Baseten’s platform to bring cutting-edge models into production.

🎯 Requirements

  • Proven experience directly managing engineers in a cloud infrastructure, platform engineering, or SRE context (not managing managers).
  • Extensive hands-on experience with Kubernetes, with the ability to engage credibly in architectural discussions and code reviews.
  • Strong background building and maintaining scalable, production infrastructure.
  • Familiarity with infrastructure-as-code tools (e.g., Terraform, CloudFormation, Pulumi) and CI/CD tooling (e.g., GitHub Actions, GitLab CI, CircleCI, Jenkins).
  • Demonstrated ability to drive projects end-to-end, from specification to execution, and coach your team to do the same.
  • Strong written and verbal communication skills; able to influence across teams without direct authority.
  • Track record of recruiting and growing engineering talent.

🏖️ Benefits

  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents.
  • Flexible PTO policy, including a company-wide Winter Break (offices closed from Christmas Eve to New Year's Day).
  • Paid parental leave.
  • Fertility and family-building stipend through Carrot.
  • Company-facilitated 401(k).
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Skills & Technologies

Kubernetes
Terraform
Jenkins
GitLab
GitHub
Onsite

About BaseTen Inc.

BaseTen provides a serverless, GPU-accelerated platform that lets machine-learning teams deploy, scale and monitor custom models behind autoscaling inference endpoints. The service abstracts infrastructure management, supports PyTorch, TensorFlow and Hugging Face artifacts, and offers built-in observability, A/B testing and fine-tuning. Customers integrate via REST or GraphQL APIs and pay only for compute used. Founded in 2019 and headquartered in San Francisco, BaseTen targets data scientists and product teams seeking production-grade ML serving without Kubernetes complexity.
