Solution Architect (AI/LLM Inference)

Job Overview

Location: San Francisco
Job Type: Full-time
Category: Software Engineering
Date Posted: May 12, 2026

Full Job Description

📋 Description

  • As a Solution Architect (AI/LLM Inference) at Baseten, you will partner closely with Sales and customers to translate business needs into technical solutions, run technical discovery, and guide repeatable deployments and proofs of value for customers.
  • You will lead demos and technical scoping, own benchmarking and repeatable deployments across modalities like LLMs, embeddings, image/video generation, and VoiceAI, and drive POC execution by scoping projects, aligning stakeholders, and acting as a technical project manager.
  • Baseten powers mission-critical inference for leading AI companies such as Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer, enabling them to bring cutting-edge models into production through applied AI research, flexible infrastructure, and seamless developer tooling.
  • You will become a power user of runtimes like vLLM, SGLang, and TRT-LLM, advise on infrastructure tradeoffs (e.g., H100s vs B200s), and help build repeatable "playbook" style deployments while gaining deep exposure to how modern companies adopt AI at scale.

🎯 Requirements

  • AI/ML background and the ability to credibly discuss AI/ML topics with technical stakeholders
  • Strong customer-facing communication skills, including the ability to run structured discovery and clarify ambiguous requirements
  • Technical depth to scope solutions, without needing to write production code
  • Ability to script and prototype as needed, including comfort "vibe coding" to move quickly in technical workflows

🏖️ Benefits

  • Competitive compensation, including meaningful equity
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Flexible PTO policy, including a company-wide Winter Break (offices closed from Christmas Eve to New Year's Day)
  • Paid parental leave
  • Fertility and family-building stipend through Carrot
  • Company-facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities

Skills & Technologies

Apache Spark
Senior
Onsite

About BaseTen Inc.

BaseTen provides a serverless, GPU-accelerated platform that lets machine-learning teams deploy, scale and monitor custom models behind autoscaling inference endpoints. The service abstracts infrastructure management, supports PyTorch, TensorFlow and Hugging Face artifacts, and offers built-in observability, A/B testing and fine-tuning. Customers integrate via REST or GraphQL APIs and pay only for compute used. Founded in 2019 and headquartered in San Francisco, BaseTen targets data scientists and product teams seeking production-grade ML serving without Kubernetes complexity.
