
Job Overview
Location: San Francisco
Job Type: Full-time
Category: Software Engineering
Date Posted: May 12, 2026
Full Job Description
📋 Description
- As a Solution Architect (AI/LLM Inference) at Baseten, you will partner closely with Sales and customers to translate business needs into technical solutions, run technical discovery, and guide repeatable deployments and proofs of value for customers.
- You will lead demos and technical scoping, own benchmarking and repeatable deployments across modalities like LLMs, embeddings, image/video generation, and VoiceAI, and drive POC execution by scoping projects, aligning stakeholders, and acting as a technical project manager.
- Baseten powers mission-critical inference for leading AI companies such as Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer, enabling them to bring cutting-edge models into production through applied AI research, flexible infrastructure, and seamless developer tooling.
- You will become a power user of runtimes like vLLM, SGLang, and TRT-LLM, advise on infrastructure tradeoffs (e.g., H100s vs B200s), and help build repeatable "playbook" style deployments while gaining deep exposure to how modern companies adopt AI at scale.
🎯 Requirements
- AI/ML background and the ability to credibly discuss AI/ML topics with technical stakeholders
- Strong customer-facing communication skills, including the ability to run structured discovery and clarify ambiguous requirements
- Technical depth to scope solutions, without needing to write production code
- Ability to script and prototype as needed, including comfort "vibe coding" to move quickly in technical workflows
🏖️ Benefits
- Competitive compensation, including meaningful equity
- 100% coverage of medical, dental, and vision insurance for employees and dependents
- Flexible PTO policy, including a company-wide Winter Break (offices closed from Christmas Eve to New Year's Day)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities
About Baseten Inc.
Baseten provides a serverless, GPU-accelerated platform that lets machine-learning teams deploy, scale, and monitor custom models behind autoscaling inference endpoints. The service abstracts infrastructure management, supports PyTorch, TensorFlow, and Hugging Face artifacts, and offers built-in observability, A/B testing, and fine-tuning. Customers integrate via REST or GraphQL APIs and pay only for compute used. Founded in 2019 and headquartered in San Francisco, Baseten targets data scientists and product teams seeking production-grade ML serving without Kubernetes complexity.