
Job Overview
Location: SF Hybrid
Job Type: Full-time
Category: Machine Learning Engineer
Date Posted: May 6, 2026
Full Job Description
📋 Description
- As an AI Engineer - Model Performance at Fathom Technologies Inc., you will own the speed, cost, and reliability of the company's model inference stack and build fine-tuning infrastructure to accelerate the AI team's workflow, directly impacting the performance of an AI meeting assistant used by millions.
- Day to day, you will benchmark quantization strategies across GPU families, evaluate serving frameworks like vLLM and SGLang with speculative decoding, build repeatable fine-tuning pipelines from JSONL datasets to deployable models, optimize GPU spend by selecting appropriate hardware for latency-sensitive vs batch workloads, and debug production inference issues such as quality regressions from framework updates or audio pipeline bugs.
- Fathom Technologies Inc. is a small, focused team building an AI-powered meeting assistant that eliminates note-taking overhead by capturing, summarizing, and organizing call moments, with proven traction including #1 rankings on HubSpot, G2, and Product Hunt, and rapid growth in usage and revenue.
- In this role, you will gain deep expertise in LLM serving optimization, fine-tuning infrastructure, and GPU performance analysis while enabling faster experimentation and deployment across the AI team, directly contributing to product reliability and cost efficiency at scale.
🎯 Requirements
- Deep experience with LLM serving frameworks (vLLM, SGLang, TensorRT-LLM, or similar), including tuning attention backends, scheduling strategies, CUDA graph warmup, and prefix caching
- Hands-on quantization experience with understanding of weight vs activation quantization, per-channel vs per-tensor scaling, and trade-offs of dynamic quantization
- Production fine-tuning experience with LoRA/QLoRA SFT and familiarity with frameworks like ms-swift, Axolotl, or torchtune, including data formatting, learning rate schedules, and diagnosing training failures
- Strong Python skills for writing serving infrastructure, benchmarking harnesses, and training pipelines
- Comfort with GPU profiling and performance analysis to identify bottlenecks in compute, memory bandwidth, or scheduling
🏖️ Benefits
- Opportunity to shape the foundational software services of a growing company
- Dynamic and collaborative engineering team
- Competitive compensation and benefits
- Supportive environment that encourages innovation and personal growth
- Fully remote work with async-first communication (Slack, Notion, Loom)
About Fathom Technologies Inc.
Fathom Technologies provides AI meeting recording and transcription software that automatically captures, summarizes, and indexes Zoom, Google Meet, and Microsoft Teams calls. The platform generates searchable transcripts, highlights key moments, and delivers concise summaries, enabling teams to revisit discussions without re-watching entire recordings. Features include secure cloud storage, CRM integrations, real-time note-taking, and role-based sharing controls. Targeting sales, customer success, and remote teams, Fathom offers free and paid tiers, emphasizing privacy compliance and rapid deployment.