Castai Group Inc.

Senior ML Engineer - Kimchi (LLM Inference Optimization)

Job Overview

Location: France

Job Type: Full-time

Category: Software Engineering

Date Posted: May 25, 2026

Full Job Description

📋 Description

  • Senior ML Engineer role focused on optimizing LLM inference performance for Cast AI's Kimchi platform. Improvements to throughput, latency, and KV cache utilization directly affect customer experience and company margins.
  • Day-to-day responsibilities include tuning kernels, optimizing quantization schemes, improving scheduler efficiency, profiling bottlenecks in time to first token (TTFT) and time per output token (TPOT), improving KV cache utilization through prefix caching and eviction policies, reducing cold starts, scaling distributed inference across nodes, and setting technical direction through experimentation and reproducible benchmarks.
  • Cast AI is a unicorn-valued automation platform headquartered in Miami, with a global team across 34 countries. It specializes in autonomous optimization of cloud-native and AI infrastructure at scale and serves over 2,100 enterprise clients, including HuggingFace, BMW, and Cisco.
  • The role offers high autonomy and impact: the engineer leads the technical strategy for inference optimization, drives measurable performance gains, and contributes to both customer value and company profitability through data-driven infrastructure work.

🎯 Requirements

  • 5+ years building real ML systems, with depth in inference or training infrastructure
  • Strong Python for production services, not just scripts
  • Hands-on experience with at least one of vLLM, SGLang, or TensorRT-LLM, and an understanding of inference engine performance on GPUs
  • Fluency with quantization tradeoffs, including measured quality regressions
  • Comfort with distributed systems: collective communication, sharding, and multi-GPU/node failure modes
  • Bias toward measurement: instrumenting before optimizing, and distinguishing real wins from benchmark artifacts
  • Self-direction and comfort with a wide technical mandate

🏖️ Benefits

  • Competitive salary based on experience
  • Flexible, remote-first global work environment
  • Equity options
  • Learning budget for professional development, including access to international conferences and courses
  • 10% time for personal projects or self-improvement
  • Equipment budget and extra days off for work-life balance

Skills & Technologies

Python
Node.js
PostgreSQL
AWS
Azure
Data Science
Senior
Remote

About Castai Group Inc.

Castai Group Inc. (Cast AI) is an automation platform headquartered in Miami, with a global team across 34 countries. The company specializes in autonomous optimization of cloud-native and AI infrastructure at scale and serves over 2,100 enterprise clients, including HuggingFace, BMW, and Cisco.
