This job has expired

This position was posted on December 13, 2025 and is likely no longer accepting applications. It is kept here for historical reference.

Roboflow, Inc.

Machine Learning Engineer - Deployments Team

Job Overview

Location: Remote
Job Type: Full-time
Category: Software Engineering
Date Posted: December 13, 2025

Full Job Description

📋 Description

  • Own the end-to-end lifecycle of Roboflow’s model-deployment stack, shipping code that moves computer-vision models from training notebooks to production inference in milliseconds across cloud GPUs, ARM edge devices, and browser WASM runtimes.
  • Architect and harden the next generation of Roboflow Inference, our open-source runtime, adding support for new model families (YOLOv9, SAM-2, RT-DETR), quantization schemes (INT8, FP16, sparsity), and hardware accelerators (NVIDIA Jetson, Intel NPU, Apple Neural Engine).
  • Build deterministic, zero-downtime release pipelines that push updated containers to 10k+ customer endpoints nightly; automate canary analysis, blue-green rollbacks, and A/B latency experiments so every deploy is boring and safe.
  • Profile and shave milliseconds off cold-start latency and memory footprint; squeeze 30% more FPS from edge devices by rewriting hot paths in Rust or CUDA kernels while keeping the Python UX delightful.
  • Design multi-tenant autoscaling policies that spin up GPU nodes in under 60 seconds and spin them down just as fast, cutting cloud spend without ever dropping a customer’s traffic spike.
  • Partner with Product, Support, and Solutions Engineering to turn one-off customer hacks into reusable platform features (e.g., packaging a customer’s custom post-processing pipeline into a pluggable Inference plugin that 200 other teams can import with one line).
  • Contribute to our open-source repos (Inference, Autodistill, supervision) weekly: review PRs, triage issues, write docs, and record demos that help 1M+ developers succeed with computer vision.
  • Run weekly “office hours” for internal teams and external power users, distilling complex deployment gotchas into runbooks and sample repos that reduce time-to-value from days to minutes.
  • Instrument everything (Prometheus, OpenTelemetry, custom GPU metrics) so anomalies surface before customers notice; wake up rarely, but when you do, fix root causes, not symptoms.
  • Champion security best practices: sign images, scan for CVEs, rotate secrets, and ensure SOC 2 compliance without slowing delivery velocity.
  • Experiment fearlessly with bleeding-edge tech (WebGPU, TensorRT-LLM, LoRA adapters) in 20% time; spin the winners into default product features and sunset the rest cleanly.
  • Mentor junior engineers through pair programming and design reviews; level up the entire team’s ability to ship high-quality ML systems.
  • Within 30 days, lead a full release cycle, merge your first impactful PR, and identify the roadmap area you’ll own next.
  • Within 60 days, solve a gnarly customer performance issue and shepherd a cross-team initiative from prototype to early-adopter rollout.
  • Within 90 days, become the go-to expert on one slice of the deployment stack and kick off a mission-critical initiative that shapes Roboflow’s next chapter.

🎯 Requirements

  • 5+ years shipping production-grade ML systems, including containerized model serving at scale.
  • Deep, hands-on experience with PyTorch, TensorFlow, ONNX, TensorRT, or equivalent frameworks.
  • Proficiency in image/video processing libraries (OpenCV, Pillow, PyAV, DeepStream) and streaming protocols (RTSP, WebRTC).
  • Strong computer-science fundamentals: concurrency, distributed systems, and low-level performance tuning.
  • Demonstrated ability to design scalable, observable, and secure architectures in cloud-native environments.

🏖️ Benefits

  • $163k–$182.5k base salary, reviewed every six months to stay market-competitive.
  • $4k annual travel stipend: fly anywhere to cowork with teammates.
  • $350 monthly productivity stipend for home-office or co-working upgrades.
  • 100% health-insurance coverage for you, your partner, and your family.
  • Unlimited PTO with a 2-week minimum; 12 weeks fully paid parental leave.

Skills & Technologies

GitHub
TensorFlow
PyTorch
Remote

About Roboflow, Inc.

Roboflow provides a cloud-based platform for computer vision teams to manage datasets, annotate images and video, train custom models, and deploy them to edge devices and production APIs. The service automates data preprocessing, augmentation, version control, and performance monitoring across YOLO, TensorFlow, PyTorch, and other frameworks, enabling developers and enterprises to accelerate vision projects from prototype to scalable applications without building infrastructure from scratch.
