This job has expired

This position was posted on November 19, 2025 and is likely no longer accepting applications. We've kept it here for historical reference. Check out the similar jobs below!

Adaptive ML SAS logo

Deployment DevOps Engineer

Job Overview

Location

Toronto

Job Type

Full-time

Category

Software Engineering

Date Posted

November 19, 2025

Full Job Description

đź“‹ Description

  • • Architect and own the end-to-end deployment lifecycle of Adaptive Engine, our flagship reinforcement-learning platform, ensuring it can be rolled out on-prem, in any major cloud (AWS, Azure, GCP), or as a managed SaaS with one-click simplicity.
  • • Design and maintain resilient, version-controlled Kubernetes workflows (Helm charts, ArgoCD pipelines, GitOps) that scale from single-GPU proof-of-concepts to thousand-GPU production clusters handling trillions of user-interaction records.
  • • Act as the first line of defense for customer escalations across North-American time zones: join live troubleshooting calls with Fortune-500 clients, reproduce issues in staging, coordinate hot-fixes, and feed learnings back into the product roadmap.
  • • Build hardened, auditable deployment blueprints that satisfy enterprise security and compliance mandates—think OIDC/SSO integration, network micro-segmentation, WAF rules, secrets rotation, and automated CIS-hardened images.
  • • Partner with Sales and Customer Success to deliver pre-sales demos, POC environments, and post-sales production cut-overs; you’ll often be the technical face of Adaptive ML during critical onboarding milestones.
  • • Establish 24Ă—7 on-call rotation, incident-response runbooks, and observability dashboards (Prometheus, Grafana, Loki) that keep MTTR under 15 minutes for P1 issues and provide clear RCA reports to executives.
  • • Optimize data pipelines for multi-terabyte Postgres clusters and object storage, ensuring sub-second query latencies for real-time personalization while keeping storage costs predictable.
  • • Champion Infrastructure-as-Code best practices (Terraform, Pulumi) so every environment—from dev to prod—can be recreated identically in minutes, not days.
  • • Contribute code (Rust preferred) to internal tooling that automates certificate issuance, GPU driver updates, and chaos-engineering experiments that validate fault tolerance.
  • • Influence the product roadmap by translating field feedback into concrete DevOps features: e.g., self-healing clusters, zero-downtime upgrades, or cost-aware auto-scaling policies.
  • • Foster a culture of blameless post-mortems, continuous learning, and documentation that allows new team members to deploy safely on day one.
  • • Mentor junior engineers and create reusable modules that accelerate future customer deployments, turning one-off fixes into scalable product capabilities.

Skills & Technologies

Rust
PostgreSQL
Kubernetes
Linux
DevOps
Remote

Ready to Apply?

You will be redirected to an external site to apply.

Adaptive ML SAS logo
Adaptive ML SAS
Visit Website

About Adaptive ML SAS

Paris-based startup developing a platform that lets enterprises fine-tune and deploy large language models on their own data. The system combines reinforcement learning from human feedback, retrieval-augmented generation and automated evaluation to create specialized, privacy-preserving models that run efficiently on private clouds or on-premise hardware, targeting sectors such as finance, healthcare and legal services.

Get more remote jobs like this

Subscribe to the weekly newsletter for similar remote roles and curated hiring updates.

Newsletter

Weekly remote jobs and featured talent.

No spam. Only curated remote roles and product updates. You can unsubscribe anytime.

Similar Opportunities

New York, NY
Full-time
Expires Jun 13, 2026
JavaScript
TypeScript
Go
+4 more

15 days ago

Apply
❌ EXPIRED
Remote
Full-time
Expired Apr 13, 2026
Remote

3 months ago

Apply
❌ EXPIRED
Aquia Inc. logo

Aquia Inc.

Remote
Full-time
Expired Nov 24, 2025
Python
JavaScript
GitHub
+3 more

7 months ago

Apply
Singapore
Full-time
Expires Jun 2, 2026
Remote

26 days ago

Apply