
Job Overview
Location
United States
Job Type
Full-time
Category
DevOps & SysAdmin
Date Posted
March 24, 2026
Full Job Description
📋 Description
- • This Senior DevOps / SRE Engineer role is mission-critical for a confidential client at the forefront of decentralized finance and AI, where infrastructure reliability directly safeguards real-time financial positions managed by autonomous AI trading agents on the Hyperliquid network. The engineer will ensure zero-downtime operations for systems where failure means unprotected capital, not just a service error.
- • Day-to-day responsibilities include:
  - • building and maintaining infrastructure for dozens of concurrent AI agents, managing complex cron schedules, state files, and trailing stop processes;
  - • deploying and orchestrating agent environments with workspace persistence and isolated session management;
  - • designing CI/CD pipelines to ship trading skills and plugins without interrupting live trading;
  - • executing zero-downtime deployment strategies (blue/green, canary) to protect active financial positions;
  - • establishing comprehensive observability via metrics, logs, and traces to detect failures before financial loss;
  - • operating and scaling Kubernetes (EKS), Redis, Postgres, ClickHouse, and Kafka;
  - • maintaining blockchain node infrastructure and stable connectivity to exchange APIs and on-chain systems;
  - • leading incident response, on-call practices, debugging, mitigation, and post-mortems to improve long-term platform reliability.
- • The team operates at the cutting edge of AI and DeFi, building infrastructure for autonomous agents that manage real-time financial workloads in a high-stakes, innovation-driven environment. The company values technical ownership, engineering excellence, and resilience, fostering a culture where deep expertise in systems reliability directly enables breakthroughs in autonomous trading technology.
- • In this role, the engineer will deepen expertise in zero-downtime systems for financial AI, gain hands-on experience with blockchain-integrated infrastructure, and become a leader in SRE practices for high-frequency, latency-sensitive environments. They will have the opportunity to shape the reliability foundation of a novel category of software—autonomous AI agents—while working with advanced tools like Terraform, Helm, Prometheus, Grafana, and MCP in a real-world, high-impact setting.
🎯 Requirements
- • Extensive experience in DevOps, SRE, or Infrastructure Engineering, preferably in a startup environment where systems were built from the ground up.
- • Proven expertise in deploying, scaling, and debugging production workloads on AWS EKS, with strong hands-on experience in Docker, Helm, and containerization best practices.
- • Proficiency with Infrastructure as Code tools including Terraform and Ansible, along with experience operating production-grade data and messaging systems such as Redis, Postgres/RDS, ClickHouse, and Kafka.
- • Strong background in observability tooling (Prometheus, Grafana, Datadog, Loki, OpenTelemetry) and ability to debug across Python, Node.js, and Go in distributed systems.
- • Understanding of real-time systems with financial consequences, familiarity with blockchain node infrastructure, exchange APIs, and on-chain monitoring, plus experience managing secrets, access controls, and defining SLOs in secure, high-availability environments.
🏖️ Benefits
- • Opportunity to build infrastructure for a pioneering category of software: Autonomous AI Agents operating at the intersection of DeFi and AI.
- • High-autonomy work environment emphasizing engineering excellence, technical ownership, and innovation in resilient systems.
- • Competitive compensation package ranging from $120K to $150K, commensurate with senior-level experience and expertise.
- • Remote-first or flexible working arrangements, allowing collaboration across US-based teams aligned with GMT timezones.
About MLabs
MLabs is a technology company specializing in the development and implementation of advanced laboratory automation solutions. They focus on creating intelligent systems that streamline complex laboratory workflows, enhance data accuracy, and improve overall efficiency for research and development, quality control, and clinical diagnostics. Their offerings often include robotics, AI-driven software, and integrated hardware designed to automate tasks such as sample handling, analysis, and reporting. MLabs serves a diverse range of industries including pharmaceuticals, biotechnology, and healthcare, aiming to accelerate scientific discovery and improve patient outcomes through cutting-edge automation.
