CodeRabbit, Inc.

DevOps Engineer

Job Overview

Location: San Francisco
Job Type: Full-time
Category: DevOps
Date Posted: January 8, 2026

Full Job Description

📋 Description

  • Own the entire lifecycle of our AI-enabled developer platform’s infrastructure, from design and provisioning to day-two operations and disaster recovery, ensuring 99.9%+ uptime for thousands of daily code reviews.
  • Architect and maintain resilient, multi-region Kubernetes clusters on AWS and GCP using Infrastructure-as-Code (Terraform/Pulumi) with automated drift detection and policy guardrails; every pull request triggers a preview environment so the team can test changes in minutes, not hours.
  • Build and continuously improve CI/CD pipelines (GitHub Actions → Argo CD) that deploy microservices, ML models, and GPU workloads in under five minutes while enforcing security scans, dependency checks, and performance regression tests.
  • Instrument end-to-end observability: configure Prometheus, Grafana, Loki, and Datadog dashboards that surface golden signals (latency, traffic, errors, saturation) for both traditional services and GPU-accelerated inference pods, cutting mean-time-to-detect (MTTD) to under two minutes.
  • Design cost-aware autoscaling policies (Karpenter, HPA, VPA) that balance GPU availability for large-language-model inference against cloud spend; deliver weekly cost reports and right-sizing recommendations to leadership.
  • Harden security at every layer: enforce OIDC-based auth, mTLS between services, secrets rotation via Vault, container image signing, and CIS-benchmarked node hardening; run quarterly chaos-engineering drills to validate blast-radius containment.
  • Partner with applied-AI engineers to optimize model-serving infrastructure (Triton, vLLM, TensorRT) for low-latency code-review feedback, including canary releases and A/B traffic splitting to measure model accuracy vs. performance.
  • Create self-service tooling that lets backend and ML engineers spin up ephemeral dev environments, run integration tests, and ship features without ever opening a ticket; document everything in runbooks and code so tribal knowledge disappears.
  • Establish SLOs/SLIs with error budgets and blameless post-mortems; lead incident response, root-cause analysis, and long-term corrective actions that prevent recurrence.
  • Contribute to open-source DevOps projects and internal platform libraries, turning one-off scripts into reusable modules the broader community can adopt.
  • Mentor junior engineers through pair-programming and design reviews, fostering a culture where infrastructure is treated as a product and reliability is everyone’s job.
  • Stay ahead of the curve: evaluate new CNCF projects, GPU orchestrators, and security frameworks, then run proof-of-concepts that keep CodeRabbit on the bleeding edge of developer productivity.

🎯 Requirements

  • 3–5 years of hands-on DevOps, SRE, or platform engineering experience in a high-growth startup or scale-up environment.
  • Expert-level proficiency with Kubernetes, Docker, and cloud-native CI/CD stacks (GitHub Actions, Argo CD, or similar).
  • Deep expertise in at least one major cloud provider (AWS or GCP), including networking, IAM, and cost optimization; Terraform or Pulumi fluency is mandatory.
  • Proven track record designing observability solutions using Prometheus, Grafana, ELK/OpenSearch, or Datadog in large-scale distributed systems.
  • Solid grasp of cloud security best practices: secrets management, container hardening, network policies, and compliance frameworks (SOC 2, ISO 27001).

🏖️ Benefits

  • Competitive base salary + meaningful equity in a fast-growing AI startup redefining software development.
  • Hybrid work culture: collaborate in person in San Francisco 2–3 days per week, with flexibility for remote deep-work days and a top-tier home-office stipend.
  • Annual learning & development budget ($3,000+) plus paid attendance at leading DevOps/KubeCon conferences and certification programs.
  • Premium health, dental, and vision coverage for you and dependents, plus a monthly wellness stipend and mental-health support.

Skills & Technologies

AWS
GCP
Docker
Kubernetes
Terraform
DevOps
Remote

About CodeRabbit, Inc.

CodeRabbit provides an AI-powered code review platform that integrates with GitHub and GitLab. It automatically analyzes pull requests, identifies bugs, enforces style rules, and suggests improvements in real time. The service supports multiple languages and frameworks, offers customizable policies, and maintains a privacy-focused architecture to keep proprietary code secure.
