Astronomer, Inc.

Customer Reliability Engineer - Infrastructure

Job Overview

Location

Remote (United States)

Job Type

Full-time

Category

Data Science

Date Posted

June 26, 2026

Full Job Description

📋 Description

  • Operate, monitor, and maintain the cloud infrastructure and Kubernetes clusters powering Astronomer’s managed Apache Airflow® platform to ensure high availability, predictability, and reliable customer operations.
  • Respond to customer-reported incidents and system alerts, leading triage, diagnosis, and resolution to prevent recurrence and improve system resilience.
  • Participate in a rotating on-call schedule that includes weekend coverage to ensure 24/7 support for mission-critical customer environments.
  • Troubleshoot complex customer environments across diverse cloud providers (AWS, GCP, Azure) and provide tailored guidance to help customers achieve production success.
  • Build, enhance, and maintain monitoring, alerting, and observability systems to proactively detect and mitigate infrastructure issues before they impact customers.
  • Develop and automate operational workflows to reduce manual toil, improve efficiency, and ensure consistent execution of daily infrastructure tasks.
  • Collaborate directly with customers to prioritize issues, meet SLAs, and deliver “white glove” support that enhances their experience with Astronomer’s products.
  • Provide actionable feedback to product development teams based on real-world customer pain points, usage patterns, and operational challenges.
  • Contribute to the architecture and design of the platform by identifying reliability gaps and proposing scalable infrastructure improvements.
  • Enhance and enrich customer-facing documentation to improve self-service capabilities and reduce support overhead.
  • Work within a fully distributed, remote team to support customers across multiple industries and time zones with varying technical requirements.
  • Engage with the latest cloud-native technologies and multi-cloud environments to ensure the platform remains at the forefront of DataOps innovation.
  • Own the end-to-end customer experience for infrastructure-related issues, ensuring timely resolution and continuous improvement in service quality.
  • Apply Python scripting to automate tasks, analyze logs, and build tools that improve operational efficiency and system reliability.
  • Maintain deep Linux system knowledge to diagnose and resolve low-level infrastructure issues across distributed systems.
  • Work with Infrastructure as Code (IaC) practices to manage and version-control cloud resources and Kubernetes configurations.

🎯 Requirements

  • 5 years of experience operating large, complex cloud infrastructures at scale
  • 3 years of hands-on experience with Kubernetes
  • Proven experience managing production distributed systems on at least one major cloud provider (AWS, GCP, or Azure)
  • Strong Linux system administration and troubleshooting skills
  • Demonstrated experience handling customer issues, either internally or externally
  • Strong communication skills for direct customer interaction and cross-team collaboration
  • DevOps or CI/CD experience
  • Proficiency in Python scripting

🏖️ Benefits

  • Estimated total compensation range of $125,000–$130,000, including equity
  • Comprehensive benefits package
  • Fully remote work environment
  • Opportunity to work with cutting-edge DataOps and Apache Airflow® technology

Skills & Technologies

Python
AWS
Azure
GCP
Kubernetes
DevOps
Remote
$125k-130k

About Astronomer, Inc.

Astronomer, Inc. provides Apache Airflow as a managed cloud service and enterprise platform. The company is a major contributor to the open-source workflow orchestration project, offers commercial support, and delivers a control plane that lets data teams deploy, monitor, and scale directed acyclic graphs (DAGs) across Kubernetes clusters. Its product suite includes Astro, a fully hosted Airflow environment with role-based access, CI/CD hooks, and usage observability. Customers use the software to schedule Python and SQL data pipelines, connect on-premises and cloud databases, and ensure reliable data delivery for analytics and machine-learning workloads. Astronomer is headquartered in Cincinnati, Ohio, and serves global enterprises.
