
Senior Site Reliability Engineer, Platform & Cloud FinOps (100% Remote - USA Central & EST)
Job Overview
Location
New York, USA
Job Type
Full-time
Category
Software Engineering
Date Posted
March 7, 2026
Full Job Description
📋 Description
- Hopper Inc. is seeking a Senior Site Reliability Engineer to join our dynamic Cloud FinOps team. This role is 100% remote within the USA, specifically targeting candidates in Central and Eastern time zones. You will be instrumental in managing and optimizing our extensive Google Cloud infrastructure, which serves hundreds of engineers and millions of end-users globally. If you possess a deep passion for automation, a commitment to ensuring system scalability, reliability, security, and cost-efficiency, and a knack for practical problem-solving, this is the opportunity for you.
- In this role, you will tackle critical projects aimed at significantly enhancing cost efficiency across our cloud operations. This includes identifying and implementing strategies to reduce network egress costs by optimizing data transmission, such as removing unnecessary headers. You will also be responsible for ensuring our warehouse data is stored in the most efficient manner, leveraging solutions like cold storage for infrequently accessed data to minimize expenses.
- A key focus will be the optimization of autoscaling for both our databases and compute resources, ensuring we maintain high availability and performance while controlling costs. Furthermore, you will play a vital role in refining our current cost attribution models. This involves developing and implementing systems that provide all engineering teams with clear, actionable visibility into their cloud spending, fostering a culture of cost awareness and accountability.
- Beyond cost optimization, you will actively participate in incident response and be part of the on-call rotation for platform-related incidents. Our engineering teams are distributed across America and Europe, allowing for a balanced on-call schedule that respects your personal time. You will also be a go-to resource for other engineers, providing support, troubleshooting infrastructure issues, and reviewing and approving Pull Requests that require Platform team oversight.
- You will be an integral part of a small, highly efficient, and collaborative Site Reliability Engineering team. This team is dedicated to maintaining the stability, performance, and cost-effectiveness of Hopper's global infrastructure. Your contributions will directly impact the user experience and the operational efficiency of our platform.
- The ideal candidate will bring a strong foundation in Site Reliability Engineering, DevOps, Software Engineering, or Systems Engineering principles. You should possess exceptional troubleshooting skills, a proven ability in system design with strong analytical capabilities, and excellent communication skills. Familiarity with major cloud providers, with a preference for Google Cloud, is essential.
- Technical expertise should include a solid understanding of SQL, container technologies such as Kubernetes, and related tooling such as Kustomize and Helm. Experience with Service Mesh, particularly Istio, is highly desirable. A strong grasp of networking concepts, including DNS, TLS, certificates, and ingresses, is crucial.
- Proficiency in observability tools for log collection, metrics, and Application Performance Monitoring (APM), with a preference for Datadog, is expected. You should also have a solid understanding of security principles, including IAM, RBAC, and network security, as well as knowledge of authentication and authorization technologies.
- Experience with CI/CD pipelines and various database technologies, along with competency in scripting languages such as Bash and Python, is required. This role offers a unique opportunity to shape the future of cloud cost management and infrastructure reliability at a rapidly growing travel technology company.
Skills & Technologies
Python
GCP
Kubernetes
Datadog
SSL
Senior
Remote
About Hopper Inc.
Hopper is a travel technology company that uses predictive analytics and machine learning to forecast flight and hotel prices, allowing consumers to book travel at optimal times. Founded in 2007 and headquartered in Montréal, Canada, it operates mobile-first booking platforms and provides fintech products like price freeze, cancel-for-any-reason, and rebooking guarantees to reduce travel risk.


