This job has expired

This position was posted on March 7, 2026 and is likely no longer accepting applications.

Senior Site Reliability Engineer, Platform & Cloud FinOps (100% Remote - Spain)

Hopper Inc.

Job Overview

Location

Spain - Remote

Job Type

Full-time

Full Job Description

📋 Description

• Join Hopper Inc. as a Senior Site Reliability Engineer, focusing on Platform & Cloud FinOps, in a fully remote role based in Spain. This is a unique opportunity to contribute to a large-scale Google Cloud infrastructure that supports hundreds of engineers in delivering exceptional experiences to millions of end-users worldwide.
• As a Senior SRE on the Cloud FinOps team, you will be instrumental in driving significant cost efficiencies across our cloud infrastructure. Your primary focus will be on identifying and implementing solutions to optimize spending without compromising performance or reliability.
• You will tackle challenging projects such as reducing network egress costs by analyzing traffic and eliminating unnecessary data transmission, for example redundant headers. This requires a deep understanding of network protocols and cloud networking services.
• A key responsibility will involve optimizing data storage strategies. You will analyze how our warehouse data is utilized and implement policies to ensure the most cost-effective storage solutions are employed, for instance, migrating infrequently accessed data to cold storage tiers to reduce costs.
• Furthermore, you will fine-tune autoscaling configurations for both databases and compute resources. This involves ensuring that our systems scale up and down dynamically and efficiently in response to demand, preventing over-provisioning and minimizing idle resources.
• A significant part of your role will be enhancing our cost attribution systems. You will work to provide all engineering teams with clear, granular visibility into their cloud spending, enabling them to make informed decisions about resource utilization and cost management.
• Beyond cost optimization, you will actively participate in incident response, contributing to the platform's overall reliability and stability. This includes being part of an on-call rotation for critical platform incidents. Given the distributed nature of the team across the Americas and Europe, this rotation is designed to allow for comfortable working hours.
• You will also serve as a subject matter expert, assisting other engineers with infrastructure-related queries and challenges. This involves troubleshooting complex issues and providing guidance on best practices for utilizing our platform.
• A crucial aspect of the role involves reviewing and approving Pull Requests (PRs) that require Platform team oversight, ensuring adherence to established standards for reliability, security, and cost-effectiveness.
• You will be an integral part of a small, highly effective, and collaborative team of SREs, working in an environment that values automation, practical problem-solving, and continuous improvement.
• The ideal candidate possesses a strong foundation in Site Reliability Engineering, DevOps, Software Engineering, or Systems Engineering, coupled with exceptional troubleshooting and analytical capabilities.
• You will engage in system design discussions, contributing to the architecture of scalable, reliable, and cost-efficient systems.
• Excellent communication skills are essential for collaborating with cross-functional teams and articulating technical solutions.
• Experience with major cloud providers, with a strong preference for Google Cloud, is highly desirable.
• Proficiency in SQL is required for data analysis and cost attribution tasks.
• Hands-on experience with container orchestration using Kubernetes and associated tooling such as Kustomize and Helm is expected.
• Familiarity with Service Mesh technologies, particularly Istio, will be a significant advantage.
• A solid understanding of networking concepts, including DNS, TLS, certificates, and ingresses, is crucial.
• Expertise in observability tools for log collection, metrics, and Application Performance Monitoring (APM), preferably Datadog, is required.
• Knowledge of security best practices, including IAM, RBAC, and network security, is essential.
• Experience with authentication and authorization technologies is expected.
• Familiarity with CI/CD pipelines and best practices is necessary.
• Experience with various database technologies is required.
• Competency in scripting languages such as Bash and Python is a must for automation tasks.

Skills & Technologies

Python

GCP

Kubernetes

Datadog

SSL

Senior

Remote

About Hopper Inc.

Hopper is a travel technology company that uses predictive analytics and machine learning to forecast flight and hotel prices, allowing consumers to book travel at optimal times. Founded in 2007 and headquartered in Montréal, Canada, it operates mobile-first booking platforms and provides fintech products like price freeze, cancel-for-any-reason, and rebooking guarantees to reduce travel risk.
