This job has expired

This position was posted on March 3, 2026 and is likely no longer accepting applications. We've kept it here for historical reference.

Staff Engineer - DevOps

Weekday Technologies Inc.

Job Overview

Location: Remote

Job Type: Full-time

Full Job Description

• As a Staff Engineer specializing in DevOps, you will shape our cloud infrastructure and operations. The role involves architecting DevOps systems, leading cloud cost governance, and implementing container orchestration practices so that our systems are robust, scalable, and cost-efficient.
• You will collaborate with cross-functional teams, including engineering, security, and finance, to build a culture of operational excellence. You will also manage and optimize infrastructure spend so that technology investments align with our financial objectives.
• You will lead the end-to-end DevOps strategy: the design, implementation, and continuous improvement of CI/CD pipelines, automation frameworks, infrastructure as code, and release engineering processes. You will establish and maintain DevOps best practices, set reliability standards, and implement operational governance across the organization.
• You will design scalable, resilient, cloud-native architectures aligned with business growth. This requires planning infrastructure ahead of demand so that systems can expand as needs change.
• Kubernetes and containerization are a significant focus. You will architect and manage large-scale Kubernetes environments for production workloads, optimizing workloads across multiple clusters for performance, reliability, and cost efficiency. You will build and maintain containerized applications using Docker and Kubernetes, ensuring they are portable, scalable, and easy to deploy across environments.
• You will drive multi-cluster and multi-region deployments where necessary, improving resilience and availability to meet Service Level Agreements (SLAs).
• For cost savings and planning, you will own infrastructure cost visibility and optimization. This involves developing and executing cloud cost-saving strategies, including rightsizing resources, reserved capacity planning, auto-scaling optimization, and workload scheduling. You will partner with finance teams on budgeting, forecasting, and long-term cost planning.
• You will create dashboards and reports that show infrastructure return on investment (ROI) and spend trends. You will continuously identify inefficiencies and implement measurable cost-reduction initiatives without compromising system performance or reliability.
• For monitoring and observability, you will design and implement monitoring systems using tools like Grafana and other observability platforms. This includes building real-time dashboards covering system health, performance metrics, and cost, and establishing alerting frameworks that minimize downtime and shorten incident response times. You will improve system reliability through data-driven monitoring and post-incident analysis.
• Automation and reliability are central to the role. You will automate provisioning, deployments, scaling, and recovery, with a focus on improving resilience, availability, and disaster recovery. You will also lead root cause analysis for major incidents and ensure preventive measures are implemented to avoid recurrence.

Skills & Technologies

AWS

Azure

GCP

Docker

Kubernetes

DevOps

Senior

Remote

About Weekday Technologies Inc.

Weekday Technologies operates a hiring platform that connects tech companies with pre-vetted software engineers through community referrals. The product crowdsources candidate recommendations from existing engineering teams, verifies skills, and offers employers a searchable talent pool for contract and full-time roles. Founded in 2021 and headquartered in San Francisco, the company focuses on reducing time-to-hire for startups and scale-ups by leveraging trusted peer networks rather than traditional recruiting pipelines.
