Principal, Operational Excellence & Resilience (Remote)

Job Overview

Location

Indiana, USA

Job Type

Full-time

Category

Operations Manager

Date Posted

March 3, 2026

Full Job Description

Description

  • As a Principal, Operational Excellence & Resilience at CrowdStrike, you will be at the forefront of safeguarding our global cybersecurity leadership by ensuring the unwavering reliability and rapid recovery of our technology stack. This pivotal role within the Resilience Organization is designed for a senior individual contributor who will architect and implement the strategy for technology resilience across our extensive infrastructure, applications, and cutting-edge products. You will act as the central hub in our hub-and-spoke model, fostering deep partnerships with technology business units to drive the consistent application of resilience standards and operational delivery.
  • Your primary responsibility will be to own and meticulously maintain enterprise-wide technology resilience standards. This involves ensuring a unified approach to resilience across all domains – infrastructure, applications, and products – thereby mitigating organizational drift and upholding established frameworks. You will be instrumental in developing and refining comprehensive technical resilience architecture, encompassing robust infrastructure redundancy, sophisticated fault tolerance mechanisms, resilient application design with graceful degradation strategies, and the implementation of advanced chaos engineering frameworks for continuous, proactive resilience validation.
  • A key facet of this role is leading the enterprise technical recovery strategy. This includes the development and implementation of robust backup and redundancy systems, defining and achieving critical Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for all technical systems, and establishing rigorous data recovery and restoration procedures. You will also partner closely with product and engineering teams to define and implement resilience standards specifically for our products, focusing on areas such as feature flagging, progressive deployment strategies, multi-tenancy frameworks, and scalability engineering to effectively manage growth and ensure continuous availability.
  • You will provide critical technical oversight and aggregation of technology resilience risks across the entire enterprise. This involves establishing and diligently monitoring key performance indicators (KPIs) that measure system uptime, recovery speed, and overall resilience posture. Your insights will be crucial in identifying potential vulnerabilities and driving proactive mitigation efforts.
  • Beyond these core responsibilities, you will also drive chaos engineering and resilience testing programs, establishing best practices for proactive validation and continuous improvement across the organization. You will own the strategy for shared resilience tooling, evaluating and implementing solutions that enhance enterprise-wide capabilities in monitoring, testing, and recovery automation.
  • Stakeholder engagement is paramount. You will build and maintain formal networks with key constituents across business units, engineering teams, and external partners, fostering a collaborative environment focused on resilience. During major incident responses, you will serve as a senior technical advisor, lending your expertise on technical recovery strategies and coordinating cross-functional recovery efforts to minimize impact and restore services swiftly.
  • Furthermore, you will be a catalyst for innovation, driving advancements in resilience practices by identifying emerging technologies and methodologies that can enhance CrowdStrike's competitive resilience advantage. This includes providing strategic guidance and knowledge transfer to junior team members and cross-functional partners, elevating the overall resilience engineering expertise within the company.
  • This role demands a proactive, strategic thinker with a deep understanding of enterprise-scale cloud-native environments and a proven ability to influence and lead complex, cross-functional programs. You will be instrumental in shaping the future of technology resilience at CrowdStrike, ensuring our platform remains robust, reliable, and secure for our global customer base.

Skills & Technologies

AWS
Azure
GCP
Senior
Remote
Degree Required

About CrowdStrike Holdings, Inc.

CrowdStrike Holdings, Inc. provides cloud-delivered cybersecurity through the Falcon platform, combining next-generation antivirus, endpoint detection and response, threat hunting, and IT hygiene. Its AI-driven analytics correlate trillions of events weekly to identify malware-free intrusions, nation-state actors, and insider threats across endpoints, workloads, and identities. The company sells subscriptions, professional services, and threat intelligence to enterprises worldwide.
