
Job Overview
Location: Indiana, USA
Job Type: Full-time
Category: Operations Manager
Date Posted: March 3, 2026
Full Job Description
Description
- As a Principal, Operational Excellence & Resilience at CrowdStrike, you will be at the forefront of safeguarding our global cybersecurity leadership by ensuring the unwavering reliability and rapid recovery of our technology stack. This pivotal role within the Resilience Organization is designed for a senior individual contributor who will architect and implement the strategy for technology resilience across our extensive infrastructure, applications, and cutting-edge products. You will act as the central hub in our hub-and-spoke model, fostering deep partnerships with technology business units to drive the consistent application of resilience standards and operational delivery.
- Your primary responsibility will be to own and meticulously maintain enterprise-wide technology resilience standards. This involves ensuring a unified approach to resilience across all domains (infrastructure, applications, and products), thereby mitigating organizational drift and upholding established frameworks. You will be instrumental in developing and refining comprehensive technical resilience architecture, encompassing robust infrastructure redundancy, sophisticated fault tolerance mechanisms, resilient application design with graceful degradation strategies, and the implementation of advanced chaos engineering frameworks for continuous, proactive resilience validation.
- A key facet of this role is leading the enterprise technical recovery strategy. This includes the development and implementation of robust backup and redundancy systems, defining and achieving critical Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for all technical systems, and establishing rigorous data recovery and restoration procedures. You will also partner closely with product and engineering teams to define and implement resilience standards specifically for our products, focusing on areas such as feature flagging, progressive deployment strategies, multi-tenancy frameworks, and scalability engineering to effectively manage growth and ensure continuous availability.
- You will provide critical technical oversight and aggregation of technology resilience risks across the entire enterprise. This involves establishing and diligently monitoring key performance indicators (KPIs) that measure system uptime, recovery speed, and overall resilience posture. Your insights will be crucial in identifying potential vulnerabilities and driving proactive mitigation efforts.
- Beyond these core responsibilities, you will also drive chaos engineering and resilience testing programs, establishing best practices for proactive validation and continuous improvement across the organization. You will own the strategy for shared resilience tooling, evaluating and implementing solutions that enhance enterprise-wide capabilities in monitoring, testing, and recovery automation.
- Stakeholder engagement is paramount. You will build and maintain formal networks with key constituents across business units, engineering teams, and external partners, fostering a collaborative environment focused on resilience. During major incident responses, you will serve as a senior technical advisor, lending your expertise on technical recovery strategies and coordinating cross-functional recovery efforts to minimize impact and restore services swiftly.
- Furthermore, you will be a catalyst for innovation, driving advancements in resilience practices by identifying emerging technologies and methodologies that can enhance CrowdStrike's competitive resilience advantage. This includes providing strategic guidance and knowledge transfer to junior team members and cross-functional partners, elevating the overall resilience engineering expertise within the company.
- This role demands a proactive, strategic thinker with a deep understanding of enterprise-scale cloud-native environments and a proven ability to influence and lead complex, cross-functional programs. You will be instrumental in shaping the future of technology resilience at CrowdStrike, ensuring our platform remains robust, reliable, and secure for our global customer base.
Skills & Technologies
AWS
Azure
GCP
Senior
Remote
Degree Required
About CrowdStrike Holdings, Inc.
CrowdStrike Holdings, Inc. provides cloud-delivered cybersecurity through the Falcon platform, combining next-generation antivirus, endpoint detection and response, threat hunting, and IT hygiene. Its AI-driven analytics correlate trillions of events weekly to identify malware-free intrusions, nation-state actors, and insider threats across endpoints, workloads, and identities. The company sells subscriptions, professional services, and threat intelligence to enterprises worldwide.



