
Job Overview
Location: United States
Job Type: Full-time
Category: DevOps & SysAdmin
Date Posted: May 3, 2026
Full Job Description
📋 Description
- The Site Reliability Specialist (Observability & Kubernetes) role at Everbridge is central to the reliability, scalability, and security of the company’s cloud platform, which powers critical services used globally.
- Day to day, the person will design, implement, and maintain observability solutions (metrics, logging, tracing) and manage Kubernetes clusters to keep platform services highly available and performant.
- The team is part of Everbridge’s platform engineering group, which builds resilient, secure, and developer-friendly infrastructure that enables rapid, safe innovation across the organization.
- In this role, the person will deepen expertise in cloud-native technologies, advance skills in SRE practices, and contribute directly to platform stability and security at scale.
🎯 Requirements
- Experience with Kubernetes administration and orchestration in production environments
- Proficiency in observability tools and practices (e.g., Prometheus, Grafana, ELK stack, OpenTelemetry)
- Strong background in Linux systems, networking, and cloud infrastructure (AWS, Azure, or GCP)
- Experience with infrastructure-as-code (Terraform, Ansible) and CI/CD pipelines
- Knowledge of security best practices in cloud and containerized environments
- Ability to troubleshoot complex distributed systems and improve system reliability
🏖️ Benefits
- Remote, home-based work flexibility anywhere in the United States
- Opportunity to work at the intersection of cloud infrastructure, security, and developer experience
- Chance to shape how reliability and observability are embedded into a global-scale platform
- Professional growth in SRE, DevOps, and cloud-native technologies
- Collaboration with cross-functional engineering teams to enable secure, fast innovation
About Everbridge, Inc.
Everbridge, Inc. provides critical event management and enterprise safety software that helps organizations anticipate, respond to, and recover from threats to people, operations, and assets. Its SaaS platform automates communications during public safety hazards, IT outages, cyberattacks, and severe weather, delivering targeted alerts across voice, text, email, and mobile apps. Governments, healthcare systems, and Fortune 500 customers use the system to locate at-risk personnel, orchestrate response workflows, and analyze operational resilience. Everbridge integrates threat intelligence, risk visualization, and travel tracking to shorten response times and meet duty-of-care obligations.