Authzed Inc.

Sr. Site Reliability Engineer

Job Overview

Location

United States

Job Type

Full-time

Category

Software Engineering

Date Posted

June 4, 2026

Full Job Description

📋 Description

  • Design, implement, and maintain highly available and scalable infrastructure solutions to support Authzed’s products and growing customer base.
  • Monitor and analyze system performance, identifying and resolving bottlenecks and issues to ensure optimal reliability and uptime.
  • Automate infrastructure deployment and configuration management using infrastructure-as-code tools such as Terraform and Pulumi.
  • Continuously improve system reliability, security, and efficiency through proactive monitoring, capacity planning, and performance tuning.
  • Troubleshoot and resolve complex infrastructure and application issues in both production and test environments.
  • Collaborate with software engineering teams to design and implement resilient, scalable, and secure systems.
  • Participate in an on-call rotation to respond to production incidents promptly and effectively.
  • Document system configurations, operational procedures, and troubleshooting guidelines to ensure knowledge sharing and operational consistency.
  • Work with containerization technologies including Docker and Kubernetes to manage and scale services.
  • Use monitoring and logging tools such as Prometheus, Grafana, and the ELK stack to observe system health and drive improvements.
  • Apply strong problem-solving skills to diagnose and resolve distributed system failures and performance degradation.
  • Use Git and GitHub for version control and collaboration across engineering workflows.
  • Integrate and maintain continuous integration and deployment systems to enable rapid, reliable releases.
  • Contribute to the development and maintenance of SDKs for NodeJS, Java, Python, Ruby, and Go by ensuring underlying infrastructure supports their performance and reliability.
  • Apply experience with distributed SQL databases such as Google Cloud Spanner or CockroachDB to optimize data layer reliability and scalability.
  • Operate within a fully remote, globally distributed team with flexible scheduling across US, Canada, and European timezones.
  • Participate in twice-yearly team offsites focused on collaboration, bonding, and strategic alignment.
  • Work in a software-driven culture where even non-engineering teams deeply understand and engage with the technology.
  • Contribute to a company culture rooted in agency, collaboration, and open-mindedness, where diverse perspectives drive innovation.
  • Help build and maintain the authorization infrastructure that enables enterprises to manage access control at scale.

🎯 Requirements

  • Proven experience as a Site Reliability Engineer or in a similar role.
  • Strong understanding of networking, operating systems, and cloud infrastructure.
  • Experience with containerization technologies such as Docker and Kubernetes.
  • Experience with infrastructure-as-code tools like Terraform and Pulumi.
  • Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Experience working with Git and GitHub and continuous integration/deployment systems.

🏖️ Benefits

  • Competitive salary based on experience.
  • Stock options at an early-stage startup.
  • Comprehensive healthcare and other insurance benefits for US-based employees.
  • Fully remote work with flexible scheduling across global timezones.
  • Twice-yearly team offsites for collaboration and team bonding.

Skills & Technologies

Python
Java
Ruby
Node.js
GCP
Senior
Remote

About Authzed Inc.

Authzed builds managed authorization infrastructure. Its flagship product, SpiceDB, is an open-source, Google Zanzibar-inspired permissions database that developers embed to add fine-grained access control to applications. The company offers cloud-hosted and self-managed editions, plus tooling for schema design, testing, and observability. Customers span SaaS, fintech, and gaming verticals seeking centralized, auditable policy enforcement without re-architecting existing services.
