This job has expired
This position was posted on October 18, 2025 and is likely no longer accepting applications. We've kept it here for historical reference.

Job Overview
Location: Remote
Job Type: Full-time
Category: Software Engineering
Date Posted: October 18, 2025
Full Job Description
📋 Description
- Own and evolve the full observability stack for Temporal’s cloud-native, multi-region platform (metrics, logs, traces, and alerting) so that every engineer can understand system health in under 30 seconds and every customer can self-diagnose issues without opening a ticket.
- Architect, build, and maintain high-throughput telemetry pipelines that ingest billions of events per day from Kubernetes, Go services, Envoy, and customer workloads, then transform, store, and serve that data with sub-second query latency.
- Drive the SDLC from inception to production: gather requirements from SRE, product, and customer success; write detailed design docs; conduct architecture reviews; ship code in Go, TypeScript, and Terraform; and automate deployment via Argo CD and Helm.
- Design and implement customer-facing observability features, such as self-service dashboards, trace correlation, and intelligent alerting, so Temporal Cloud users can debug their workflows as easily as they debug local code.
- Champion SLOs, error budgets, and data-driven reliability practices across the company; instrument every critical path and make SLI/SLO data visible in real time to both engineers and executives.
- Reduce mean-time-to-detection (MTTD) and mean-time-to-resolution (MTTR) by at least 50% through automated anomaly detection, on-call runbooks-as-code, and chaos-engineering experiments.
- Mentor junior engineers and lead cross-team guilds on topics like OpenTelemetry best practices, Prometheus query optimization, and cost-aware cardinality management.
- Collaborate with open-source maintainers to upstream improvements to projects such as Prometheus, Thanos, Grafana, and Jaeger, turning Temporal’s internal innovations into community wins.
- Continuously optimize observability spend by negotiating with vendors, tuning retention policies, and leveraging object storage tiers, keeping per-customer costs flat while data volume triples.
- Influence the product roadmap by translating customer pain points into concrete observability features and delivering MVPs in tight two-week iterations.
- Participate in a lightweight on-call rotation (one week in eight) where you’ll use the same tools you build, ensuring dogfooding and rapid feedback loops.
- Document tribal knowledge in runbooks, architecture decision records (ADRs), and internal tech talks so that every new hire can ramp up in days, not weeks.
🎯 Requirements
- 5+ years of backend or infrastructure engineering experience, including 2+ years designing and operating observability systems at petabyte scale
- Expert-level proficiency in Go or another systems language (Rust, C++, Java) and hands-on experience with Kubernetes, Docker, and cloud-native architectures
- Deep knowledge of Prometheus, Thanos, Grafana, Loki, Tempo, or similar CNCF observability projects, plus experience with OpenTelemetry instrumentation
- Demonstrated ability to drive cross-functional initiatives, write detailed technical specs, and present to both technical and non-technical stakeholders
- Nice-to-have: contributions to open-source observability projects, experience with multi-region cloud deployments (AWS, GCP, Azure), and familiarity with workflow orchestration or Temporal itself
🏖️ Benefits
- Fully remote-first culture with quarterly off-sites in exciting global locations and a generous home-office stipend
- Competitive salary plus equity in a well-funded, high-growth startup backed by top-tier VCs
- Flexible PTO policy (minimum 20 days encouraged) and company-wide mental health days
- Comprehensive health, dental, vision, and life insurance for you and dependents, with 100% of premiums covered
- Annual learning & conference budget ($3,500) and dedicated 10% innovation time every Friday
About Temporal Technologies Inc.
Temporal Technologies Inc. provides an open-source, code-first workflow orchestration platform that guarantees durable execution of distributed applications through automatic retry, state recovery and visibility. Developers write workflows in familiar languages; the platform handles failures, retries and scaling so business logic remains simple and reliable. It is trusted by enterprises to run critical long-running services without custom infrastructure.