This job has expired

This position was posted on December 12, 2025 and is likely no longer accepting applications. We've kept it here for historical reference.

Software Engineer, Infrastructure

Orb, Inc.

Job Overview

Location: San Francisco

Job Type: Full-time

Full Job Description

📋 Description

• Own the beating heart of Orb’s billing engine: you’ll design, build, and harden the distributed systems that ingest billions of usage events per day, transform them into revenue-grade data, and deliver real-time invoices to customers like Vercel and Pinecone. Every millisecond you shave and every nine of availability you add directly protects our customers’ bottom line.
• Architect for zero-downtime scale. You’ll lead end-to-end initiatives—from tenant-isolation strategies and circuit-breaker patterns to regional fail-over playbooks—that let us 10× traffic without waking the on-call pager. You’ll run load tests, chaos experiments, and game-days to prove our systems before customers feel a hiccup.
• Turn observability into a super-power. You’ll instrument every layer of our stack (Kafka, ClickHouse, Python services, AWS) with golden signals, SLOs, and trace-to-cost dashboards that surface anomalies in minutes, not hours. Expect to ship self-healing automations that page humans only when creativity is required.
• Build performance-critical, user-facing infrastructure. From sub-second usage analytics APIs to real-time pricing experiment engines, you’ll squeeze every ounce of speed out of our data path while keeping costs predictable. You’ll profile JVMs, tune Kafka partitions, and rewrite hot paths in Rust or Cython when the data demands it.
• Be the scaling partner every product squad wants at the table. You’ll join design reviews early, translate product requirements into resilient architectures, and write rollout runbooks so new features land safely at 3 pm on a Friday. Your mentorship will raise the reliability bar across all of engineering.
• Shape Orb’s infrastructure roadmap. You’ll balance cutting-edge (e.g., tiered storage in ClickHouse, Kafka tiered storage, Graviton migrations) with battle-tested patterns, presenting trade-off docs to leadership and getting buy-in to invest in long-term leverage.
• Debug the impossible at 2 am—and make sure it never happens again. You relish diving into distributed traces, heap dumps, and kernel syscalls, then turning war stories into post-mortems that the whole team learns from.
• Champion a culture of operational excellence. You’ll define SLAs/SLOs, automate on-call rotations, and build tooling that turns incident response into a learning loop. Expect to mentor junior engineers on safe deploys, feature flags, and progressive delivery.
• Work shoulder-to-shoulder with a tight-knit, high-velocity team in downtown San Francisco (3 days in office). We whiteboard at 9 am, ship by noon, and celebrate wins over catered lunches. Your voice will be heard from day one—our flat structure means the best idea wins, no matter the source.

🎯 Requirements

• 5+ years of software engineering experience with at least 4 years focused on infrastructure, platform, or backend systems at scale.
• Proven track record of designing and operating distributed, high-throughput systems (Kafka, Spark, or equivalent) in production with strict availability targets.
• Deep expertise in observability: you’ve built SLIs/SLOs, tuned Prometheus/Grafana, and used distributed tracing to debug latency outliers.
• Strong coding skills in Python, Go, Java, or similar; comfort reading and reasoning about multi-threaded, async, or event-driven code.
• Nice to have: hands-on experience with AWS (EKS, RDS, MSK), infrastructure-as-code (Terraform/CDK), and datastores like PostgreSQL, ClickHouse, or Druid.

🏖️ Benefits

• Excellent medical, dental, and vision insurance with 100% of employee premiums covered.
• Unlimited PTO plus an extra company-wide week off between Christmas and New Year’s.
• 401(k) plan with competitive match, meaningful equity in the form of stock options, and 16-week fully paid parental leave with continuous equity vesting.
• Commuter stipend, daily catered lunches in the office, and a dog-friendly SOMA workspace three days a week.

Skills & Technologies

• Python
• TypeScript
• React
• PostgreSQL
• AWS
• DevOps
• Onsite
• Remote

About Orb, Inc.

Orb is a comprehensive revenue design platform tailored for the AI era, empowering AI-native and software-first businesses with flexible pricing, real-time usage-based billing, and advanced monetization strategies. It unifies product data, billing logic, and financial reporting, enabling product, finance, and engineering teams to design, simulate, and scale revenue with clarity and confidence. Orb supports diverse pricing models, from tokens and API calls to outcome-based charges, allowing companies to rapidly iterate on monetization without engineering bottlenecks. Trusted by leading platforms for global scalability, Orb handles complex billing requirements, processing over 1.5 million invoices monthly for its users.
