
Job Overview
Location
Remote
Job Type
Full-time
Category
DevOps & SysAdmin
Date Posted
May 8, 2026
Full Job Description
📋 Description
- Own the strategy, architecture, and operational excellence of the data infrastructure as a Staff Database Reliability Engineer, an expert-level individual contributor role with deep influence on engineering direction.
- Design scalable database schemas and access patterns, tune Aurora for latency and throughput, and establish standards for how engineers interact with databases, including resolving incidents like ACCESS EXCLUSIVE lock contention through runbooks and automation.
- Make the Django ORM a strength by reviewing migrations for safety at scale, catching N+1 patterns, establishing QuerySet and schema conventions, and scaling review through automation via AGENTS.md files and AI review bots (Claude Code, Cursor).
- Lead major infrastructure initiatives including capacity planning, zero-downtime schema migrations, multi-AZ resilience (Aurora writer/reader placement, failover, RTO/RPO), backups, PITR, and failover testing.
- Own the CDC pipeline from Aurora to DMS to S3 Parquet to Snowflake, ensuring schema evolution safety, Parquet layout optimization, and automated checks to prevent downstream breaks.
- Drive observability using pganalyze (query performance), CloudWatch (infrastructure metrics), and Honeycomb (high-cardinality tracing), shaping how they integrate with Django-side instrumentation and trace attributes.
- Build tooling and guardrails such as migration review automation, CI checks for risky patterns, slow query pipelines, and self-service dashboards for team query footprint visibility.
- Support and evolve adjacent systems including OpenSearch (index design, sharding), Redis (caching patterns, eviction), and SQS/RabbitMQ (queue design, DLQs, consumer backpressure).
- Partner closely with platform, backend, and DevOps engineers to influence platform-wide reliability and performance standards.
- Operate in a high-growth environment where solutions must scale from 50 to 100 engineers, requiring pragmatic, forward-thinking infrastructure decisions.
🎯 Requirements
- Deep PostgreSQL expertise, including EXPLAIN (ANALYZE, BUFFERS), MVCC, bloat, lock contention, and vacuum/autovacuum, with a strong preference for Aurora Serverless v2 / Limitless experience.
- Strong ORM fluency (Django, SQLAlchemy, ActiveRecord, or similar): the ability to predict SQL generation, spot N+1 problems, and control eager loading via joins or batched IN queries.
- Production CDC experience, ideally with AWS DMS, including logical replication, slot hygiene, schema evolution, and Parquet-based data lakes feeding Snowflake or similar.
- Hands-on experience with pganalyze (or Datadog DBM / Performance Insights / pg_stat_statements), CloudWatch (custom metrics, composite alarms, Logs Insights), and Honeycomb (or a similar high-cardinality tracing tool), including comfort with OpenTelemetry.
- Real experience making AI coding and review tools useful for teams: writing AGENTS.md files, configuring review agents (Claude Code, Cursor), and iterating on prompts and configs.
- Strong automation and IaC background, with production code in Python, Go, or similar, and Terraform.
🏖️ Benefits
- Competitive salaries ($200k-$250k base + equity)
- Comprehensive healthcare benefits
- Exciting and motivating equity
- Flexible PTO
- 401k
- Parental Leave
- WFH Stipend
- Commuter Benefits (for SF office employees)
- Some of the nicest and smartest teammates you’ll ever work with
About Scribe
ScribeHow.com is the fastest way to turn any process into a step-by-step visual guide. Our AI screen recorder watches your workflow, auto-writes crystal-clear instructions, adds annotated screenshots, and publishes a branded, shareable how-to in minutes. Replace endless Zoom walk-throughs, stale PDFs, and ticket backlogs with living docs that customers, teammates, and new hires can replay, search, and translate instantly. Built for SaaS onboarding, SOPs, support, and training, ScribeHow slashes ramp-up time, cuts support volume, and keeps knowledge always up to date so teams scale without slowing down.