
Job Overview
Location
United States & Canada
Job Type
Full-time
Category
Software Engineering
Date Posted
February 24, 2026
Full Job Description
📋 Description
- • Join Babylist, the premier registry, e-commerce, and content platform for growing families, as a Senior Software Engineer, Site Reliability. We are a remote-first company with a significant impact, serving over 9 million shoppers annually and driving over $1 billion in GMV. Our mission is to connect growing families with everything they need to thrive, and our technology stack includes Ruby on Rails, React, AWS, Sidekiq, MySQL, Redis, and native iOS/Android applications.
- • In this pivotal role within our Platform team, you will be instrumental in ensuring the unwavering stability, scalability, and reliability of Babylist's systems and services. You will collaborate closely with all Engineering teams, providing essential support for our shared infrastructure and developer tools. Your deep expertise in site reliability engineering, robust AWS cloud infrastructure management, and modern DevOps practices will be key to optimizing our systems and fostering a culture of continuous improvement.
- • You will take ownership of managing and building our AWS infrastructure, leveraging Infrastructure as Code (IaC) principles with tools like Terraform. This includes ensuring our EKS clusters and databases are maintained with up-to-date versions, actively optimizing for peak performance and enhanced reliability.
- • A significant part of your contribution will involve enhancing the speed and reliability of our Continuous Integration (CI) systems. By improving these critical pipelines, you will empower the entire Engineering Team, enabling faster, more efficient development cycles and deployment processes.
- • You will act as a crucial support resource for developers, assisting them in troubleshooting complex issues that may arise across local development, staging, and production environments. Your ability to diagnose and resolve problems swiftly will be vital to maintaining smooth operations.
- • You will be responsible for establishing, communicating, and supporting best practices for monitoring and alerting. This entails setting up sophisticated monitoring systems and defining clear, actionable alerts to facilitate proactive incident management and minimize downtime.
- • Your role will involve contributing to the design, deployment, and management of containerized applications using Docker and Kubernetes, ensuring these are robust and scalable within our environment.
- • You will apply your solid understanding of cloud-native systems design, encompassing CDNs, load balancers, cloud networking, DNS, caching strategies, and distributed systems, to build and maintain resilient infrastructure.
- • Troubleshooting and debugging will be core to your responsibilities; your ability to quickly identify and resolve issues across diverse environments will be highly valued.
- • You will contribute to the design and support of CI systems, potentially working with tools such as CircleCI, Jenkins, or GitHub Actions.
- • You will leverage your familiarity with monitoring and alerting best practices, utilizing tools like Datadog, Cronitor, Sentry, and PagerDuty to ensure proactive issue identification and swift resolution.
- • Your proven experience in on-call management best practices, including effective incident response, clear escalation procedures, and thorough post-incident reviews, will be essential for driving continuous improvement and upholding system reliability.
- • You will be an active participant in our AI-forward environment, embracing and utilizing AI tools to enhance daily operations, boost innovation, and improve overall impact, while always keeping a human-centered approach.
- • You will collaborate effectively with cross-functional teams, leveraging your excellent verbal and written communication skills to share knowledge and drive alignment.
- • Babylist fosters a culture of working with focus and intention, then stepping away to recharge. We believe in exceptional management and invest in tools and opportunities to connect with colleagues.
- • You will contribute to building products that positively impact millions of people's lives, in an environment where AI is intentionally embedded to support innovation and scale.
- • We believe technology and data can solve hard problems. We are committed to performance-based career progression and offer competitive pay and meaningful opportunities for growth.
- • You will work remotely from the United States or Canada, with opportunities to connect in person twice a year for company and departmental gatherings.
🎯 Requirements
- • 8+ years of experience as a Site Reliability Engineer or similar role, with a strong background in maintaining highly available and scalable systems.
- • Proficiency with Terraform for managing AWS infrastructure using Infrastructure as Code (IaC) practices.
- • Strong experience working with AWS cloud-based infrastructure and services, ensuring their reliability, performance, and security.
- • Proficiency with Docker and Kubernetes for designing, deploying, and managing containerized applications.
- • Solid understanding of cloud-native systems design, including CDNs, load balancers, cloud networking, DNS, caching, and distributed systems.
🏖️ Benefits
- • Competitive salary with equity and bonus opportunities.
- • Company-paid medical, dental, and vision insurance.
- • Generous paid parental leave and PTO.
- • Remote work stipend to set up your home office.
- • Perks for physical, mental, and emotional health, parenting, childcare, and financial planning.
About Babylist Inc.
Babylist operates an online baby registry platform that lets expectant parents add items from any retailer, book services, and create cash funds. It aggregates products across major stores, offers a universal registry browser button, and provides personalized checklists. The company also sells its own line of baby merchandise, including nursery furniture, strollers, and accessories, through its e-commerce storefront. Headquartered in Oakland, California, Babylist supports parents with expert guides, product reviews, and a mobile app for managing registries, tracking gifts, and coordinating shipping to recipients.
