This job has expired

This position was posted on September 16, 2025 and is likely no longer accepting applications. We've kept it here for historical reference.

Senior Site Reliability Engineer

Job Overview

Location

Toronto, Indiana, USA

Job Type

Full-time

Category

Data Science

Date Posted

September 16, 2025

Full Job Description

📋 Description

  • Own the reliability and performance of Vantage’s distributed retail-media platform, ensuring 99.9%+ uptime for services that power advertising campaigns for global retailers like The Home Depot.
  • Serve as the technical anchor for a remote-first SRE squad, mentoring junior engineers while still rolling up your sleeves to debug, automate, and optimize everything from Kubernetes clusters to Snowflake data pipelines.
  • Design and maintain resilient, multi-region infrastructure on Azure (with future expansion to AWS/GCP) using Infrastructure-as-Code best practices (Terraform, Terragrunt, and Ansible) to provision, configure, and version every layer of the stack.
  • Build and evolve end-to-end observability: Prometheus metrics, Grafana dashboards, Loki/Elastic logs, and PagerDuty alerting that give engineers self-service insight into latency, error budgets, and customer impact.
  • Lead incident response during your on-call rotation, orchestrating cross-functional war rooms, writing blameless post-mortems, and driving actionable follow-ups that prevent recurrence and reduce MTTR.
  • Automate CI/CD pipelines (GitHub Actions, Azure DevOps) so every code change is tested, security-scanned, and deployed to production in minutes, not hours, while maintaining strict compliance and rollback safety.
  • Partner with software, data, and product teams to translate business requirements into scalable architecture: right-sizing compute, tuning JVM/Node runtimes, and optimizing data warehouse queries for sub-second ad-serving SLAs.
  • Champion SRE culture across the organization: run game-days, establish error budgets, evangelize SLIs/SLOs, and create reusable libraries that let feature teams ship faster without sacrificing reliability.
  • Continuously evaluate emerging technologies (service meshes, chaos engineering, eBPF, FinOps tooling) and run proof-of-concepts that keep Vantage at the cutting edge of retail-media infrastructure.
  • Contribute to internal documentation, runbooks, and training sessions so tribal knowledge becomes shared knowledge, enabling every engineer to operate services confidently and independently.

Skills & Technologies

Python
AWS
Azure
GCP
Terraform
Senior
Remote

About Vantage Analytics Inc.

Vantage is a Toronto-based retail marketing and analytics platform founded in 2013. The company helps omnichannel retailers grow by turning shopper data into automated, high-conversion campaigns across search, display, video, sponsored products, and in-store digital displays. Vantage’s AI layer orchestrates real-time audiences on Meta, Google, Pinterest, Reddit, and emerging channels, while its analytics engine forecasts demand and optimizes media spend. Trusted by thousands of brands in 99 countries and backed by $1.1 million in seed funding, the 90-person team offers done-for-you and SaaS solutions that unify online and offline touchpoints, increase site traffic, and enlarge basket size without extra software overhead.
