Senior SRE

Job Overview

Location

Remote

Job Type

Full-time

Category

Software Engineering

Date Posted

May 16, 2026

Full Job Description

📋 Description

  • Serve as a Senior Site Reliability Engineer for one of the world’s largest genealogy and family history platforms, ensuring the reliability, availability, and performance of infrastructure supporting billions of historical records and a global user base.
  • Take full ownership of Dynatrace-based observability systems, designing and implementing automated configurations to monitor and maintain system health at massive scale.
  • Develop and maintain TypeScript-based tooling to automate routine operational tasks, improve incident response workflows, and enhance platform reliability across distributed systems.
  • Consume and integrate Dynatrace REST APIs to extract metrics, trigger alerts, and automate remediation processes in alignment with SRE principles.
  • Collaborate with a mature engineering team to drive automation initiatives that reduce toil, increase system resilience, and support continuous delivery under high-load conditions.
  • Participate in on-call rotations and incident response protocols, applying deep understanding of SRE practices to minimize downtime and improve mean time to resolution (MTTR).
  • Design and implement observability strategies that provide actionable insights across microservices, databases, and distributed architectures serving millions of concurrent users.
  • Build and maintain automated configuration management systems using Dynatrace to enforce consistent monitoring standards across environments.
  • Work closely with development teams to embed reliability practices into the software development lifecycle, including service level objectives (SLOs), error budgets, and alerting thresholds.
  • Maintain documentation of automation scripts, observability dashboards, and incident runbooks to ensure knowledge sharing and operational continuity.
  • Support infrastructure scalability initiatives by identifying performance bottlenecks and recommending automation-driven solutions to handle increasing data ingestion and user traffic.
  • Act as a technical advocate for reliability best practices across engineering teams, promoting proactive monitoring, automated testing, and preventive maintenance.
  • Translate complex system behaviors into clear metrics and reports for cross-functional stakeholders, enabling data-driven decisions on system investments and risk mitigation.
  • Contribute to the evolution of the platform’s SRE roadmap by evaluating new tools, automating manual processes, and proposing improvements to incident management frameworks.
  • Remain agile in a fast-paced engineering environment with high ownership expectations, where system failures have direct impact on user trust and historical data integrity.
  • Engage in continuous learning and knowledge transfer to strengthen team capabilities in observability, automation, and large-scale system reliability.

🎯 Requirements

  • Solid experience automating Dynatrace or Datadog configuration at scale
  • Strong hands-on experience consuming REST APIs, particularly Dynatrace APIs
  • Proficiency in TypeScript for tooling and automation development
  • Strong understanding of SRE principles: reliability, observability, incident response, and automation
  • Experience working in fast-paced engineering environments with high ownership expectations
  • Experience with AWS Lambda for serverless automation workflows

🏖️ Benefits

  • 100% remote work
  • Payment in USD
  • Paid Time Off (PTO)
  • Work-from-home and training reimbursement
  • English lessons
  • Technical training

Skills & Technologies

TypeScript
AWS
REST
Datadog
DevOps
Senior
Remote
Degree Required

SouthGeek S.A.

About SouthGeek S.A.

SouthGeek is an Argentine software development company specializing in scalable web and mobile applications for startups and enterprises. Founded in 2014, the firm offers full-stack engineering, cloud architecture, UX/UI design, and dedicated agile teams. It focuses on fintech, healthcare, and logistics projects across Latin America and the United States, emphasizing clean code, automated testing, and continuous delivery. The company operates remotely from Córdoba, Buenos Aires, and Montevideo, integrating regional talent with global clients to accelerate digital transformation and reduce time-to-market for complex products.
