This job has expired

This position was posted on October 24, 2025 and is likely no longer accepting applications. We've kept it here for historical reference.

Service Level Availability Manager

Job Overview

Location

United States

Job Type

Full-time

Category

Data Science

Date Posted

October 24, 2025

Full Job Description

📋 Description

  • Own the complete lifecycle of Service Level and Availability Management for a global enterprise that spans on-premises data centers, multi-cloud environments (AWS, Azure), and third-party SaaS platforms. You will be the single point of accountability for ensuring 24×7 service resilience, translating business risk into technical action plans and measurable outcomes.
  • Architect, document, and continuously refine availability strategies that map directly to business-critical processes such as trading, risk calculations, regulatory reporting, and client onboarding. Your plans will balance cost, complexity, and risk, and will be reviewed quarterly with senior technology and business leadership.
  • Instrument end-to-end observability using Power BI dashboards, Datadog synthetic and real-user monitoring, Splunk log analytics, PagerDuty on-call orchestration, and ServiceNow ITSM workflows. You will define SLIs, SLOs, and error budgets for every tier-1 and tier-2 service, then socialize them across engineering squads and executive stakeholders.
  • Embed reliability engineering practices into every stage of the SDLC. Partner with DevOps teams to require availability runbooks in pull requests, with Infrastructure to enforce chaos-engineering game days, and with Application owners to model failure modes during design reviews. You will act as the gatekeeper for production readiness checklists before any major release.
  • Lead blameless post-incident reviews, translating outages into concrete corrective actions and longer-term systemic fixes. Track MTTR, MTTI, and availability trends; publish monthly reliability scorecards that highlight leading indicators before they become customer-impacting issues.
  • Drive a culture shift from reactive firefighting to proactive resilience. Facilitate workshops, brown-bags, and lunch-and-learns on topics such as SRE principles, error budgets, and cost-of-downtime calculations. Mentor junior engineers and non-technical product managers on how to think probabilistically about risk.
  • Collaborate with Cybersecurity, Network, Database, and Cloud FinOps teams to ensure that availability initiatives do not compromise security posture or inflate cloud spend. You will co-own the disaster-recovery playbook and conduct semi-annual failover drills that simulate region-wide outages.
  • Present weekly availability briefings to the CTO and quarterly business reviews to the COO. Translate technical metrics into financial impact—downtime cost per minute, revenue at risk, regulatory fines—so that investment decisions in redundancy or automation are data-driven and transparent.
  • Champion Doran Jones’ diversity mission by mentoring candidates from non-traditional backgrounds who are transitioning into technology careers. Contribute interview questions and host mock incident-response sessions for veterans and underserved-community graduates entering our apprenticeship program.
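
For candidates unfamiliar with the terms above, the arithmetic behind an error budget and a cost-of-downtime figure can be sketched in a few lines of Python. This is only an illustration: the function names, the 30-day window, the 99.9% target, and the cost per minute are assumptions made for the example, not figures or tooling from this posting.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in the range (0, 1]")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Error budget left after the downtime already observed in the window."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


def downtime_cost(downtime_minutes: float, cost_per_minute: float) -> float:
    """Financial impact of an outage at a flat (assumed) cost per minute."""
    return downtime_minutes * cost_per_minute


if __name__ == "__main__":
    slo = 0.999            # hypothetical 99.9% availability target
    outage = 20.0          # hypothetical minutes of downtime this window
    print(f"budget:    {error_budget_minutes(slo):.1f} min")
    print(f"remaining: {budget_remaining(slo, outage):.1f} min")
    print(f"cost:      {downtime_cost(outage, 5000.0):,.0f}")
```

A negative result from `budget_remaining` would mean the service has exceeded its budget for the window, which is typically the signal to prioritize reliability work over new releases.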

Skills & Technologies

AWS
Azure
Splunk
Datadog
Remote


About Doran Jones Inc.

Doran Jones Inc. is a New York-based professional services firm specializing in data engineering, analytics, and software development for financial institutions. Founded in 2013, the company delivers cloud-native data architectures, regulatory reporting solutions, and DevOps transformation programs. Its engineering teams embed with clients to modernize legacy systems, implement real-time analytics, and accelerate digital transformation initiatives. Core expertise spans Python, Scala, Java, and cloud platforms including AWS and Azure, with a focus on agile delivery and scalable data governance frameworks.
