This job has expired

This position was posted on November 20, 2025 and is likely no longer accepting applications. We've kept it here for historical reference.

Cloud Operations Engineer

Job Overview

Location: Canada
Job Type: Full-time
Category: Software Engineering
Date Posted: November 20, 2025

Full Job Description

📋 Description

  • Own the health and performance of SpryPoint’s multi-tenant AWS backbone that powers 100+ utilities across North America and the Caribbean. You will provision, tune, and safeguard the environments that run our cloud-native meter-to-cash platform, ensuring 24×7 reliability for millions of end-users.
  • Respond to and resolve daily infrastructure requests via Jira Service Management, ranging from spinning up new Elastic Beanstalk environments and whitelisting client IPs to refreshing RDS PostgreSQL databases and configuring Route 53 records, while maintaining a 24-hour SLA and a clear audit trail.
  • Build and maintain resilient, cost-efficient AWS stacks using EC2, ECS Fargate, Aurora Serverless v2, DynamoDB, S3, VPC, and CloudWatch. You’ll right-size instances, implement auto-scaling policies, and enforce least-privilege IAM policies to keep us SOC 2 compliant.
  • Investigate and remediate performance bottlenecks end-to-end: dive into Linux system metrics, correlate NGINX access logs with PostgreSQL slow-query reports, and leverage observability dashboards to pinpoint root causes before clients feel pain.
  • Shepherd new utility clients from initial sandbox through mock go-lives to production cut-over. You’ll clone environments, seed test data, validate SSL certificates, and sit on the launch bridge to ensure zero-downtime releases.
  • Automate the toil away. Write Python and Bash scripts that provision environments, rotate secrets, clean up orphaned snapshots, and tag resources for cost allocation, then store the code in Git and document the runbooks in Confluence so the team scales without heroics.
  • Participate in weekly maintenance windows for patching, certificate renewals, and minor version upgrades. You’ll craft change tickets, execute blue-green deployments, and run smoke tests to guarantee rollback readiness.
  • Contribute to incident response: join war-room calls, pull CloudWatch logs, query CloudTrail events, and draft post-mortems that turn outages into actionable backlog items. Your calm, data-driven approach will shorten MTTR and build client trust.
  • Continuously improve monitoring and alerting. Tune thresholds, add SLOs, and integrate PagerDuty so the team moves from reactive firefighting to proactive reliability engineering.
  • Collaborate cross-functionally with Service Delivery, Product Engineering, QA, and Security. You’ll translate technical findings into business impact for project managers and occasionally hop on client calls to explain environment status in plain English.
  • Leverage modern AI assistants to accelerate troubleshooting, auto-generate documentation, and identify cost anomalies. You’re encouraged to experiment with generative AI to make our operations smarter, faster, and more human-friendly.
  • Champion security best practices: enforce MFA, rotate IAM keys, patch AMIs, and validate backups. You’ll also rehearse disaster-recovery scenarios twice a year to guarantee we can restore any client environment within 30 minutes.
  • Document tribal knowledge in Confluence: architecture diagrams, environment maps, escalation playbooks, and “how we fixed it last time” notes. Your clarity will onboard new engineers in days, not weeks.
  • Track every change, task, and hour in Jira for SOC 2 traceability. You’ll keep boards clean, labels consistent, and sprint retros honest so we continuously improve velocity and quality.
  • Bring curiosity and kindness to a remote-first culture that values bold disruption. Whether you’re pairing over Slack huddles or presenting a lunch-and-learn on Terraform tricks, your voice will shape the future of utility technology.

Skills & Technologies

Python
PostgreSQL
DynamoDB
AWS
Linux
Remote

About SpryPoint Services Inc.

SpryPoint Services Inc. provides a new generation of customer service and operations software, empowering utilities to contend with the industry’s rapid pace of change. Its platform streamlines meter-to-cash processes, reducing friction from field operations to back-office tasks. Serving utilities across the Americas, SpryPoint offers solutions for the water, sewer, electric, and gas sectors. The platform includes an omnichannel customer portal and integrated payment processing. SpryPoint’s dedication to modernizing customer information systems (CIS) is reflected in its recognition as a 2025 Deloitte Technology Fast 500™ company in North America, highlighting its growth and innovation.
