Software Engineer - Platform InfraCore | Canada | Remote

Job Overview

Location

Canada (Remote)

Job Type

Full-time

Category

Software Engineer

Date Posted

March 16, 2026

Full Job Description

📋 Description

  • Join Grafana Labs, a remote-first, open-source powerhouse, as a Senior Software Engineer on the Platform InfraCore squad. This is a unique opportunity to contribute to a globally recognized company with over 20 million users of its flagship visualization tool, Grafana, and a suite of observability products including Grafana Mimir, Loki, and Tempo.
  • As part of the Internal Engineering Platform (IEP) team, you will play a critical role in building and maintaining the foundational infrastructure that empowers application engineers to develop, deploy, and run their workloads efficiently and reliably. This includes managing cloud infrastructure, capacity, security, engineering productivity, monitoring, and compliance.
  • The Platform InfraCore squad specifically focuses on automating the provisioning of Cloud Service Provider (CSP) resources, such as networking, Kubernetes clusters, and other essential resources required by our internal application teams. You will be instrumental in scaling Grafana Cloud's ability to process hundreds of millions of metrics, log lines, and traces per second, ensuring high availability, low latency, and optimal performance.
  • You will be working with a passionate team that values collaboration, transparency, autonomy, and trust. We believe in a holistic approach to development, where engineers own the full lifecycle of their code, from design documentation and developer feedback to integration testing and production operations. This role offers significant opportunities for learning and growth within a dynamic, innovation-driven environment.
  • Your responsibilities will include the end-to-end management of Kubernetes clusters, encompassing their provisioning, lifecycle, scheduling, and autoscaling. You will also be responsible for managing cluster networking components, including load balancing, NAT, DNS, Container Network Interfaces (CNIs), network policies, private connectivity, and cross-cluster communication.
  • A key aspect of this role involves maintaining Crossplane compositions and Terraform modules for common CSP resources, ensuring versioning and compatibility. You will collaborate closely with internal users, primarily Grafana Cloud application teams, to understand their evolving needs and ensure the platform effectively supports their development and operational requirements.
  • As part of a distributed team, strong communication and collaboration skills are paramount. You will contribute to discussions, operate by consensus, and commit to team decisions. The role also involves participation in an on-call rotation to ensure the ongoing health and reliability of the platform, providing valuable insights into system usage and performance.
  • You will contribute to upcoming projects focused on enhancing Kubernetes efficiency through considerations of processor architecture, CSP capacity, machine selection, and multi-AZ utilization. You will also drive automation for Kubernetes cluster lifecycles across multiple CSP regions and mature our Terraform delivery strategy to improve ease of use for internal teams.
  • This position is ideal for an engineer who enjoys working with distributed systems, has a passion for performance and reliability, and thrives in a remote-first, collaborative culture. You will have the opportunity to work with cutting-edge technologies and contribute to the success of a leading open-source company.
  • We encourage candidates to apply even if they don't meet every single requirement, as we value enthusiasm, a willingness to learn, and a strong cultural fit. This is a chance to take on a career-defining opportunity within a company deeply rooted in open-source values and committed to fostering a supportive and innovative work environment.
  • You will be part of a team that actively uses and incorporates Grafana Labs' product suite into its daily operations, providing a unique perspective on system functionality and user experience. This hands-on experience is crucial for building better platforms and contributing to the continuous improvement of our observability solutions.
  • The role requires a proactive approach to problem-solving, a keen eye for detail, and the ability to see the bigger picture. You will be working primarily with Go, Python, and shell scripting, leveraging your expertise to build robust and scalable infrastructure solutions.
  • We are looking for individuals who are comfortable operating their code and understand the full development lifecycle, bridging the gap between development and operations to create superior platforms for our users.
  • Embrace the opportunity to contribute to open-source projects, as OSS is deeply ingrained in Grafana Labs' culture. Your work will directly impact the efficiency, scalability, and reliability of Grafana Cloud, supporting millions of users worldwide.

Skills & Technologies

Python
AWS
Azure
GCP
Kubernetes
Remote

About Raintank Inc.

Raintank Inc., operating as Grafana Labs, is the open-source company behind the Grafana observability platform. It develops and maintains Grafana dashboards, Loki for logs, Tempo for traces, Mimir for metrics, and Grafana Cloud services, providing scalable monitoring and analytics for DevOps, SRE, and engineering teams worldwide. Grafana Labs supports on-prem and SaaS deployments with enterprise-grade features and commercial support.
