
Job Overview
Location: Canada (Remote)
Job Type: Full-time
Category: Software Engineer
Date Posted: March 16, 2026
Full Job Description
📋 Description
- Join Grafana Labs, a remote-first, open-source powerhouse, as a Senior Software Engineer on the Platform InfraCore squad. This is a unique opportunity to contribute to a globally recognized company with over 20 million users of its flagship visualization tool, Grafana, and a suite of observability products including Grafana Mimir, Loki, and Tempo.
- As part of the Internal Engineering Platform (IEP) team, you will play a critical role in building and maintaining the foundational infrastructure that empowers application engineers to develop, deploy, and run their workloads efficiently and reliably. This includes managing cloud infrastructure, capacity, security, engineering productivity, monitoring, and compliance.
- The Platform InfraCore squad specifically focuses on automating the provisioning of Cloud Service Provider (CSP) resources, such as networking, Kubernetes clusters, and other essential resources required by our internal application teams. You will be instrumental in scaling Grafana Cloud's ability to process hundreds of millions of metrics, log lines, and traces per second, ensuring high availability, low latency, and optimal performance.
- You will be working with a passionate team that values collaboration, transparency, autonomy, and trust. We believe in a holistic approach to development, where engineers own the full lifecycle of their code, from design documentation and developer feedback to integration testing and production operations. This role offers significant opportunities for learning and growth within a dynamic, innovation-driven environment.
- Your responsibilities will include the end-to-end management of Kubernetes clusters, encompassing their provisioning, lifecycle, scheduling, and autoscaling. You will also be responsible for managing cluster networking components, including load balancing, NAT, DNS, Container Network Interfaces (CNIs), network policies, private connectivity, and cross-cluster communication.
- A key aspect of this role involves maintaining Crossplane compositions and Terraform modules for common CSP resources, ensuring versioning and compatibility. You will collaborate closely with internal users, primarily Grafana Cloud application teams, to understand their evolving needs and ensure the platform effectively supports their development and operational requirements.
- As part of a distributed team, strong communication and collaboration skills are paramount. You will contribute to discussions, operate by consensus, and commit to team decisions. The role also involves participation in an on-call rotation to ensure the ongoing health and reliability of the platform, providing valuable insights into system usage and performance.
- You will contribute to upcoming projects focused on enhancing Kubernetes efficiency through considerations of processor architecture, CSP capacity, machine selection, and multi-AZ utilization. You will also drive automation for Kubernetes cluster lifecycles across multiple CSP regions and mature our Terraform delivery strategy to improve ease of use for internal teams.
- This position is ideal for an engineer who enjoys working with distributed systems, has a passion for performance and reliability, and thrives in a remote-first, collaborative culture. You will have the opportunity to work with cutting-edge technologies and contribute to the success of a leading open-source company.
- We encourage candidates to apply even if they don't meet every single requirement, as we value enthusiasm, a willingness to learn, and a strong cultural fit. This is a chance to take on a career-defining opportunity within a company deeply rooted in open-source values and committed to fostering a supportive and innovative work environment.
- You will be part of a team that actively uses and incorporates Grafana Labs' product suite into its daily operations, providing a unique perspective on system functionality and user experience. This hands-on experience is crucial for building better platforms and contributing to the continuous improvement of our observability solutions.
- The role requires a proactive approach to problem-solving, a keen eye for detail, and the ability to see the bigger picture. You will work primarily in Go, Python, and shell scripting, leveraging your expertise to build robust and scalable infrastructure solutions.
- We are looking for individuals who are comfortable operating their own code and understand the full development lifecycle, bridging the gap between development and operations to create superior platforms for our users.
- Embrace the opportunity to contribute to open-source projects, as OSS is deeply ingrained in Grafana Labs' culture. Your work will directly impact the efficiency, scalability, and reliability of Grafana Cloud, supporting millions of users worldwide.
About Raintank Inc.
Raintank Inc., operating as Grafana Labs, is the open-source company behind the Grafana observability platform. It develops and maintains Grafana dashboards, Loki for logs, Tempo for traces, Mimir for metrics, and Grafana Cloud services, providing scalable monitoring and analytics for DevOps, SRE, and engineering teams worldwide. Grafana Labs supports on-prem and SaaS deployments with enterprise-grade features and commercial support.