
Job Overview
Location: UK ONLY
Job Type: Full-time
Category: Software Engineering
Date Posted: February 16, 2026
Full Job Description
📋 Description
- Join Lyrebird Health as a Staff Site Reliability Engineer (SRE) and play a pivotal role in elevating the reliability, scalability, and security of our cutting-edge platform. This is a senior-level position offering significant impact, where you will be instrumental in designing, evolving, and maintaining the robust systems and best practices that ensure Lyrebird operates with exceptional speed, safety, and availability.
- Your responsibilities will span a broad spectrum of critical areas, including infrastructure management, application reliability engineering, comprehensive observability strategies, efficient incident response protocols, and empowering our engineering teams through platform enablement. You will foster close, collaborative partnerships with our Engineering, Security, and Product departments, ensuring a unified approach to operational excellence.
- This is far from a passive, maintenance-focused role. You will be a proactive driver of substantial improvements in how we architect, deploy, and operate our services in production environments. We empower our SREs with genuine autonomy and ownership, enabling you to make meaningful contributions and shape the future of our platform's operational integrity.
- At Lyrebird Health, we are dedicated to revolutionizing healthcare by automating the most time-consuming tasks for clinicians. Our platform is already trusted by thousands of healthcare professionals across a diverse range of disciplines, and our user base is expanding rapidly. The trust our users place in us to deliver a fast, reliable, and secure experience is paramount. We are committed to earning and maintaining this trust while continuously exceeding their expectations.
- As a Staff SRE, you will be at the forefront of ensuring this trust is upheld. You will contribute to the design and implementation of resilient infrastructure, leveraging cloud-native technologies and infrastructure-as-code principles to build scalable and cost-effective solutions. Your expertise will be crucial in defining and enforcing SLOs (Service Level Objectives) and SLIs (Service Level Indicators) to measure and maintain the health of our services.
- You will champion a culture of reliability by embedding SRE principles into the software development lifecycle. This includes collaborating with development teams to identify and mitigate potential risks, conducting thorough performance testing, and advocating for best practices in coding and system design that prioritize stability and maintainability.
- A significant part of your role will involve building and refining our observability stack. This means ensuring we have robust monitoring, logging, and tracing capabilities in place to provide deep insights into system behavior, enabling rapid detection and diagnosis of issues. You will be responsible for developing dashboards, alerts, and runbooks that empower our teams to understand and manage the platform effectively.
- Incident response will be a key area of focus. You will participate in on-call rotations, lead incident management efforts, conduct blameless post-mortems, and implement preventative measures to minimize future disruptions. Your ability to remain calm under pressure and drive swift, effective resolutions will be vital.
- You will also be involved in security initiatives, working alongside the security team to implement and maintain security best practices across our infrastructure and applications. This includes vulnerability management, access control, and ensuring compliance with relevant regulations.
- Furthermore, you will contribute to platform enablement, developing tools, automation, and documentation that empower other engineering teams to deploy and manage their services more efficiently and reliably. This could involve building CI/CD pipelines, creating self-service infrastructure tools, or developing internal platforms that streamline common operational tasks.
- We are looking for an individual who is passionate about automation, possesses a deep understanding of distributed systems, and thrives in a collaborative, fast-paced environment. Your ability to think critically, solve complex problems, and communicate effectively will be essential for success in this role.
- This is an opportunity to make a tangible difference in the healthcare industry, directly impacting the lives of clinicians and patients by ensuring the seamless operation of a critical technology platform. You will have the chance to work with a talented and dedicated team, tackle challenging technical problems, and grow your career in a supportive and innovative company.
- You will be a key voice in technical decision-making, influencing architectural choices and driving the adoption of new technologies and methodologies that enhance our platform's resilience and performance. Your strategic thinking and proactive approach will be highly valued.
- We believe in continuous learning and improvement, and you will be encouraged to explore new tools, techniques, and approaches to SRE, sharing your knowledge and insights with the wider team. Your contributions will help shape the future of SRE at Lyrebird Health.
- Ultimately, your success in this role will be measured by the tangible improvements you bring to our platform's reliability, scalability, and security, directly contributing to Lyrebird Health's mission of transforming healthcare delivery.
🎯 Requirements
- Proven experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role with a strong focus on system reliability and scalability.
- Deep understanding of cloud platforms (e.g., AWS, GCP, Azure) and experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Proficiency in at least one programming language (e.g., Python, Go, Java) for automation and tooling.
- Experience with containerization technologies (e.g., Docker, Kubernetes) and orchestration.
- Strong knowledge of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK stack, Datadog).
- Experience with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, CircleCI).
- Familiarity with database technologies (SQL and NoSQL) and their operational aspects.
- Excellent problem-solving and debugging skills, with a methodical approach to incident management.
- Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams.
- Experience with distributed systems and microservices architectures.
- Understanding of network protocols and security best practices.
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
🏖️ Benefits
- Competitive salary and equity package.
- Comprehensive health, dental, and vision insurance.
- Generous paid time off and holidays.
- Opportunities for professional development and continuous learning.
- Flexible working arrangements (UK ONLY).
- Collaborative and innovative work environment.
- Direct impact on transforming healthcare.
About Lyrebird Health
Lyrebird Health is a digital health company focused on improving the efficiency and accuracy of clinical documentation. It offers an AI-powered platform that listens to patient-clinician conversations and automatically generates clinical notes. The technology aims to reduce the administrative burden on healthcare professionals, allowing them to spend more time with patients and less time on paperwork. The platform integrates into existing workflows and provides real-time transcription and summarization. By applying speech recognition and natural language processing, Lyrebird Health seeks to enhance the quality of care and streamline healthcare operations.
