
Job Overview
Location: Tokyo
Job Type: Full-time
Category: Product Management
Date Posted: June 3, 2026
Full Job Description
📋 Description
- Design, build, and maintain scalable, resilient infrastructure that supports Kraken’s AI-driven energy management platform serving millions of customers globally.
- Ensure high availability, performance, and scalability of core platform products by implementing robust monitoring, alerting, and incident response systems.
- Collaborate with product and engineering teams to identify reliability bottlenecks and implement proactive solutions that prevent system degradation and outages.
- Architect and automate infrastructure components using modern DevOps practices, including CI/CD pipelines, infrastructure-as-code, and configuration management tools.
- Optimize system performance through capacity planning, load testing, and performance tuning across distributed systems and cloud-native environments.
- Lead post-mortem analyses for incidents, document root causes, and drive implementation of preventive measures to reduce recurrence.
- Integrate reliability best practices into the product development lifecycle, including defining SLOs, SLIs, and error budgets for critical services.
- Work closely with global teams to standardize reliability practices across regions and ensure consistent platform behavior under varying operational conditions.
- Maintain and improve observability tooling including logging, metrics collection, tracing, and dashboards to provide actionable insights into system health.
- Contribute to the evolution of Kraken’s Customer Information System (CIS), billing, meter data management, CRM, and AI-driven communication modules by ensuring their operational reliability.
- Stay current with industry trends in platform reliability, cloud architecture, and energy technology innovations to continuously elevate platform standards.
- Communicate complex technical issues clearly to both technical and non-technical stakeholders to align on reliability priorities and impact.
- Participate in on-call rotations to respond to critical system incidents, ensuring rapid resolution and minimal customer impact.
- Champion a culture of ownership and continuous improvement across the Product Reliability team by mentoring engineers and sharing knowledge.
- Support the transition from legacy systems to modern, scalable architectures while maintaining service continuity and data integrity.
- Apply deep understanding of distributed systems, microservices, and cloud platforms to enhance fault tolerance and reduce mean time to recovery (MTTR).
🎯 Requirements
- Proven experience as a Platform Engineer, Site Reliability Engineer, or similar role in a high-scale, production environment
- Strong proficiency in cloud platforms (AWS, Azure, or GCP) and containerization technologies (Docker, Kubernetes)
- Hands-on experience with infrastructure-as-code tools (Terraform, Ansible, or similar)
- Expertise in monitoring and observability tools (Prometheus, Grafana, Datadog, ELK stack, or equivalent)
- Solid understanding of distributed systems, microservices architecture, and network protocols
- Experience defining and managing SLOs/SLIs and implementing error budget policies
🏖️ Benefits
- Opportunity to work on cutting-edge technology shaping the future of global energy systems
- Flexible work arrangements within Japan (Tokyo-based or remote)
- Collaborative, mission-driven culture focused on sustainability and innovation
- Exposure to international teams and global impact in the energy technology sector
About Kraken Technologies Limited
Kraken builds the Kraken platform, a cloud-native, AI-powered operating system for utilities and energy companies. The platform automates the energy supply chain (customer lifecycle, billing/CRM, trading and asset optimisation, migration from legacy systems) to speed up product innovation, lower operating costs, and support distributed and renewable energy use.