
Job Overview
Location
San Jose Office (Zanker)
Job Type
Full-time
Category
Software Engineering
Date Posted
March 25, 2026
Full Job Description
📋 Description
- Lambda Inc. is at the forefront of AI cloud infrastructure, providing the computational power necessary for groundbreaking AI research and enterprise solutions. As a leader in this rapidly evolving field, Lambda is dedicated to making superintelligence accessible and aims to make compute as ubiquitous as electricity. This role is critical in building and scaling the high-performance storage infrastructure that underpins our AI cloud offering, ensuring that our tens of thousands of customers have the seamless and powerful resources they need to innovate.
- As a Storage Engineering Manager, you will be instrumental in shaping the future of AI infrastructure by leading a team of talented storage engineers. Your day-to-day responsibilities will encompass the full lifecycle of high-performance, petabyte-scale storage solutions across multiple datacenters. This includes hiring, mentoring, and guiding your team to build, deploy, and maintain the storage infrastructure that powers Lambda's AI/ML products. You will foster a culture of innovation and technical excellence, driving project priorities, deadlines, and deliverables using Agile methodologies. Your leadership will extend to technical strategy, where you will define the vision for Lambda's distributed storage solutions, manage vendor selection criteria and relationships, and oversee the lifecycle management of storage hardware and software, including upgrades, service, and RMAs. You will also guide your team in optimizing storage pools, sharding, and tiering/caching strategies, and ensure robust multi-tenant security, provisioning, and metering integration. Furthermore, you will lead efforts in problem identification, requirements gathering, solution ideation, and stakeholder alignment on engineering RFCs, while also directly supporting customers with their storage needs.
- Your role will involve extensive cross-functional collaboration. You will partner closely with the HPC Architecture team to make critical decisions on drive selection, capacity planning, storage networking, cache placement, and rack layouts. Close coordination with storage software and networking teams is essential for executing cross-functional infrastructure initiatives and deploying new data centers, ensuring seamless integration of storage protocols across various on-prem solutions. You will also work hand-in-hand with procurement, data center operations, and fleet engineering teams to deploy storage solutions efficiently into both new and existing facilities. Troubleshooting customer performance, reliability, and data-integrity issues will require close collaboration with vendors. Additionally, you will work with Networking, Compute, and Storage Software Engineering teams to ensure the high-performance distributed storage solutions effectively serve AI/ML workloads, and partner with the fleet engineering team for seamless deployment, monitoring, and maintenance.
- This position offers a unique opportunity to work at the cutting edge of large-scale distributed systems and the rapidly advancing field of artificial intelligence infrastructure. You will gain invaluable experience in managing and scaling petabyte-scale storage solutions, contributing directly to the foundational infrastructure that powers some of the world's most advanced AI research and products. You will have the chance to influence technical strategy, lead vendor relationships, and drive innovation in storage technologies. The role provides a platform to deepen your expertise in high-performance computing storage, distributed systems, and AI infrastructure, while developing your leadership and people management skills in a fast-paced, high-growth environment. You will be at the forefront of exploring new technologies, such as NVIDIA SuperNIC DPUs for storage edge-caching and GPUDirect Storage capabilities, and will help uncover new trends in AI inference and training products that will shape future storage solutions.
- The Lambda Infrastructure Engineering organization is responsible for forging the foundation of high-performance AI clusters by integrating the latest in AI storage, networking, GPU, and CPU hardware. Our expertise lies at the intersection of high-performance distributed storage solutions and protocols, dynamic networking, and compute virtualization. We engineer the systems that serve massive datasets at the speeds demanded by modern clustered GPUs, design advanced networks for multi-tenant security and intelligent routing, and enable cutting-edge virtualization for AI researchers and engineers. This team is crucial for enabling groundbreaking AI training and inference, where high-performance networking and storage are as critical as raw compute power.
- This role is an exceptional chance to make a significant impact on the future of AI by building and managing the core storage infrastructure that enables it. You will be a key technical and operational authority, ensuring our petabyte-scale storage solutions are not only high-performing but also reliable, scalable, and manageable as Lambda grows towards exascale computing. Your leadership will directly contribute to Lambda's mission of making compute as ubiquitous as electricity and empowering everyone with the power of superintelligence, one GPU at a time.
About Lambda Inc.
Lambda Inc. provides cloud-based GPU clusters and workstations for artificial-intelligence research and development. The company designs and operates high-performance hardware infrastructure optimized for machine-learning workloads, offering on-demand access to NVIDIA GPUs, pre-configured deep-learning software stacks, and scalable storage. Customers include AI labs, universities, and enterprises training large language and computer-vision models. Founded in 2012, Lambda is headquartered in San Francisco and maintains data centers across North America and Europe.



