Staff Storage Systems Architect

Job Overview

Location

San Jose Office (Zanker)

Job Type

Full-time

Category

Backend Engineer

Date Posted

March 24, 2026

Full Job Description

📋 Description

  • As a Staff Storage Systems Architect at Lambda Inc., you will play a critical role in shaping the foundation of the company’s AI cloud infrastructure by designing and implementing scalable, high-performance storage systems that power AI workloads for researchers, enterprises, and hyperscalers worldwide.
  • Your work will directly enable Lambda’s mission to make compute as ubiquitous as electricity by ensuring that storage solutions are reliable, efficient, and optimized for the demanding performance needs of modern AI applications.
  • Day to day, you will architect, design, and implement distributed storage solutions tailored for AI workloads, leveraging technologies like Ceph, Lustre, and other enterprise-grade systems to meet stringent performance, reliability, and scalability requirements.
  • You will drive the development of storage system standards and best practices across the organization, ensuring consistency, maintainability, and operational excellence in all storage deployments.
  • You will evaluate and benchmark emerging storage technologies against Lambda’s AI performance demands, providing data-driven recommendations for technology adoption and infrastructure evolution.
  • You will collaborate closely with engineering, product, and infrastructure operations teams to integrate storage solutions seamlessly into Lambda’s cloud product offerings, aligning technical execution with business objectives.
  • You will define and maintain comprehensive storage capacity plans, performance metrics, and scalability roadmaps, forecasting future needs and guiding long-term infrastructure investment.
  • You will lead efforts to ensure high availability, disaster recovery, and data integrity across storage systems, implementing robust redundancy, failover, and data protection strategies.
  • You will provide technical mentorship and leadership to storage and infrastructure engineers, fostering a culture of excellence, knowledge sharing, and continuous improvement within the team.
  • You will work in a hybrid environment at Lambda’s San Jose office (Zanker) with four days of in-office presence per week, collaborating directly with cross-functional teams in a fast-paced, innovation-driven setting.
  • About the team and company: Lambda Inc. is a leader in AI cloud infrastructure, serving tens of thousands of customers from AI researchers to hyperscalers, backed by prominent investors including NVIDIA, ARK Invest, and In-Q-Tel, and recognized for research excellence at top conferences like NeurIPS and SIGGRAPH.
  • The company values innovation, technical depth, and impact, offering a culture where engineers are empowered to solve hard problems at the intersection of systems, storage, and AI.
  • In this role, you will have the opportunity to architect storage systems at unprecedented scale for AI workloads, gaining deep expertise in the storage challenges unique to modern AI infrastructure and positioning yourself as a technical leader in the AI infrastructure space.
  • You will achieve measurable impact by directly influencing the performance, reliability, and cost-efficiency of Lambda’s storage platform, enabling faster model training, lower latency inference, and greater accessibility to superintelligence-powered compute.

🎯 Requirements

  • Proven experience (7+ years) designing and implementing distributed storage systems in production environments.
  • Deep expertise with distributed storage technologies such as Ceph, Lustre, or comparable enterprise-scale systems.
  • Strong understanding of storage architectures, including data replication, redundancy strategies, and performance optimization techniques for high-throughput, low-latency workloads.
  • Experience working with high-performance and high-availability storage solutions in demanding environments.
  • Familiarity with object, block, and file storage protocols and their appropriate use cases.
  • Ability to identify, analyze, and resolve complex technical issues across hardware, software, and network layers.
  • Excellent communication skills, with the ability to clearly articulate technical concepts to both technical and non-technical stakeholders.

🏖️ Benefits

  • Generous cash and equity compensation package tied to performance and market benchmarks.
  • Comprehensive health, dental, and vision coverage for employees and their dependents.
  • Wellness and commuter stipends for eligible roles to support employee well-being and sustainable transportation.
  • 401(k) plan with 2% company match for U.S.-based employees.
  • Flexible paid time off plan that encourages actual usage and work-life balance.
  • Opportunity to work on cutting-edge AI infrastructure with access to advanced hardware and research-driven technology stacks.

Skills & Technologies

AWS
Azure
GCP
Senior
Onsite

About Lambda Inc.

Lambda Inc. provides cloud-based GPU clusters and workstations for artificial-intelligence research and development. The company designs and operates high-performance hardware infrastructure optimized for machine-learning workloads, offering on-demand access to NVIDIA GPUs, pre-configured deep-learning software stacks, and scalable storage. Customers include AI labs, universities, and enterprises training large language and computer-vision models. Founded in 2012, Lambda is headquartered in San Francisco and maintains data centers across North America and Europe.
