Senior Software Engineer - Infrastructure Storage

Job Overview

Location: San Francisco Office (Fremont St)

Job Type: Full-time

Category: Software Engineering

Date Posted: June 26, 2026

Full Job Description

📋 Description

  • Design, develop, and maintain high-performance storage software for next-generation on-premises AI storage solutions, ensuring scalability, reliability, and performance across distributed systems.
  • Implement and optimize storage protocol APIs for file (NFS, SMB, Lustre), block (iSCSI, Fibre Channel), and object (S3, Swift) access to support diverse AI workloads.
  • Build distributed systems that orchestrate storage resources across redundant arrays and heterogeneous storage hardware, including NVMe and GPU-direct storage.
  • Collaborate closely with hardware and system architects to integrate software with storage hardware, ensuring seamless interoperability and maximum throughput for AI training and inference clusters.
  • Troubleshoot and resolve complex production issues in data center environments, focusing on storage latency, data consistency, and system resilience.
  • Contribute to the full software development lifecycle, from requirements gathering and system design to deployment, monitoring, and maintenance of storage services.
  • Develop software that enables petabyte-scale data access at the speeds required by clustered GPUs, ensuring low-latency, high-throughput data pipelines for AI researchers and enterprises.
  • Work within a team responsible for the foundational infrastructure of Lambda’s AI cloud, where storage performance directly impacts the efficiency of distributed AI training and inference.
  • Participate in code reviews, architectural discussions, and on-call rotations to ensure system stability and operational excellence in a 24/7 production environment.
  • Apply deep knowledge of Linux kernel internals and system-level programming to enhance storage subsystem performance and reliability.
  • Integrate storage solutions with containerized environments using Docker and Kubernetes, ensuring production-grade deployment and orchestration of storage services.
  • Follow CI/CD and QA practices tailored for distributed systems, ensuring robust testing and automated deployment pipelines for storage software.
  • Maintain and evolve storage infrastructure to support changing AI workloads, including multi-tenant security, intelligent routing, and dynamic resource allocation.
  • Ensure storage systems are aligned with Lambda’s mission to make compute as ubiquitous as electricity by enabling fast, reliable, and accessible data for every GPU in the cluster.

🎯 Requirements

  • Bachelor's or Master's degree in Computer Science or a related field
  • 5+ years of experience in software development for storage systems
  • Proven experience with distributed systems programming, including load balancing, data durability, consensus algorithms, fault tolerance, and data consistency
  • Strong programming skills in C, C++, Go, or Python
  • Deep understanding of storage protocols: NFS, SMB, Lustre (file); iSCSI, Fibre Channel (block); S3, Swift (object)
  • Experience with Linux kernel internals and system-level programming

🏖️ Benefits

  • Health, dental, and vision coverage for you and your dependents
  • Wellness and commuter stipends for select roles
  • 401(k) plan with 2% company match (USA employees)
  • Flexible paid time off plan that we all actually use

Skills & Technologies

Python
Swift
Docker
Kubernetes
Linux
DevOps
Senior
Onsite
Degree Required

About Lambda Inc.

Lambda Inc. provides cloud-based GPU clusters and workstations for artificial-intelligence research and development. The company designs and operates high-performance hardware infrastructure optimized for machine-learning workloads, offering on-demand access to NVIDIA GPUs, pre-configured deep-learning software stacks, and scalable storage. Customers include AI labs, universities, and enterprises training large language and computer-vision models. Founded in 2012, Lambda is headquartered in San Francisco and maintains data centers across North America and Europe.
