
Job Overview
Location: Kansas City, MO - Data Center
Job Type: Full-time
Category: DevOps & SysAdmin
Date Posted: March 23, 2026
Full Job Description
📋 Description
- As a Data Center Operations System Engineer at Lambda Inc., you will play a critical role in maintaining and scaling the physical infrastructure that powers one of the world’s leading AI cloud platforms, ensuring reliable, high-performance compute for AI researchers, enterprises, and hyperscalers globally.
- Your work directly supports Lambda’s mission to make superintelligence accessible by enabling the seamless deployment, operation, and optimization of GPU-dense infrastructure in a mission-critical data center environment.
- You will rack, label, cable, and configure new server, storage, and network infrastructure with precision, adhering to strict installation standards to ensure consistency, safety, and operational efficiency across all Lambda data centers.
- You will diagnose and resolve complex hardware and software issues in cutting-edge GPU and networking systems, including NVIDIA NVL72 architectures, using systematic troubleshooting methodologies to minimize downtime and maintain peak performance.
- You will keep data center documentation accurate and up to date by updating layout diagrams, network topologies, and asset records in DCIM (Data Center Infrastructure Management) software, enabling visibility and informed decision-making across teams.
- You will collaborate closely with supply chain, manufacturing, and hardware support teams to coordinate the end-to-end lifecycle of equipment, from receipt and staging to deployment and handoff, ensuring timely execution of large-scale infrastructure projects.
- You will manage the parts depot inventory, tracking components through the full lifecycle (delivery, storage, staging, deployment, and handoff) while maintaining optimal stock levels to support rapid response and minimize delays.
- You will partner with RMA (Return Merchandise Authorization) teams to process faulty hardware returns, coordinate replacements, and ensure timely resolution of equipment failures to sustain operational continuity.
- You will apply deep knowledge of power distribution systems, including single- and three-phase power theory, PDU balancing, and electrical load management, to prevent overloads and ensure stable power delivery to dense GPU racks.
- You will implement and maintain effective airflow management strategies, including hot and cold aisle containment, to optimize cooling efficiency and support the thermal demands of high-density compute environments.
- You will work with carrier DIA (Dedicated Internet Access) circuits, performing fiber testing, troubleshooting, and turn-up procedures to ensure reliable external connectivity for data center operations.
- You will develop, refine, and execute Maintenance Operating Procedures (MOPs) for complex infrastructure tasks, collaborating with cross-functional teams to iteratively improve processes, ensure safety, and align operational capabilities with business objectives.
- You will mentor and train junior technicians on best practices in data center operations, fostering a culture of knowledge sharing, accountability, and technical excellence.
- You will travel as needed to support the bring-up of new data center locations, contributing your expertise to Lambda’s geographic expansion and infrastructure scaling initiatives.
- You will work within a 24/7 shift environment and be on-site five days per week at the Kansas City, MO data center, ensuring continuous monitoring and rapid response to infrastructure needs.
🎯 Requirements
- Proven experience with critical data center infrastructure systems, including power distribution, environmental monitoring, capacity planning, DCIM software, structured cabling, and cable management.
- Solid understanding of single- and three-phase power theory, PDU balancing, and electrical load management principles.
- Familiarity with fiber optic cable types, carrier DIA circuit testing and turn-up procedures, and hot/cold aisle containment strategies for thermal management.
🏖️ Benefits
- Competitive salary; as a salaried non-exempt role, the position is eligible for overtime pay.
- Comprehensive health, dental, and vision coverage for employees and dependents.
- 401(k) plan with 2% company match for U.S.-based employees.
- Flexible paid time off plan designed to support work-life balance and employee well-being.
About Lambda Inc.
Lambda Inc. provides cloud-based GPU clusters and workstations for artificial-intelligence research and development. The company designs and operates high-performance hardware infrastructure optimized for machine-learning workloads, offering on-demand access to NVIDIA GPUs, pre-configured deep-learning software stacks, and scalable storage. Customers include AI labs, universities, and enterprises training large language and computer-vision models. Founded in 2012, Lambda is headquartered in San Francisco and maintains data centers across North America and Europe.

