
Job Overview
Location: London, United Kingdom
Job Type: Full-time
Category: Engineering
Date Posted: April 14, 2026
Full Job Description
📋 Description
- The Senior ML Infrastructure Engineer will extend and operate the infrastructure that powers PhysicsX’s research model training, fine-tuning, and serving pipelines, enabling engineers to train Large Physics Models efficiently and reliably at scale.
- Day to day, the role involves designing and operating distributed training infrastructure on NVIDIA DGX B200 platforms, optimizing data I/O for large-scale mesh datasets, building model serving systems with uncertainty quantification, and improving developer experience through CI/CD, debugging tools, and experiment tracking.
- The role is vertically embedded within the Research team, collaborating daily with Research Scientists, ML Engineers, and Simulation Data Engineers, while also being part of a horizontal infrastructure engineering group responsible for company-wide infrastructure patterns and standards.
- The person in this role will gain deep expertise in scaling neural operator architectures, optimizing HPC-grade ML pipelines, and deploying reproducible AI systems in advanced engineering industries such as Aerospace, Energy, and Semiconductors, with direct impact on hardware innovation at the speed of software.
🎯 Requirements
- 5+ years of experience building and operating ML infrastructure at scale, including deep expertise in distributed training (debugging NCCL hangs, optimizing collective communication, FSDP/DDP/pipeline parallelism)
- Strong systems fundamentals: Linux, networking (NVLink, InfiniBand), storage I/O, profiling, and performance optimization
- Production experience with Kubernetes and SLURM for job orchestration on GPU clusters, proficiency in Python and PyTorch, and experience with cloud GPU infrastructure (CoreWeave or similar)
🏖️ Benefits
- Equity options – share in our success and growth
- 10% employer pension contribution – invest in your future
- Free office lunches – great food to fuel your workdays
- Flexible and hybrid working – balance work and life with remote flexibility and access to the Shoreditch office
- Enhanced parental leave, private healthcare, personal development budget, and work-from-anywhere options
About PhysicsX Limited
PhysicsX Limited accelerates industrial innovation by deploying AI to transform physical systems engineering across the entire product lifecycle. Its platform helps enterprises in critical sectors such as Semiconductors, Aerospace & Defense, and Energy & Renewables rapidly develop and scale AI tools. It combines multiphysics inference with numerical simulation to optimize products and address global priorities such as the climate transition. PhysicsX cultivates innovation through a diverse, globally distributed team. Notably, the company partnered with Deutsche Telekom and NVIDIA to deliver sovereign AI infrastructure for Europe.
