
Job Overview
Location: Remote
Job Type: Full-time
Category: Machine Learning Engineer
Date Posted: February 25, 2026
Full Job Description
📋 Description
- Join Poolside AI, a pioneering company at the forefront of Artificial General Intelligence (AGI) development, and play a pivotal role in accelerating the creation of foundational models for source code generation. We are on a mission to reshape the future of software development by building intelligent agentic systems and cutting-edge coding assistants, powered by frontier LLMs, and deploying them directly into the development environments of security-conscious enterprises.
- As a Member of Engineering on our Pre-training team, you will be instrumental in building and optimizing the distributed training of Large Language Models (LLMs) at an unprecedented scale. This is a deeply hands-on role where your expertise in custom kernel development will directly impact the speed and efficiency of our large-scale training runs, with access to thousands of GPUs to validate and refine your innovations.
- Your primary mission will be to significantly accelerate the training of the world's most advanced foundational models for source code generation. This involves a multifaceted approach to performance optimization, tackling both computational and communication bottlenecks inherent in training massive models across distributed hardware.
- You will be responsible for profiling large-scale training workloads to pinpoint areas for improvement. This requires a deep understanding of performance analysis tools and methodologies to identify where computation or inter-GPU communication is slowing down the training process.
- A core aspect of your role will be custom kernel development. You will write highly optimized code, likely in CUDA, to enhance specific parts of the training pipeline, pushing the boundaries of what's possible on GPU hardware. This could involve optimizing matrix multiplications, attention mechanisms, or other critical operations within the Transformer architecture.
- You will collaborate closely with our research team, translating novel research ideas into efficiently scalable training implementations. This requires bridging the gap between theoretical advancements and practical, high-performance engineering, ensuring that groundbreaking research can be effectively deployed at scale.
- You will contribute to the enhancement and maintenance of our sophisticated training and inference codebases. This includes refactoring existing code for better performance, readability, and maintainability, as well as introducing new features and optimizations.
- The role demands high-quality code development in Python (with PyTorch or JAX), Cython, C/C++, and CUDA. You will be expected to write clean, efficient, and well-documented code, adhering to modern software engineering best practices.
- You will actively participate in the team's collaborative process, contributing to planning future development steps, engaging in technical discussions, and maintaining open communication channels to ensure alignment and collective progress.
- This position offers a unique opportunity to work with state-of-the-art infrastructure and contribute to a project with the potential to define the future of AI. You will be challenged to think critically, question existing approaches, and continuously seek out better ways to optimize performance and scalability.
- We foster a culture of rapid learning and encourage stepping outside comfort zones. The steep learning curve associated with cutting-edge AI development is embraced as an opportunity for growth, supported by a team of low-ego, kind-hearted individuals dedicated to our shared mission.
- Your work will directly contribute to building the intelligence systems that will power the next generation of software development, making economically valuable work and scientific progress more accessible through AI.
- We are building the company that will achieve AGI, and this role is critical to our success in scaling our training capabilities to meet that ambitious goal. You will be part of a fast-moving environment where stacking advantages and pulling ahead are paramount.
🎯 Requirements
- Deep understanding of Large Language Models (LLMs), including foundational knowledge of the Transformer architecture.
- Proven experience with distributed training methodologies for deep learning models.
- Strong background and hands-on experience in CUDA and GPU programming, including development with libraries like NCCL, CUTLASS, and cuBLAS, and an understanding of hardware interconnects such as NVLink and NVSwitch.
- Robust software engineering skills, including proficiency in Linux environments, strong algorithmic understanding, and extensive programming experience in Python (PyTorch or JAX), Cython, and C/C++.
🏖️ Benefits
- Fully remote work arrangement with flexible working hours to accommodate different time zones (EMEA/East Coast).
- Generous annual leave allowance of 37 days, encompassing vacation and public holidays.
- Comprehensive health insurance coverage provided for both the employee and their dependents.
- Provision of high-quality, company-funded equipment necessary for your role.
- Dedicated allowances for well-being, continuous learning, and home office setup to support a healthy and productive work environment.
- Regular in-person team gatherings and off-sites to foster collaboration and camaraderie.
- A diverse, inclusive, and people-first company culture that values collaboration, intellectual curiosity, and mutual respect.
Skills & Technologies
Python
Linux
PyTorch
Remote
About Poolside AI, Inc.
Poolside AI develops and operates a cloud-based platform that turns natural-language prompts into functioning software. Using large-scale language models trained on public and proprietary code, the system autonomously writes, tests, and refines programs, enabling users to create applications, scripts, and data workflows without traditional coding. Founded in 2023 and headquartered in Paris with offices in New York, the company serves individual developers, startups, and enterprise teams seeking faster, more accessible software creation.



