This job has expired

This position was posted on March 21, 2026 and is likely no longer accepting applications. We've kept it here for historical reference.

AI Platform Engineer

Fastino AI Inc.

Job Overview

Location

Remote

Job Type

Full-time

Full Job Description

📋 Description

• As an AI Platform Engineer at Fastino AI Inc., you will play a pivotal role in building the foundational infrastructure that transforms cutting-edge AI models into scalable, production-ready systems. Your work will directly enable the deployment and optimization of Fastino’s GLiNER family of models—used by industry leaders like NVIDIA, Meta, and Airbnb—ensuring they are efficient, reliable, and accessible at scale. This is a high-impact, systems-focused role where you will own the end-to-end ML platform, from data ingestion to inference, supporting the company’s mission to develop specialized, efficient AI.
• You will design and implement robust, scalable systems for training, fine-tuning, reinforcement learning, and inference, working closely with research and engineering teams to turn experimental models into reliable production services. Your contributions will improve GPU utilization, reduce training costs, enhance reproducibility, and accelerate model iteration cycles—critical factors in advancing Fastino’s competitive edge in the AI landscape.
• You will join a distinguished team of AI researchers and engineers with backgrounds from Google Research, Apple, Stanford, and Cambridge, operating in a remote-first, UK-based environment with periodic trips to the Silicon Valley office. Backed by $25M in funding from top-tier investors including Microsoft, Khosla Ventures, and Insight Partners, Fastino is positioned at the forefront of efficient AI innovation, offering a unique opportunity to shape the infrastructure behind next-generation open-source models.
• In this role, you will deepen your expertise in large-scale ML systems, distributed computing, and ML infrastructure while gaining hands-on experience with state-of-the-art techniques in transformer optimization, RLHF, model compression, and reproducible ML workflows. You will have the autonomy to architect solutions that balance performance, cost, and scalability, positioning yourself as a key contributor to both Fastino’s technical success and the broader open-source AI ecosystem.
• Architect distributed fine-tuning pipelines for small encoder and decoder models, ensuring efficient resource utilization and seamless integration with training workflows.
• Implement LoRA, adapters, distillation, and compression workflows to reduce model size and inference latency without sacrificing performance.
• Design experiment tracking, reproducibility, and dataset versioning systems to enable reliable, auditable ML development cycles.
• Optimize training efficiency by improving GPU utilization, memory management, throughput, and cost-effectiveness across distributed workloads.
• Design scalable RL training workflows, including policy optimization and reward modeling, and integrate them with supervised fine-tuning and distillation pipelines.
• Build evaluation loops and automated regression detection mechanisms to maintain model quality and catch performance degradation early.
• Develop scalable ingestion pipelines for structured and unstructured data, supporting diverse data formats and sources.
• Design dataset curation, filtering, and quality enforcement systems to ensure high-quality, consistent inputs for training.
• Implement reproducible data workflows that are tightly coupled with training runs, enabling full traceability from data to model.
• Architect low-latency inference services with scalable backend architecture to support real-time and batch deployment scenarios.
• Design safe production deployment workflows, including monitoring, rollback, and validation processes, to ensure system reliability in production.

🎯 Requirements

• Deep experience with PyTorch and transformer architectures, including hands-on implementation of model training and fine-tuning pipelines.
• Proven experience building production ML systems end-to-end, from data ingestion to model serving in scalable environments.
• Strong background in distributed training and inference, including experience with multi-GPU and multi-node setups.
• Demonstrated expertise in optimizing GPU workloads for memory, throughput, and cost efficiency.
• Solid foundation in backend and systems engineering, with proficiency in containerization (Docker) and orchestration (Kubernetes or similar).
• Practical experience with cloud infrastructure platforms such as AWS, GCP, Modal, or Together.ai.

🏖️ Benefits

• Fully remote position based in the UK with opportunities for periodic travel to the Silicon Valley office to collaborate with the core team.
• Competitive compensation package aligned with top-tier AI startups, including equity participation in a fast-growing, well-funded company.
• Access to cutting-edge AI research and development, working alongside alumni from Google Research, Apple, Stanford, and Cambridge.
• Professional growth opportunities in ML infrastructure, distributed systems, and production ML engineering at the forefront of efficient AI innovation.
• Health, wellness, and flexible working benefits designed to support long-term productivity and well-being in a remote-first environment.

Skills & Technologies

AWS

GCP

Docker

GitHub

PyTorch

Remote


About Fastino AI Inc.

Fastino AI is a technology company focused on developing advanced artificial intelligence solutions. Their core business revolves around creating and deploying AI-powered tools and platforms designed to automate complex processes and enhance decision-making across various industries. They specialize in areas such as machine learning, natural language processing, and computer vision, aiming to provide businesses with innovative ways to leverage data and improve operational efficiency. Fastino AI's offerings cater to clients seeking to integrate cutting-edge AI capabilities into their existing workflows, driving digital transformation and competitive advantage.
