
Job Overview
Location
Cambridge
Job Type
Full-time
Category
Software Engineering
Date Posted
April 24, 2026
Full Job Description
📋 Description
- • Design, implement, and optimize custom CUDA kernels for NVIDIA GPUs to maximize occupancy, memory throughput, and warp efficiency for high-throughput AI systems serving Fortune 500 customers.
- • Profile GPU workloads using Nsight Compute, Nsight Systems, nvprof, and CUDA-MEMCHECK to analyze and eliminate performance bottlenecks such as warp divergence, uncoalesced memory access, register pressure, and PCIe transfer overhead.
- • Collaborate with AI systems, model acceleration, and backend distributed systems teams to improve GPU memory pipelines and contribute to kernel libraries and performance-engineering best practices.
- • Work on the GPU performance layer of a fast-growing AI startup founded by MIT CSAIL researchers, directly influencing the scalability and efficiency of mission-critical AI systems at scale.
Skills & Technologies
About Pragmatike Soluciones Tecnológicas S.L.
Spanish technology firm founded in 2014, delivering custom software, mobile apps, cloud migration, and data analytics. Combines agile development, AI, and DevOps practices to serve finance, healthcare, retail, and public sectors across Europe and Latin America. Core services include UX/UI design, QA automation, and 24/7 managed support, with ISO 27001-certified processes and multilingual teams in Madrid, Barcelona, and remote hubs.
Subscribe to the weekly newsletter for similar remote roles and curated hiring updates.
Newsletter
Weekly remote jobs and featured talent.
No spam. Only curated remote roles and product updates. You can unsubscribe anytime.
Similar Opportunities

Workato, Inc.
3 days ago

Aquia Inc.
7 months ago

