
Job Overview
Location: Santa Clara
Job Type: Full-time
Category: Software Engineer
Date Posted: May 16, 2026
Full Job Description
📋 Description
- Develop, enhance, and maintain software kernels for next-generation AI compute hardware, ensuring optimal performance and integration with proprietary AI architectures.
- Map computational graphs from AI frameworks (e.g., TensorFlow, PyTorch) to the underlying hardware, translating high-level operations into efficient low-level implementations.
- Collaborate with compiler experts to design and build compiler infrastructure that bridges high-level ML models with custom hardware instruction sets.
- Implement and optimize core machine learning operators including GEMMs, convolutions, BLAS, softmax, layer normalization, pooling, and SIMD-based operations for specialized accelerators.
- Leverage C/C++ and Python in Linux environments to develop high-performance, low-latency software components tailored for embedded SIMD vector processors like Tensilica.
- Work closely with hardware teams (mixed signal, DSP, CPU) to co-design hardware-software solutions, balancing trade-offs in power, throughput, memory bandwidth, and latency.
- Optimize software deliverables within tight development timelines, ensuring scalability and reliability across diverse AI workloads.
- Integrate and validate kernel implementations using industry-standard tools and debugging methodologies in embedded and accelerated computing environments.
- Participate in full-stack toolchain development, from algorithm design to deployment, ensuring end-to-end functionality across ML frameworks, compilers, and hardware targets.
- Contribute to the productization of the software stack by refining APIs, documentation, and testing protocols for internal and external adoption.
- Translate research-grade algorithms into production-ready code, addressing real-world constraints such as memory footprint, data movement, and hardware-specific quirks.
- Drive technical ownership of key software modules, mentoring junior engineers and leading technical design reviews with cross-functional teams.
- Maintain deep awareness of advancements in AI hardware, ML compilers, and accelerator technologies to continuously improve kernel efficiency and feature sets.
- Engage in daily collaboration with ML engineers, systems engineers, and hardware designers to align software capabilities with evolving hardware specifications and performance goals.
🎯 Requirements
- MS in computer engineering, math, physics, or a related field with 5+ years of industry experience, OR PhD in computer engineering, math, physics, or a related field with 1+ years of industry experience
- Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals
- Proficient in C/C++ and Python development in Linux environments using standard development tools
- Experience implementing algorithms for specialized hardware such as FPGAs, DSPs, GPUs, and AI accelerators using platforms such as CUDA
- Experience implementing ML operators (GEMMs, convolutions, BLAS, SIMD ops like softmax, layer norm, pooling)
- Experience with embedded SIMD vector processors such as Tensilica
🏖️ Benefits
- Hybrid work model with 3+ days per week onsite at Santa Clara, CA headquarters
- Equal opportunity workplace with inclusive, collaborative culture
- Opportunity to work at the forefront of generative AI hardware and software innovation
- Direct communication environment valuing humility and team-driven execution
About d-Matrix Corporation
d-Matrix designs silicon for high-efficiency AI inference at scale. Its Corsair compute platform combines in-memory computing with a digital approach to slash latency and energy use in transformer and generative workloads. Targeting hyperscale data centers and edge deployments, the company offers hardware and software stacks that integrate into existing AI pipelines. Founded in 2019 and headquartered in Santa Clara, California, d-Matrix serves cloud and enterprise customers seeking cost-effective alternatives to GPUs for large language model serving.