This job has expired

This position was posted on February 17, 2026 and is likely no longer accepting applications. We've kept it here for historical reference.

Data Research Engineer

Fundamental

Job Overview

Location

Remote

Job Type

Full-time

Full Job Description

📋 Description

Are you a highly skilled engineer with a passion for data and a knack for building robust systems? Do you thrive in environments where cutting-edge research meets practical application? Fundamental, a pioneering AI company founded by DeepMind alumni, is seeking a talented Data Research Engineer to join our groundbreaking team. We are at the forefront of developing NEXUS, the world's most powerful Large Tabular Model (LTM), specifically designed to revolutionize enterprise decision-making. Backed by top-tier investors and trusted by Fortune 100 companies, Fundamental is unlocking immense value by empowering businesses with the Power to Predict.

As a Data Research Engineer at Fundamental, you will play a pivotal role in the development of breakthrough machine learning models. Your primary focus will be on the critical domain of data – the very foundation upon which our advanced AI systems are built. This is a unique opportunity to contribute to unprecedented technical challenges in foundation model development and shape the future of enterprise AI from the ground up. You will be instrumental in ensuring our training pipelines are not only reliable and efficient but also leverage the highest quality data available.

Your responsibilities will span the entire data lifecycle, from initial identification and characterization to the implementation of sophisticated ETL processes and scalable storage solutions. You will collaborate closely with our world-class research team, ensuring seamless integration of data into our training infrastructure, and work hand-in-hand with our engineering and infrastructure teams to maintain a robust and performant system.

Key responsibilities include:
• Identifying, characterizing, and rigorously evaluating potential data sources that are crucial for training and evaluating state-of-the-art ML models. This involves a deep understanding of data quality, relevance, and potential biases.
• Designing, building, and maintaining robust and scalable ETL (Extract, Transform, Load) pipelines to ingest data from diverse structured and unstructured sources, transforming it into formats readily accessible and optimized for ML model training.
• Designing and implementing efficient, reliable, and scalable data storage solutions that can handle massive datasets, ensuring data integrity and accessibility for research and development purposes.
• Collaborating closely with the research team to maintain and enhance a reliable and efficient training pipeline, where data quality and availability are paramount to model performance and success.
• Partnering with the wider engineering and infrastructure teams to ensure seamless integration of data pipelines and storage solutions with the overall system architecture, promoting best practices in software engineering and MLOps.
• Contributing to the development and refinement of data validation and quality assurance frameworks to ensure the integrity and reliability of the data used in model training and evaluation.
• Investigating and implementing novel techniques for data augmentation and synthetic data generation to enhance model robustness and generalization capabilities.
• Optimizing data processing workflows for speed and efficiency, particularly when dealing with large-scale datasets and distributed computing environments.
• Staying abreast of the latest advancements in data engineering, machine learning, and distributed systems to continuously improve our data infrastructure and methodologies.
• Documenting data sources, pipelines, and storage solutions to ensure knowledge sharing and maintainability within the team.
• Participating in code reviews and contributing to the overall technical strategy of the data engineering function within the research team.
• Assisting in the development of benchmarks and evaluation metrics for assessing data quality and model performance related to data inputs.

This role offers a chance to work on foundational AI technology that has the potential to transform how the world's largest companies make critical decisions. You will be an integral part of a fast-paced, innovative environment, contributing directly to the success of a category-defining company.

Skills & Technologies

Python

REST

Pandas

NumPy

Apache Spark

Degree Required

About Fundamental

Fundamental is a company focused on providing innovative solutions and services. They aim to empower businesses by leveraging cutting-edge technology and expert insights. Their offerings span various sectors, addressing complex challenges with tailored approaches. The company is committed to driving growth and efficiency for its clients through a blend of strategic planning and practical execution. With a strong emphasis on research and development, Fundamental continuously seeks to advance its capabilities and deliver value-added services. Their client-centric model ensures that solutions are aligned with specific business objectives, fostering long-term partnerships and mutual success. Fundamental strives to be a reliable partner in navigating the evolving business landscape.
