Data Research Engineer

Job Overview

Location

Remote

Job Type

Full-time

Category

Data Engineer

Date Posted

February 17, 2026

Full Job Description

Description

Are you a highly skilled engineer with a passion for data and a knack for building robust systems? Do you thrive in environments where cutting-edge research meets practical application? Fundamental, a pioneering AI company founded by DeepMind alumni, is seeking a talented Data Research Engineer to join our groundbreaking team. We are at the forefront of developing NEXUS, the world's most powerful Large Tabular Model (LTM), specifically designed to revolutionize enterprise decision-making. Backed by top-tier investors and trusted by Fortune 100 companies, Fundamental is unlocking immense value by empowering businesses with the Power to Predict.

As a Data Research Engineer at Fundamental, you will play a pivotal role in the development of breakthrough machine learning models. Your primary focus will be on the critical domain of data – the very foundation upon which our advanced AI systems are built. This is a unique opportunity to contribute to unprecedented technical challenges in foundation model development and shape the future of enterprise AI from the ground up. You will be instrumental in ensuring our training pipelines are not only reliable and efficient but also leverage the highest quality data available.

Your responsibilities will span the entire data lifecycle, from initial identification and characterization to the implementation of sophisticated ETL processes and scalable storage solutions. You will collaborate closely with our world-class research team, ensuring seamless integration of data into our training infrastructure, and work hand-in-hand with our engineering and infrastructure teams to maintain a robust and performant system.

Key responsibilities include:

  • Identifying, characterizing, and rigorously evaluating potential data sources that are crucial for training and evaluating state-of-the-art ML models. This involves a deep understanding of data quality, relevance, and potential biases.
  • Designing, building, and maintaining robust and scalable ETL (Extract, Transform, Load) pipelines to ingest data from diverse structured and unstructured sources, transforming it into formats readily accessible and optimized for ML model training.
  • Designing and implementing efficient, reliable, and scalable data storage solutions that can handle massive datasets, ensuring data integrity and accessibility for research and development purposes.
  • Collaborating closely with the research team to maintain and enhance a reliable and efficient training pipeline, where data quality and availability are paramount to model performance and success.
  • Partnering with the wider engineering and infrastructure teams to ensure seamless integration of data pipelines and storage solutions with the overall system architecture, promoting best practices in software engineering and MLOps.
  • Contributing to the development and refinement of data validation and quality assurance frameworks to ensure the integrity and reliability of the data used in model training and evaluation.
  • Investigating and implementing novel techniques for data augmentation and synthetic data generation to enhance model robustness and generalization capabilities.
  • Optimizing data processing workflows for speed and efficiency, particularly when dealing with large-scale datasets and distributed computing environments.
  • Staying abreast of the latest advancements in data engineering, machine learning, and distributed systems to continuously improve our data infrastructure and methodologies.
  • Documenting data sources, pipelines, and storage solutions to ensure knowledge sharing and maintainability within the team.
  • Participating in code reviews and contributing to the overall technical strategy of the data engineering function within the research team.
  • Assisting in the development of benchmarks and evaluation metrics for assessing data quality and model performance related to data inputs.

This role offers a chance to work on foundational AI technology that has the potential to transform how the world's largest companies make critical decisions. You will be an integral part of a fast-paced, innovative environment, contributing directly to the success of a category-defining company.

Skills & Technologies

Python
REST
Pandas
NumPy
Apache Spark
Degree Required

About Fundamental

Fundamental is a company focused on providing innovative solutions and services. It aims to empower businesses by leveraging cutting-edge technology and expert insights. Its offerings span various sectors, addressing complex challenges with tailored approaches. The company is committed to driving growth and efficiency for its clients through a blend of strategic planning and practical execution. With a strong emphasis on research and development, Fundamental continuously seeks to advance its capabilities and deliver value-added services. Its client-centric model ensures that solutions are aligned with specific business objectives, fostering long-term partnerships and mutual success. Fundamental strives to be a reliable partner in navigating the evolving business landscape.
