OpenAI, Inc.

Senior Data Engineer, Core Experimentation

Job Overview

Location

Seattle

Job Type

Full-time

Category

Data Engineer

Date Posted

May 16, 2026

Full Job Description

📋 Description

  • Design, build, and manage data pipelines that seamlessly integrate user event data into the company’s data warehouse to power product development, safety systems, and business decisions.
  • Develop canonical datasets to track core product metrics including user growth, engagement, and revenue across OpenAI’s platforms.
  • Collaborate closely with Infrastructure, Data Science, Product, Marketing, Finance, and Research teams to understand their data requirements and deliver scalable, reliable solutions.
  • Implement robust, fault-tolerant systems for data ingestion and processing that ensure high availability and accuracy under massive scale.
  • Participate in data architecture and engineering decisions, leveraging experience to guide system design, scalability, and long-term maintainability.
  • Ensure all data systems adhere to industry and company standards for security, integrity, and compliance with data governance policies.
  • Optimize and debug Spark code to efficiently process large-scale datasets used for experimentation, analytics, and model training.
  • Utilize distributed processing frameworks such as Hadoop, Flink, and distributed storage systems (e.g., HDFS, S3) to support high-throughput data workflows.
  • Operate and maintain ETL schedulers including Airflow, Dagster, or Prefect to orchestrate complex data workflows with dependencies and error handling.
  • Work in a hybrid model based in Bellevue, prioritizing in-person collaboration for technical design, iterative development, and cross-functional alignment.
  • Enable researchers behind ChatGPT and other AI models by providing trusted, high-fidelity data pipelines that inform model training and evaluation.
  • Contribute to the experimentation platform that powers statistical rigor and decision-making across OpenAI’s product and safety initiatives.
  • Build systems that are trusted by engineers and researchers to support critical business outcomes including product growth and prevention of harmful user behaviors.
  • Maintain data quality and lineage across the organization to ensure reproducibility and auditability of analytical insights.
  • Translate complex business and research needs into technical data solutions that are scalable, performant, and aligned with OpenAI’s mission of safe and beneficial AI deployment.

🎯 Requirements

  • 3+ years of experience as a data engineer and 8+ years of software engineering experience overall (data engineering counts toward this total)
  • Proficiency in at least one programming language commonly used in data engineering, such as Python, Scala, or Java
  • Experience with distributed processing technologies and frameworks, such as Hadoop, Flink, and distributed storage systems (e.g., HDFS, S3)
  • Expertise with ETL schedulers such as Airflow, Dagster, Prefect, or similar frameworks
  • Solid understanding of Spark and ability to write, debug, and optimize Spark code
  • This role is based in Bellevue with a hybrid work model requiring in-person collaboration for technical design and cross-functional partnership

🏖️ Benefits

  • Compensation range of $293K - $325K USD
  • Hybrid work model based in Bellevue with emphasis on in-person collaboration for technical design and cross-functional partnership
  • Opportunity to collaborate directly with researchers behind ChatGPT and other frontier AI models
  • Work on systems that power experimentation, safety, and decision-making at the scale of OpenAI’s global AI products

Skills & Technologies

Python
Java
Scala
Apache Spark
Senior
Hybrid
$293k-325k

About OpenAI, Inc.

OpenAI is a San Francisco-based artificial intelligence research and deployment company founded in 2015. It develops large-scale AI models such as GPT, DALL-E, and Codex, providing cloud APIs and consumer applications like ChatGPT. Originally established as a non-profit, it later created a capped-profit subsidiary to attract capital while maintaining its mission to ensure artificial general intelligence benefits all of humanity.
