
Job Overview
Location: Remote
Job Type: Full-time
Category: Software Engineering
Date Posted: October 22, 2025
Full Job Description
📋 Description
- Own the end-to-end data lifecycle for Upbound’s AI initiatives, from raw ingestion through model-ready datasets, powering the next generation of Crossplane and the Intelligent Control Plane.
- Architect and maintain scalable, cloud-native data pipelines (batch + streaming) that collect, clean, and enrich telemetry from thousands of Kubernetes clusters, cloud APIs, and customer workloads worldwide.
- Partner with ML engineers, product, and SRE teams to define data contracts, schema evolution strategies, and governance policies that keep petabyte-scale lakes reliable, secure, and compliant (SOC 2, GDPR, HIPAA).
- Design real-time feature stores that feed both online inference services and offline training jobs, ensuring sub-second latency for critical control-plane decisions while guaranteeing reproducibility and version control.
- Build self-service tooling (SDKs, notebooks, observability dashboards) that empowers analysts and data scientists to discover, profile, and experiment with datasets without bottlenecks.
- Optimize compute and storage costs through intelligent partitioning, incremental processing, and auto-scaling clusters on AWS/GCP, cutting spend by double-digit percentages year-over-year.
- Implement advanced data quality frameworks—unit tests, anomaly detection, lineage tracking—that surface issues before they reach production models or customer dashboards.
- Contribute to open-source Crossplane providers and Upbound’s internal “Data as Infrastructure” codebase, turning repeatable patterns into reusable packages the community can adopt.
- Champion a culture of documentation and knowledge sharing: run internal tech talks, write runbooks, and mentor junior engineers to raise the bar for data excellence across the company.
- Stay ahead of the curve by evaluating emerging technologies (Iceberg, DuckDB, Flink, vector databases) and running proofs-of-concept that translate into competitive advantages for Upbound’s AI roadmap.
🎯 Requirements
- 5+ years building production-grade data pipelines in Python, SQL, and at least one JVM language (Scala/Java/Kotlin).
- Deep expertise with cloud data stacks: S3/GCS, Redshift/BigQuery, EMR/Dataproc, Kinesis/Pub/Sub, Airflow/Mage, dbt, Terraform.
- Hands-on experience with Kubernetes, Docker, and infrastructure as code; familiarity with Crossplane is a strong plus.
- Proven track record designing real-time streaming architectures (Kafka, Pulsar, Flink) and batch ETL at multi-terabyte scale.
- Nice to have: contributions to open-source data projects, advanced SQL performance tuning, or prior work in ML feature engineering.
🏖️ Benefits
- Fully remote-first culture with quarterly off-sites in inspiring global locations.
- Competitive salary + equity package that grows with the company’s valuation.
- $3,000 annual learning stipend for conferences, courses, and certifications.
- Flexible PTO policy and 16-week gender-neutral parental leave.
- Home-office setup budget and monthly wellness stipend.
About Upbound Technologies Inc.
Upbound Technologies provides cloud-native infrastructure management software built on open-source Crossplane. The platform enables enterprises to provision, compose, and consume infrastructure using Kubernetes-native APIs, treating infrastructure as code. Customers automate provisioning across AWS, Azure, GCP, and on-prem environments while enforcing policy, cost controls, and compliance. The company maintains Crossplane as a CNCF incubating project and offers commercial support, governance, and enterprise extensions. Founded by former AWS engineers, Upbound serves DevOps and platform teams that need self-service infrastructure with centralized control.



