
Job Overview
Location: Remote
Job Type: Full-time
Category: Software Engineering
Date Posted: October 22, 2025
Full Job Description
📋 Description
- Own the end-to-end data lifecycle for Upbound’s AI initiatives, from raw ingestion through model-ready datasets, powering the next generation of Crossplane and the Intelligent Control Plane.
- Architect and maintain scalable, cloud-native data pipelines (batch + streaming) that collect, clean, and enrich telemetry from thousands of Kubernetes clusters, cloud APIs, and customer workloads worldwide.
- Partner with ML engineers, product, and SRE teams to define data contracts, schema evolution strategies, and governance policies that keep petabyte-scale lakes reliable, secure, and compliant (SOC 2, GDPR, HIPAA).
- Design real-time feature stores that feed both online inference services and offline training jobs, ensuring sub-second latency for critical control-plane decisions while guaranteeing reproducibility and version control.
- Build self-service tooling (SDKs, notebooks, observability dashboards) that empowers analysts and data scientists to discover, profile, and experiment with datasets without bottlenecks.
- Optimize compute and storage costs through intelligent partitioning, incremental processing, and auto-scaling clusters on AWS/GCP, cutting spend by double-digit percentages year-over-year.
- Implement advanced data quality frameworks—unit tests, anomaly detection, lineage tracking—that surface issues before they reach production models or customer dashboards.
- Contribute to open-source Crossplane providers and Upbound’s internal “Data as Infrastructure” codebase, turning repeatable patterns into reusable packages the community can adopt.
- Champion a culture of documentation and knowledge sharing: run internal tech talks, write runbooks, and mentor junior engineers to raise the bar for data excellence across the company.
- Stay ahead of the curve by evaluating emerging technologies (Iceberg, DuckDB, Flink, vector databases) and running proofs-of-concept that translate into competitive advantages for Upbound’s AI roadmap.
🎯 Requirements
- 5+ years building production-grade data pipelines in Python, SQL, and at least one JVM language (Scala/Java/Kotlin).
- Deep expertise with cloud data stacks: S3/GCS, Redshift/BigQuery, EMR/Dataproc, Kinesis/Pub/Sub, Airflow/Mage, dbt, Terraform.
- Hands-on experience with Kubernetes, Docker, and infrastructure as code; familiarity with Crossplane is a strong plus.
- Proven track record designing real-time streaming architectures (Kafka, Pulsar, Flink) and batch ETL at multi-terabyte scale.
- Nice to have: contributions to open-source data projects, advanced SQL performance tuning, or prior work in ML feature engineering.
🏖️ Benefits
- Fully remote-first culture with quarterly off-sites in inspiring global locations.
- Competitive salary + equity package that grows with the company’s valuation.
- $3,000 annual learning stipend for conferences, courses, and certifications.
- Flexible PTO policy and 16-week gender-neutral parental leave.
- Home-office setup budget and monthly wellness stipend.
About Upbound Technologies Inc.
Upbound Technologies provides cloud-native infrastructure management software built on open-source Crossplane. The platform enables enterprises to provision, compose, and consume infrastructure using Kubernetes-native APIs, treating infrastructure as code. Customers automate provisioning across AWS, Azure, GCP, and on-prem environments while enforcing policy, cost controls, and compliance. The company maintains Crossplane as a CNCF incubating project and offers commercial support, governance, and enterprise extensions. Founded by former AWS engineers, Upbound serves DevOps and platform teams that need self-service infrastructure with centralized control.



