This job has expired
This position was posted on November 2, 2025 and is likely no longer accepting applications. We've kept it here for historical reference.

Job Overview
Location
Indiana, USA
Job Type
Full-time
Category
Software Engineering
Date Posted
November 2, 2025
Full Job Description
Description
- Architect the beating heart of enterprise AI infrastructure. You will own the end-to-end technical vision for Unstructured’s data-transformation and retrieval platform, deciding how petabytes of PDFs, images, audio, video, and text flow from ingestion to vector-ready embeddings that feed the world’s largest LLMs.
- Design and evolve distributed systems that must stay performant under 10× traffic spikes and 100× data growth. You will choose every protocol, queue, cache, and partitioning scheme that keeps latency in milliseconds while costs stay predictable.
- Become the company-wide authority on Kubernetes. You will design multi-region clusters, craft custom operators, tune the scheduler, and teach the entire engineering org how to ship zero-downtime releases and blue-green rollouts at scale.
- Shape Python architecture across 40+ microservices. You will define coding standards, create internal frameworks, profile hot paths and optimize them with Cython or Rust bindings, and ensure that every new feature ships with type hints, exhaustive tests, and sub-second cold-start times.
- Master Postgres at the index and transaction-log level. You will design partitioned schemas for metadata, craft BRIN, GIN, and GiST indexes for hybrid search, and squeeze every last IOPS out of NVMe storage so that retrieval queries never block model inference.
- Translate bleeding-edge research into production reality. When the science team ships a new embedding model or retrieval algorithm, you will decide how to shard it, cache it, and scale it to 10k QPS without breaking SLAs.
- Lead design reviews that raise the bar for the entire org. Your feedback will turn junior designs into fault-tolerant, cost-aware architectures, and your example code will become the gold standard for readability and performance.
- Evaluate and integrate emerging open-source tools (vector databases, orchestration frameworks, observability stacks) and decide which ones deserve a permanent place in our stack versus a polite “thank you, next.”
- Mentor senior engineers through pair programming, architecture walkthroughs, and incident post-mortems, creating a culture where deep systems knowledge is shared, not siloed.
- Partner with product and infrastructure leadership to balance technical debt, feature velocity, and cloud spend, turning quarterly OKRs into concrete engineering milestones that the whole company can rally behind.
- Own reliability and security from first principles. You will write runbooks, define SLOs, and ensure that every service has circuit breakers, rate limiting, and least-privilege IAM baked in from day one.
- Influence the open-source community. Whether through conference talks, upstream contributions, or blog posts, you will represent Unstructured as a thought leader in distributed systems and AI infrastructure.
About Unstructured
Unstructured Technologies, Inc. develops open-source software that converts enterprise documents—PDFs, Word files, emails, HTML, images, and more—into clean, normalized JSON ready for downstream large-language-model and vector-database ingestion. Founded in 2022 and headquartered in San Francisco, the company offers a Python library, pre-built connectors, and a managed cloud service that automate extraction, chunking, and metadata tagging at scale. It targets data engineers, ML teams, and AI product builders who need reliable, production-grade preprocessing for retrieval-augmented generation, fine-tuning, and analytics workflows.