Protege Inc.

Solutions Engineer (Media)

Job Overview

Location

Remote

Job Type

Full-time

Category

Data Science

Date Posted

April 1, 2026

Full Job Description

📋 Description

  • As a Solutions Engineer (Media) at Protege Inc., you will play a critical role in solving one of AI’s most pressing challenges: securing access to high-quality training data. You will bridge the gap between Protege’s expanding media catalog and the specific AI data needs of customers, ensuring that complex, real-world audio, video, and motion capture datasets are curated, validated, and delivered with precision and speed. This role is essential to enabling breakthroughs in multimodal AI by turning messy, imperfect partner data into reliable, model-ready assets.
  • You will own the end-to-end data curation lifecycle for media datasets: translating customer and sales requirements into actionable curation strategies, normalizing heterogeneous data sources, validating dataset integrity, and iterating based on feedback to ensure final deliverables meet both technical and conceptual AI use case needs. Your work will directly impact the success of active deals and the scalability of Protege’s data delivery platform.
  • You will join a lean, fast-moving, high-trust team of builders who are deeply committed to solving the AI data problem with velocity, integrity, and collaboration. Protege’s culture emphasizes ownership, clarity, and kindness, empowering individuals to thrive in ambiguity while driving meaningful impact in a mission-driven environment.
  • In this role, you will develop deep expertise in media data ecosystems, including metadata schemas, embedding-based search, and multimodal AI workflows. You will gain hands-on experience shaping data product decisions, influencing catalog sourcing strategies, and building reusable workflows that scale across the organization, positioning you as a trusted expert at the intersection of data, product, and customer success.

What you will do day to day

  • Partner with Sales and Solutions teams to interpret customer AI use cases and translate nuanced requirements (such as modality, duration, diversity, and labeling needs) into precise data curation strategies for Protege’s media catalog.
  • Query, explore, and analyze Protege’s growing media catalog using SQL, internal APIs, and metadata tools to identify relevant audio, video, and motion capture assets, even when metadata is incomplete, inconsistent, or misaligned across partners.
  • Normalize and standardize disparate media datasets by resolving schema differences, correcting mismatched labels, and aligning file structures to ensure reliability and usability for downstream AI training and inference.
  • Build and implement validation checks, automated workflows, and quality assurance protocols to detect and resolve data integrity issues (including file corruption, metadata drift, and content misalignment) before dataset delivery.
  • Leverage AI-powered tools, such as TwelveLabs embeddings, to surface and refine clip-level content from long-form assets, enabling efficient, similarity-based search and retrieval tailored to customer specifications.
  • Conduct iterative sample reviews with customers and internal stakeholders, incorporating feedback to refine selections, improve relevance, and ensure final data packages meet agreed-upon technical, ethical, and licensing requirements.

About the team and company

  • Protege is a venture-backed startup on a mission to solve the AI data bottleneck by creating a secure, privacy-centric platform for the exchange of high-quality training data. The platform is already trusted by leading AI teams and backed by world-class investors.
  • The team operates with high trust, radical clarity, and a bias toward action, valuing ownership, kindness, and continuous learning as core cultural pillars that enable rapid iteration and meaningful impact.

What you can learn or achieve in this role

  • Develop advanced expertise in media data curation, embedding-based search, and multimodal AI data pipelines, skills that are increasingly critical as demand for video, audio, and motion data in AI continues to grow.
  • Influence Protege’s product roadmap, catalog sourcing strategy, and delivery platform design by providing frontline insights from customer interactions and data quality observations, positioning you as a key driver of scalability and innovation.

🎯 Requirements

  • 4-7 years of hands-on experience in data science, media analytics, technical data curation, or related roles involving the preparation, validation, or delivery of complex datasets for analytical or machine learning use.
  • Strong proficiency in SQL for querying, filtering, and aggregating large, messy, and semi-structured datasets, including experience working with nested metadata, time-series logs, and heterogeneous data sources.
  • Demonstrated ability to work with media-specific data types, such as video/audio metadata, multimodal embeddings (e.g., TwelveLabs, CLIP), or unstructured content, and to translate ambiguous customer or model requirements into concrete, actionable data specifications.

🏖️ Benefits

  • Fully remote work environment with flexibility to design your ideal work setup, supported by a company that values autonomy, trust, and results over presenteeism.
  • Opportunity to work on a generational problem in AI (solving the data access bottleneck) with real-world impact, backed by top-tier investors and already enabling partnerships with cutting-edge AI teams.
  • A high-trust, kind, and inclusive culture where feedback is frequent and constructive, ownership is real, and learning is encouraged, enabling rapid personal and professional growth in a mission-driven startup.

Skills & Technologies

Remote

Protege Inc.

About Protege Inc.

Protege is a career development platform that helps early-career talent connect directly with industry mentors and secure paid apprenticeships. The company partners with employers to create short-term, project-based experiences that give participants real work opportunities while companies evaluate candidates for full-time roles. Its marketplace offers mentorship, skill-building projects, and application tools designed to reduce hiring bias and widen access to competitive industries such as tech, finance, and media. Founded in 2020 and headquartered in New York City, Protege has facilitated thousands of placements and aims to replace traditional campus recruiting with scalable experiential hiring programs.

