This job has expired

This position was posted on April 13, 2026 and is likely no longer accepting applications. We've kept it here for historical reference.

Senior Machine Learning Engineer, Pegasus

Twelve Labs Inc.

Job Overview

Location

Seoul, South Korea

Job Type

Full-time

Full Job Description

📋 Description

• Lead complex ML systems work across Pegasus from design through production, especially in areas with greater technical ambiguity or system complexity.
• Make strong design and architectural decisions across deployment, inference, evaluation, monitoring, and ML infrastructure within your domain.
• Improve critical parts of the ML lifecycle so research advances can be integrated into production quickly and reliably.
• Drive improvements to model serving, inference architecture, and ML workflows for Video Language Models (VLMs) in production.
• Support other engineers through technical guidance, design reviews, and strong engineering judgment.
• Explore and adopt AI-assisted development tools such as Claude, Gemini, and GPT to improve productivity across coding, experimentation, debugging, and documentation.
• The Pegasus team sits at the core of TwelveLabs' video understanding capabilities and is responsible for driving Pegasus, our Video Analysis product.
• Our focus is on developing multimodal video analysis systems designed for strong instruction following and for producing highly complex, hierarchically structured outputs.
• We focus on shipping products with real-world value rather than doing research in isolation, and we work in a goal-oriented, cross-functional team that encompasses both ML researchers and engineers.
• Our work covers a broad range of challenges: large-scale distributed training of multimodal LLMs, spanning pre-training through RL; accurate temporal segmentation and structured metadata extraction for real-world use cases; extending temporal context length to multiple hours; and data curation processes that enable well-aligned evaluation and performance improvements through training data enhancements.
• Our team has access to the most advanced chips in the world, including NVIDIA B300s, to push the boundaries of video analysis systems and shorten our research-to-production cycle.
• At TwelveLabs, we are pioneering the development of cutting-edge multimodal foundation models that have the ability to comprehend videos just like humans do.
• Our models have redefined the standards in video-language modeling, empowering us with more intuitive and far-reaching capabilities, and fundamentally transforming the way we interact with and analyze various forms of media.
• With $110+ million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and by prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang, and more.
• Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.
• Our partnership with NVIDIA and AWS gives us access to the most advanced chips, including B300s, enabling us to push the boundaries of what's possible in video AI.
• We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world.
• Join us as we revolutionize video understanding and multimodal AI.

Skills & Technologies

AWS

Kubernetes

Senior

Onsite

Degree Required

About Twelve Labs Inc.

Twelve Labs builds multimodal video understanding AI. Its cloud platform transforms long-form video into vector embeddings that capture visual, audio, speech, and contextual information, enabling semantic search, summarization, chaptering, moderation, and analytics through a single API. Developers upload video, index it, then query with natural language or an image to retrieve exact moments, generate highlights, or detect unwanted content. Models are pretrained on large-scale web video, continually fine-tuned for accuracy and latency, and deployable on dedicated GPU clusters for enterprise security. Founded in 2021, the San Francisco company serves media, ed-tech, safety, and e-commerce customers worldwide.
