Innodata Sr Language Data Scientist

Job Overview

Location: Remote
Job Type: Full-time
Category: Data Scientist
Date Posted: February 23, 2026

Full Job Description

📋 Description

  • Innodata is seeking a highly skilled and experienced Senior Language Data Scientist to join our team and advance Generative AI (GenAI) applications for our global clientele. As a pivotal member of our organization, an AI technology solutions provider, you will help shape the future of AI by working hands-on with complex, multimodal, and multilingual datasets. This role demands a blend of expertise in computational linguistics, human evaluation, data science, and data engineering, enabling you to drive innovation and continuous improvement through sophisticated human and synthetic data workflows.
  • In this senior capacity, you will lead and own critical processes involved in the creation, validation, and annotation of data essential for Large Language Model (LLM) and Machine Learning (ML) applications. Your responsibilities will span natural language as well as multimodal data such as images, video, and audio, ensuring our AI models are trained on comprehensive and high-quality datasets.
  • A key aspect of your role will involve deep customer engagement. You will consult directly with clients to thoroughly understand their business objectives and translate these needs into robust data processing and annotation strategies. By generating insightful analyses of client processes and products, you will identify opportunities for enhancement and innovation, directly contributing to their success and the advancement of our service offerings.
  • You will also play a crucial advisory role, supporting business unit heads in their client interactions. This includes understanding the upstream activities that will leverage Innodata's services and ensuring our solutions are perfectly aligned with client requirements from the outset.
  • The Senior Language Data Scientist will be responsible for leading long-term projects characterized by high complexity and ambiguity, managing them from initial client discussions through to successful completion. This involves designing and refining workflows for AI/ML training and evaluation, encompassing human annotation, data collection, and synthetic data generation techniques.
  • You will be expected to dive deep into existing workflows and processes, meticulously gathering data and insights to formulate data-driven recommendations. Driving improvement through innovation and fostering cross-functional collaboration with customers will be paramount.
  • A critical component of this role is the rigorous assessment of annotation tooling and workflows. You will quantitatively analyze large datasets, conduct statistical analyses, calculate key performance metrics, and propose actionable recommendations to enhance accuracy, efficiency, and overall model performance.
  • Close collaboration with client stakeholders is essential. You will work hand-in-hand with them to understand their goals, gather detailed requirements, propose tailored solutions, and oversee their effective execution.
  • Furthermore, you will contribute to establishing a forward-thinking research agenda aimed at continuously improving Innodata's products and services. This includes championing the development and adoption of best practices and standards for generative AI development, both within client engagements and across the organization.
  • This position offers a unique opportunity to make a significant impact in the rapidly evolving field of AI and GenAI, working with leading technology companies and contributing to cutting-edge solutions. Your expertise will directly influence the development of sophisticated AI models and drive substantial value for our customers.

🎯 Requirements

  • Minimum of 5 years of relevant experience in data creation, curation, and analysis specifically for GenAI applications (e.g., RAG, Agents, complex reasoning).
  • Proven experience in leading long-term projects, setting strategic plans, and driving them to success using knowledge of AI, data science, and process design excellence.
  • Expertise in designing collection, evaluation, and quality assurance processes, utilizing both human-in-the-loop and synthetic data techniques.
  • Strong understanding of machine learning, Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG), coupled with a research-oriented mindset for developing long-term excellence.

🏖️ Benefits

  • Opportunity to work with a leading data engineering company serving 4 out of the 5 largest technology companies globally.
  • Engage in cutting-edge projects focused on Generative AI and advanced ML/AI technologies.
  • Collaborate with a diverse, global team of over 3,000 employees across multiple countries.
  • Potential for explosive growth and career advancement within the company over the next few years.

Skills & Technologies

Data Science
Senior
Remote

About Innodata Inc.

Innodata Inc. is a global digital services company that provides data engineering, data annotation, and content transformation solutions. They specialize in helping businesses leverage artificial intelligence and machine learning by preparing and structuring large datasets for training AI models. Their services cater to various industries, including technology, finance, healthcare, and automotive, enabling clients to improve operational efficiency, enhance customer experiences, and drive innovation through data-driven insights. Innodata's expertise lies in managing complex data challenges and delivering high-quality, scalable solutions.
