Language Data Scientist

Job Overview

Location: Remote job
Job Type: Full-time
Category: Data Science
Date Posted: February 23, 2026

Full Job Description

📋 Description

  • As a Language Data Scientist at Innodata, you will be at the forefront of advancing Generative AI (GenAI) applications for our diverse clientele. This is a unique opportunity to join a dynamic team focused on leveraging cutting-edge ML/AI technologies to solve complex data challenges. You will play a pivotal role in shaping the future of AI by working hands-on with multi-modal and multi-lingual datasets, contributing directly to the development and refinement of sophisticated AI models.
  • Your primary responsibility will involve designing and continuously improving workflows essential for creating high-quality data used in AI/ML training and evaluation. This encompasses a broad spectrum of data generation strategies, including meticulously planned human annotation processes, robust data collection initiatives, and the innovative development of synthetic data workflows. You will be instrumental in ensuring the datasets are accurate, relevant, and optimized for model performance.
  • You will be expected to dive deep into existing workflows and processes, meticulously gathering data and extracting actionable insights. Based on your findings, you will formulate data-driven recommendations and drive significant improvements through innovation. This will involve close collaboration with cross-functional partners, including engineers, product managers, and client stakeholders, fostering a cohesive approach to problem-solving and solution implementation.
  • A key part of your role will be to critically assess annotation tooling and workflows. This includes identifying potential bottlenecks, suggesting enhancements, and ensuring the tools are efficient, user-friendly, and capable of producing high-fidelity annotations. Your expertise will help streamline the annotation process and improve the overall quality of labeled data.
  • You will perform quantitative analysis on large, complex datasets. This will involve applying statistical analysis techniques, calculating key performance metrics, and interpreting the results to identify trends and areas for improvement. Your analytical findings will directly inform recommendations aimed at enhancing model accuracy, performance, and overall effectiveness.
  • A significant part of your engagement will be working closely with client stakeholders. This involves understanding their overarching goals, meticulously gathering detailed requirements, proposing tailored solutions that align with their objectives, and then executing these solutions with precision and efficiency. Building strong relationships and clear communication channels with clients will be paramount to success.
  • You will contribute to the development and application of human evaluation tasks, ensuring that AI outputs meet stringent quality standards and align with human expectations. This requires a nuanced understanding of language and of user experience.
  • The role demands a strong blend of computational linguistics, data science, and data engineering skills. You will leverage your linguistic acumen to understand the intricacies of language data and your data science expertise to build and analyze models, while your data engineering capabilities will ensure the data pipelines are robust and scalable.
  • Innodata is a leading data engineering company with a global presence, serving four of the world's five biggest technology companies, as well as leaders in financial services, insurance, technology, law, and medicine. You will be part of a team that is instrumental in ushering in the promise of AI by combining advanced ML/AI technologies, a global workforce of subject matter experts, and a high-security infrastructure.
  • This is a full-time (40 hours per week), fixed-term contract position, offering a fantastic opportunity to gain extensive experience in the rapidly evolving field of GenAI within a reputable and growth-oriented organization.

🎯 Requirements

  • Bachelor's or Master's degree in Computer Science, Linguistics, Data Science, Statistics, or a related quantitative field.
  • Proven experience in data science, with a focus on natural language processing (NLP) and machine learning (ML).
  • Strong understanding of linguistic principles and experience with computational linguistics tasks.
  • Experience designing and implementing data annotation workflows, including human evaluation and synthetic data generation.
  • Proficiency in data analysis, statistical modeling, and relevant programming languages (e.g., Python).
  • Excellent communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams and client stakeholders.

🏖️ Benefits

  • Opportunity to work on cutting-edge GenAI projects with leading technology companies.
  • Gain hands-on experience with multi-modal and multi-lingual datasets.
  • Collaborative and innovative work environment.
  • Remote work flexibility within Canada (excluding Quebec).
  • 40-hour work week.
  • Fixed-term contract offering a defined project scope and clear deliverables.

Skills & Technologies

Data Science
Remote

About Innodata Inc.

Innodata Inc. is a global digital services company that provides data engineering, data annotation, and content transformation solutions. They specialize in helping businesses leverage artificial intelligence and machine learning by preparing and structuring large datasets for training AI models. Their services cater to various industries, including technology, finance, healthcare, and automotive, enabling clients to improve operational efficiency, enhance customer experiences, and drive innovation through data-driven insights. Innodata's expertise lies in managing complex data challenges and delivering high-quality, scalable solutions.
