
Job Overview
Location: Remote job
Job Type: Full-time
Category: Data Science
Date Posted: February 23, 2026
Full Job Description
📋 Description
- As a Language Data Scientist at Innodata, you will be at the forefront of advancing Generative AI (GenAI) applications for our diverse clientele. This is an opportunity to join a team focused on applying ML/AI technologies to complex data challenges. You will work hands-on with multi-modal and multi-lingual datasets, contributing directly to the development and refinement of AI models.
- Your primary responsibility will be designing and continuously improving the workflows that create high-quality data for AI/ML training and evaluation. This covers a broad range of data generation strategies, including planned human annotation processes, data collection initiatives, and the development of synthetic data workflows. You will help ensure that datasets are accurate, relevant, and optimized for model performance.
- You will be expected to dive deep into existing workflows and processes, gathering data and extracting actionable insights. Based on your findings, you will formulate data-driven recommendations and drive improvements. This will involve close collaboration with cross-functional partners, including engineers, product managers, and client stakeholders.
- A critical aspect of your role will be to assess annotation tooling and workflows. This includes identifying potential bottlenecks, suggesting enhancements, and ensuring the tools are efficient, user-friendly, and capable of producing high-fidelity annotations. Your expertise will help streamline the annotation process and improve the overall quality of labeled data.
- You will perform quantitative analysis on large, complex datasets. This will involve applying statistical analysis techniques, calculating key performance metrics, and interpreting the results to identify trends and areas for improvement. Your findings will directly inform recommendations aimed at improving model accuracy and performance.
- A significant part of your engagement will be working closely with client stakeholders. This involves understanding their overarching goals, gathering detailed requirements, proposing tailored solutions that align with their objectives, and then executing these solutions with precision and efficiency. Building strong relationships and clear communication channels with clients will be paramount to success.
- You will contribute to the development and application of human evaluation tasks, ensuring that AI outputs meet stringent quality standards and align with human expectations. This requires a strong grasp of linguistic nuances and user experience.
- The role demands a strong blend of computational linguistics, data science, and data engineering skills. You will use your linguistic expertise to understand the intricacies of language data, your data science expertise to build and analyze models, and your data engineering skills to keep data pipelines robust and scalable.
- Innodata is a leading data engineering company with a global presence, serving 4 out of 5 of the world's biggest technology companies, as well as leaders in financial services, insurance, technology, law, and medicine. You will be part of a team that is instrumental in ushering in the promise of AI by combining advanced ML/AI technologies, a global workforce of subject matter experts, and a high-security infrastructure.
- This is a full-time, 40-hour-per-week, fixed-term contract position, offering an opportunity to gain extensive experience in the rapidly evolving field of GenAI within a reputable and growth-oriented organization.
🎯 Requirements
- Bachelor's or Master's degree in Computer Science, Linguistics, Data Science, Statistics, or a related quantitative field.
- Proven experience in data science, with a focus on natural language processing (NLP) and machine learning (ML).
- Strong understanding of linguistic principles and experience with computational linguistics tasks.
- Experience designing and implementing data annotation workflows, including human evaluation and synthetic data generation.
- Proficiency in data analysis, statistical modeling, and relevant programming languages (e.g., Python).
- Excellent communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams and client stakeholders.
🏖️ Benefits
- Opportunity to work on cutting-edge GenAI projects with leading technology companies.
- Gain hands-on experience with multi-modal and multi-lingual datasets.
- Collaborative and innovative work environment.
- Remote work flexibility within Canada (excluding Quebec).
- 40-hour work week.
- Fixed-term contract offering a defined project scope and clear deliverables.
About Innodata Inc.
Innodata Inc. is a global digital services company that provides data engineering, data annotation, and content transformation solutions. They specialize in helping businesses leverage artificial intelligence and machine learning by preparing and structuring large datasets for training AI models. Their services cater to various industries, including technology, finance, healthcare, and automotive, enabling clients to improve operational efficiency, enhance customer experiences, and drive innovation through data-driven insights. Innodata's expertise lies in managing complex data challenges and delivering high-quality, scalable solutions.