
Job Overview
Location
Paris
Job Type
Full-time
Category
Data Science
Date Posted
April 28, 2026
Full Job Description
📋 Description
- Research Internship in Reinforcement Learning at Cohere Inc., focused on advancing large language model training through self-distillation and reinforcement learning with verifiable rewards (RLVR), with applications to code generation and agentic tasks.
- Day-to-day responsibilities include conducting literature reviews, implementing state-of-the-art RL and self-distillation algorithms, designing and executing experiments on code and agentic tasks, developing and maintaining codebases for theoretical and practical work, collaborating with researchers to analyze results and prepare publications, and contributing to mechanisms for handling large rollouts such as summarization and hierarchical sub-agents.
- Cohere is a team of world-class researchers, engineers, and designers dedicated to scaling intelligence to serve humanity through frontier AI models that power magical experiences like content generation, semantic search, RAG, and agents. The company values hard work, speed, customer impact, and diverse perspectives as essential to building great products.
- The intern will gain hands-on experience in cutting-edge LLM research, deepen expertise in reinforcement learning and self-distillation, contribute to publishable research, collaborate with leading AI researchers, and develop skills in experimental design, implementation, and scientific communication in a high-impact research environment.
About Cohere Inc.
Cohere provides large language models and retrieval-augmented generation APIs for enterprise developers to embed conversational AI, search, summarization, and content generation into applications. Founded in 2019 by former Google Brain researchers, the company offers cloud and on-premise deployment, fine-tuning tools, and multilingual support to help organizations automate workflows, improve customer support, and analyze unstructured data while maintaining data privacy and security controls.

