Research Internship, Reinforcement Learning (Summer)

Job Overview

Location: Paris
Job Type: Full-time
Category: Data Science
Date Posted: April 28, 2026

Full Job Description

📋 Description

  • Research Internship in Reinforcement Learning at Cohere Inc., focused on advancing large language model training through self-distillation and reinforcement learning with verifiable rewards (RLVR), with applications to code generation and agentic tasks.
  • Day-to-day responsibilities include reviewing the literature; implementing state-of-the-art RL and self-distillation algorithms; designing and running experiments on code and agentic tasks; developing and maintaining codebases for theoretical and practical work; collaborating with researchers to analyze results and prepare publications; and contributing to mechanisms for handling large rollouts, such as summarization and hierarchical sub-agents.
  • Cohere is a team of world-class researchers, engineers, and designers dedicated to scaling intelligence to serve humanity through frontier AI models that power magical experiences like content generation, semantic search, RAG, and agents. The company values hard work, speed, customer impact, and diverse perspectives as essential to building great products.
  • The intern will gain hands-on experience in cutting-edge LLM research, deepen expertise in reinforcement learning and self-distillation, contribute to publishable research, collaborate with leading AI researchers, and develop skills in experimental design, implementation, and scientific communication in a high-impact research environment.

Skills & Technologies

Python
TensorFlow
PyTorch
Junior
Remote
Degree Required

About Cohere Inc.

Cohere provides large language models and retrieval-augmented generation APIs for enterprise developers to embed conversational AI, search, summarization, and content generation into applications. Founded in 2019 by former Google Brain researchers, the company offers cloud and on-premises deployment, fine-tuning tools, and multilingual support to help organizations automate workflows, improve customer support, and analyze unstructured data while maintaining data privacy and security controls.
