AI Evaluation Engineer (Software Engineering / Code)

Job Overview

Location

Colombia

Job Type

Contract

Category

Software Engineering

Date Posted

April 29, 2026

Full Job Description

📋 Description

  • Seeking an AI Evaluation Engineer specializing in software engineering to design benchmark tasks based on real-world coding workflows. The benchmarks test whether AI systems can analyze, modify, and validate code changes accurately.
  • Design and build multi-agent benchmark tasks using real-world code changes (bug fixes, migrations, refactors); work with the Harbor evaluation framework to run and validate tasks in containerized environments; write clear technical specifications and Python-based verification scripts; define task decomposition strategies across multiple agents; analyze large open-source codebases to extract realistic scenarios; run, debug, and refine tasks in Docker environments for reproducibility; improve task quality based on evaluation results.
  • Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. Drawing on a strong background in software engineering and leadership, it helps companies build high-performing teams by matching them with professionals who truly fit their needs.
  • Develop expertise in AI evaluation frameworks, multi-agent systems, and large-scale code analysis; gain hands-on experience with Docker, Harbor, and benchmark design for LLMs; contribute to advancing AI coding evaluation standards while working remotely with a global team across multiple time zones.
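
For candidates unfamiliar with the "Python-based verification scripts" mentioned in the responsibilities above, the following is a minimal sketch of what such a script might look like for a bug-fix task. It is an illustration only and does not use or represent the Harbor API. The names `CheckResult`, `verify_bug_fix`, `score`, and the `clamp` example are hypothetical, and a real task would run checks like these inside an isolated container.

```python
import ast
from dataclasses import dataclass


@dataclass
class CheckResult:
    """One named pass/fail outcome (hypothetical structure, not a Harbor type)."""
    name: str
    passed: bool
    detail: str = ""


def verify_bug_fix(source: str, func_name: str, cases: list) -> list:
    """Run three illustrative checks on a candidate fix supplied as source text.

    `cases` is a list of (args_tuple, expected_value) pairs.
    """
    # Check 1: the candidate must be syntactically valid Python.
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [CheckResult("parses", False, str(exc))]
    results = [CheckResult("parses", True)]

    # Check 2: the target function must be defined at module level.
    defined = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    if func_name not in defined:
        results.append(CheckResult("defines_function", False, f"{func_name} missing"))
        return results
    results.append(CheckResult("defines_function", True))

    # Check 3: behaviour must match the expected outputs.
    # Assumption: this runs in a sandbox, since exec() executes untrusted code.
    namespace = {}
    exec(compile(tree, "<candidate>", "exec"), namespace)
    func = namespace[func_name]
    for args, expected in cases:
        try:
            actual = func(*args)
        except Exception as exc:
            results.append(CheckResult(f"case{args}", False, repr(exc)))
            continue
        results.append(CheckResult(f"case{args}", actual == expected, f"got {actual!r}"))
    return results


def score(results: list) -> float:
    """Binary score: 1.0 only if at least one check ran and every check passed."""
    return 1.0 if results and all(r.passed for r in results) else 0.0


# Hypothetical task: the upper bound of clamp() was mistakenly written as `lo`.
BUGGY = "def clamp(x, lo, hi):\n    return max(lo, min(x, lo))\n"
FIXED = "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"
CASES = [((5, 0, 10), 5), ((-1, 0, 10), 0), ((11, 0, 10), 10)]

if __name__ == "__main__":
    print("buggy:", score(verify_bug_fix(BUGGY, "clamp", CASES)))
    print("fixed:", score(verify_bug_fix(FIXED, "clamp", CASES)))
```

The sketch returns a list of named results instead of a single boolean so that a failing task can be debugged and refined from the per-check detail, which matches the posting's emphasis on reproducibility and improving tasks based on evaluation results.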

Skills & Technologies

Python
JavaScript
Node.js
Django
Flask
Onsite


About Gramian

Gramian is a company focused on revolutionizing the agricultural industry through advanced technology. They specialize in developing and implementing data-driven solutions for farmers, aiming to optimize crop yields, reduce resource waste, and promote sustainable farming practices. Their platform integrates various data sources, including weather patterns, soil conditions, and satellite imagery, to provide actionable insights and predictive analytics. This enables farmers to make informed decisions regarding planting, irrigation, fertilization, and pest control, ultimately leading to increased efficiency and profitability. Gramian operates within the AgTech sector, contributing to the modernization and environmental responsibility of global agriculture.

