This job has expired

This position was posted on May 22, 2026 and is likely no longer accepting applications. We've kept it here for historical reference.

AI Engineer

Ruby Labs Ltd.

Job Overview

Location: Ukraine

Job Type: Full-time

Full Job Description

📋 Description

• Design and implement advanced, dynamic prompt templates with conditional logic and context injection to maximize LLM generation quality and reasoning accuracy.
• Develop and enforce structured output schemas using JSON mode, function calling, Zod, and JSON Schema so that AI responses are predictable and integrate cleanly into application logic.
• Build and maintain robust evaluation pipelines using Langfuse to collect real-time feedback, score response quality, and track performance metrics across LLM outputs.
• Perform deep debugging of complex LLM chains via Langfuse traces to identify bottlenecks, optimize context window usage, reduce latency, and lower operational costs.
• Conduct systematic AI A/B testing across multiple models (e.g., Claude 3.5 Sonnet, GPT-4o) via OpenRouter, analyzing results using quantitative benchmarks to guide model selection.
• Make all deployment decisions for prompts and models strictly based on quantitative trace data and performance metrics, eliminating intuition-driven changes.
• Create custom scoring systems to analyze the "Problem → Solution" chain in LLM outputs, identifying root causes of hallucinations, logical errors, and inconsistency.
• Continuously re-evaluate model performance as new architectures emerge and execute fine-tuning when domain-specific accuracy or compliance requirements demand it.
• Utilize Node.js and Next.js to build scalable, production-ready services that handle complex LLM-generated data structures and high-throughput inference workflows.
• Work with unified AI APIs like OpenRouter to manage rate limits, model routing, cost optimization, and fallback strategies across multiple LLM providers.
• Implement LLM observability practices including tracing, test dataset creation, and integration of scoring systems using Langfuse or equivalent tools.
• Apply evaluation frameworks such as RAGAS or build custom "LLM-as-a-judge" systems to quantitatively assess response quality and alignment with intended outcomes.
• Transform raw LLM generation logs into actionable business metrics and technical insights that drive product improvements and operational efficiency.
• Maintain an iterative mindset, continuously refining AI features through feedback loops, real-world usage data, and performance analytics.
• Own end-to-end delivery of key AI features from experimentation and prototyping to full production deployment and monitoring.
• Collaborate with a globally distributed team working within ±4 hours of CET to ensure real-time communication and alignment during core working hours.

🎯 Requirements

• Deep knowledge of Node.js and Next.js for building reliable services handling complex LLM-generated data
• Proven experience in dynamic prompt engineering with context injection and conditional logic
• Hands-on experience with OpenRouter for managing multi-model routing, rate limits, and cost optimization
• Practical experience with Langfuse or similar LLM observability platforms for tracing, evaluation, and scoring
• Experience with LLM evaluation frameworks such as RAGAS or custom "LLM-as-a-judge" systems
• Strong analytical mindset to convert raw LLM logs into actionable technical and business metrics

🏖️ Benefits

• Remote Work Environment: Work from anywhere with full flexibility to promote work-life balance
• Unlimited PTO: Unlimited paid time off to recharge without tracking days
• Paid National Holidays: Paid time off for recognized national holidays
• Company-provided MacBook: Top-tier Apple MacBook provided to all employees who need one
• Flexible Independent Contractor Agreement: Access to tax advantages, autonomy, networking, and reduced employment obligations

Skills & Technologies

• Python
• JavaScript
• TypeScript
• Ruby
• Data Science
• Remote

About Ruby Labs Ltd.

Ruby Labs Ltd. is a London-based product studio that builds and scales consumer subscription mobile and web applications. The company focuses on health, wellness, and productivity verticals, developing apps such as Hint, Able, and the award-winning fitness platform FitCoach. Using data-driven growth and proprietary technology, Ruby Labs rapidly prototypes, launches, and iterates products to serve millions of global users. The team combines engineering, product design, and performance marketing expertise to create sustainable digital businesses. Founded in 2018, Ruby Labs operates a portfolio of self-funded apps, emphasizing user privacy, scientific validation, and long-term customer value.
