
Job Overview
Location: Brazil
Job Type: Contract
Category: Software Engineering
Date Posted: June 4, 2026
Full Job Description
📋 Description
- Own the production ML API that serves Next Best Action (NBA) recommendations, ensuring low-latency performance and reliability for live traffic.
- Ship end-to-end LLM agent tools: design schemas, implement handlers, enforce structured-error contracts (RATE_LIMITED, UPSTREAM_ERROR, NOT_FOUND), write unit tests, and deploy via HAL’s agent runtime.
- Build the evaluation foundation for AI agents using golden transcripts, rubric-based judges, and regression suites that run automatically on every prompt or model change.
- Partner with the HAL platform team to become the data team’s primary point of contact for agent infrastructure decisions and runtime integration.
- Serve as the primary owner of the ML API and agent tool layer alongside the data engineer, ensuring scalability, observability, and multi-tenant gating.
- Ship at least one production-grade customer-facing or partner-facing agent with prompt versioning, evals, observability, and access controls in place.
- Define and document the data team’s standardized playbook for shipping new ML models as LLM-callable tools, from training to production deployment.
- Mentor data engineers on ML/AI engineering patterns, enabling them to independently support and extend systems you build.
- Operate as the technical lead for NBA production AI at Clutch, serving as the go-to expert for other teams on how ML and agents are deployed responsibly.
- Measure and improve agent performance: target a 30%+ reduction in P95 latency or per-conversation cost on at least one agent.
- Shape the data team’s roadmap for next-generation ML and AI products in partnership with the product manager and data scientist.
- Contribute to hiring decisions as the team scales, identifying needed skills and roles to support future growth.
- Implement tool handlers with identity fields sourced exclusively from request context (never LLM-supplied args), re-parse outputs through strict schemas, and throw structured errors on all failure paths.
- Debug agent behavior by analyzing system prompts, tool descriptions, and guardrails before considering model swaps; treat prompts as version-controlled code.
- Maintain low-latency APIs using FastAPI, BentoML, or equivalent, with strong opinions on serving, batching, and caching strategies.
- Work within AWS infrastructure, including Lambda, Docker containers, and GitHub-based CI/CD workflows.
- Use AI tooling actively in daily engineering workflows, not as a novelty but as a default practice.
- Read and interpret system prompts with the same rigor as code: analyze audience, register, compliance guardrails, template-variable allow-lists, and allowed-tools sections.
- Maintain production agent observability by monitoring audit logs, distributed traces, per-tool latency, and error metrics.
- Apply cost and latency tradeoff intuition to optimize agent loops in live environments.
- Support multi-tenant agent gating and access control systems for customer-facing and partner-facing agents.
🎯 Requirements
- 7+ years of engineering experience with a proven track record of building and shipping production ML systems
- Strong Python proficiency; comfort with TypeScript for tool contracts and agent runtime integration
- Tool-design discipline for LLM consumption: narrow input/output schemas, identity-required dispatch, structured-error contracts
- Eval discipline for non-deterministic systems: golden transcripts, rubric-based judges, regression suites on every prompt/model change
- Prompt-shape literacy: treat prompts as version-controlled code, debug agent behavior by analyzing prompt structure
- Experience building and maintaining low-latency production APIs (FastAPI, BentoML, or equivalent) with AWS, Docker, and GitHub workflows
🏖️ Benefits
- Remote Flexibility: Work from anywhere with full remote work freedom
- Unforgettable Off-Sites: Two company-paid off-sites per year to bond with colleagues
- Paid Time Off and National Holidays: 20 PTO days annually plus national holidays
- Stock Options: Receive equity as part of the compensation package
- Home Office Setup: Dedicated budget for home office essentials
- Work Trip Budget: Budget allocated for work-related trips and co-working
About Clutch Technologies, Inc.
Clutch Technologies operates a digital platform that lets consumers refinance auto loans, secure lower rates, and manage vehicle financing online. Established in 2016 and headquartered in San Francisco, the company aggregates lender offers, handles title transfers, and provides customer support throughout the refinancing process. Its technology streamlines loan applications, credit checks, and contract e-signatures, aiming to reduce monthly payments for borrowers across the United States.