
Job Overview
Location
Remote (USA)
Job Type
Contract
Category
Engineering
Date Posted
May 16, 2026
Full Job Description
📋 Description
- Evaluate and refine responses from advanced large language models (LLMs) specifically in the domain of mechanical engineering to ensure technical accuracy and quality.
- Design high-difficulty mechanical engineering prompts that test the reasoning limits of AI models, focusing on thermodynamics, engineering mechanics, mechanics of materials, and fluid systems.
- Identify reasoning errors, logical flaws, and failure modes in AI-generated responses to mechanical engineering problems.
- Apply adversarial prompting techniques to expose gaps in model understanding and push models beyond their known capabilities.
- Provide expert-level, detailed feedback on AI model outputs, correcting inaccuracies in equations, assumptions, units, and engineering principles.
- Collaborate with project leads to establish and maintain consistent quality standards for mechanical engineering content in AI training datasets.
- Review and validate the correctness of numerical solutions, simulation results, and design interpretations generated by AI systems.
- Contribute to the development of benchmark problems and edge-case scenarios that challenge AI models’ depth of mechanical engineering knowledge.
- Ensure technical communication in AI responses is precise, unambiguous, and aligned with standard engineering terminology and practices.
- Maintain rigorous attention to detail in assessing both conceptual understanding and computational accuracy in AI-generated content.
- Work remotely with flexible hours, contributing as much or as little as desired each week, with a maximum cap of 40 hours per week.
- No minimum weekly hour requirement; work on a project-based, as-needed basis.
- Operate independently while adhering to defined evaluation rubrics and quality control protocols established by the project team.
- Participate in periodic feedback sessions to align evaluation criteria with evolving model capabilities and domain-specific requirements.
- Maintain confidentiality of proprietary prompts, datasets, and model outputs used in AI training workflows.
- Apply domain expertise to distinguish between plausible but incorrect reasoning and truly erroneous conclusions in AI outputs.
- Translate complex mechanical engineering concepts into clear evaluation guidelines for non-expert annotators when necessary.
- Document patterns in model failures to inform future prompt design and model fine-tuning strategies.
- Ensure all evaluations reflect current industry standards and academic best practices in mechanical engineering education and application.
🎯 Requirements
- PhD or MS in Mechanical Engineering or a closely related field such as civil, aerospace, structural, or chemical engineering.
- Solid understanding of mechanical engineering principles, design practices, and simulation tools.
- Prior experience with prompt engineering and stumping expert-level models (max reasoning/tokens).
- Prior experience inducing or identifying reasoning-related errors.
- Excellent attention to detail, especially in technical communication and numerical accuracy.
- Based in the US, Canada, Mexico, UK, or Spain.
🏖️ Benefits
- Fully remote work with no requirement to relocate.
- Flexible hours: no minimum weekly commitment, capped at 40 hours per week.
- Opportunity to contribute to cutting-edge AI development in engineering education and application.
- No visa sponsorship available.
About Handshake Technologies, Inc.
Handshake Technologies provides a cloud-based career-services platform that connects university students, recent graduates, and employers. The software enables institutions to manage job postings, career fairs, on-campus interviews, and employer relations while giving students tools to discover internships and entry-level roles and giving employers access to early-career talent across a network of partner colleges and universities.