
Job Overview
Location
Remote (USA)
Job Type
Contract
Category
Product Management
Date Posted
May 16, 2026
Full Job Description
Description
- Design and evaluate high-difficulty prompts that challenge the reasoning limits of large language models (LLMs) in civil engineering domains.
- Identify reasoning errors, logical gaps, and failure modes in AI-generated responses related to structural mechanics, geotechnical engineering, transportation engineering, and fluid dynamics.
- Apply adversarial prompting techniques to expose weaknesses in model understanding and push AI systems beyond surface-level accuracy.
- Provide expert-level, detailed feedback on AI model outputs to correct technical inaccuracies, incorrect formulas, flawed assumptions, and misapplied design standards.
- Ensure technical precision in numerical calculations, code-based solutions, and regulatory references (e.g., AASHTO, ACI, FEMA) within model responses.
- Collaborate with project leads to establish and maintain quality benchmarks for civil engineering content across AI training datasets.
- Contribute to the refinement of evaluation rubrics used to score AI responses for correctness, completeness, and pedagogical clarity.
- Review and validate model outputs against established engineering principles, industry codes, and peer-reviewed literature.
- Flag inconsistencies between model-generated solutions and real-world engineering practice, including oversimplifications or missing boundary conditions.
- Document specific failure patterns observed in AI responses to inform iterative improvements in model training and alignment.
- Maintain strict attention to detail in technical communication, ensuring all feedback is clear, actionable, and unambiguous.
- Work independently to assess complex engineering problems presented in open-ended prompts, without relying on standard answer keys.
- Contribute to the creation of novel problem scenarios that test advanced application of civil engineering concepts under constraints.
- Participate in periodic alignment sessions to keep evaluation standards consistent with those of other specialist reviewers.
- Adhere to data quality protocols that prioritize accuracy and depth of analysis over volume of inputs.
- Ensure all feedback is grounded in accepted civil engineering practices and does not introduce speculative or non-standard interpretations.
- Operate in a fully remote, asynchronous environment with flexible weekly hours (capped at 40 hours/week; no minimum requirement).
- Maintain confidentiality of proprietary AI datasets, prompts, and evaluation methodologies provided by Handshake Technologies, Inc.
- Support the broader mission of improving AI reliability in STEM education and professional engineering contexts through rigorous data curation.
About Handshake Technologies, Inc.
Handshake Technologies provides a cloud-based career-services platform that connects university students, recent graduates, and employers. The software enables institutions to manage job postings, career fairs, on-campus interviews, and employer relations while giving students tools to discover internships and entry-level roles and giving employers access to early-career talent across a network of partner colleges and universities.