
Job Overview
Location
San Francisco
Job Type
Full-time
Category
Product Management
Date Posted
May 16, 2026
Full Job Description
Description
- Design and maintain model policies across safety-relevant domains including dual-use, agentic systems, and emerging frontier-risk areas to align AI behavior with human values and norms.
- Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards for foundational AI models.
- Define practical boundaries between beneficial AI uses and assistance that could enable harm, exploitation, misuse, or unsafe outcomes in high-risk or ambiguous contexts.
- Build policy artifacts that directly support model training, evaluation, and deployment workflows, ensuring policies are technically grounded and enforceable at scale.
- Partner with safety researchers, engineers, product teams, preparedness, and operations stakeholders to operationalize policies into scalable, measurable, and observable model behaviors.
- Use red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to continuously refine policy quality and evaluation rigor.
- Identify emerging capability areas where frontier AI systems may create new safety challenges or lower barriers to harm, anticipating risks before widespread deployment.
- Study real-world model deployments to assess where behavior succeeds, fails, or drifts from intended safety postures, feeding insights back into policy iteration cycles.
- Combine long-term safety research with hands-on launch and deployment activities to ensure policies are both forward-looking and practically implementable.
- Contribute to system cards, safety reports, policy documentation, launch reviews, and external communications detailing OpenAI’s approach to model safety and risk mitigation.
- Design and execute human data campaigns including gold set construction, labeling guidance, calibration protocols, adjudication processes, and eval coverage analysis to ensure reliable measurement and improvement of policies.
- Develop evaluation criteria and grading frameworks that enable consistent, objective assessment of model behavior across diverse safety dimensions.
- Work across technical, social, and adversarial domains to create policies that balance safety, user value, and implementation constraints without compromising legitimate applications of AI.
- Maintain a pragmatic safety orientation focused on reducing real-world harm while preserving beneficial and socially valuable uses of AI.
- Operate in fast-paced, collaborative environments where priorities shift dynamically based on model advancements, emerging evidence, and evolving risk landscapes.
- Stay grounded in empirical results, implementation feasibility, and what can realistically be trained, measured, or enforced in large-scale AI systems.
- Communicate complex tradeoffs clearly between safety, functionality, and operational limits to cross-functional teams and external stakeholders.
- Apply systems-thinking across policy, data collection, graders, classifiers, training pipelines, deployment safeguards, monitoring, and escalation workflows.
- Leverage OpenAI’s safety publications, evaluations hub, and system cards as foundational resources for policy development and continuous improvement.
About OpenAI, Inc.
OpenAI is a San Francisco-based artificial intelligence research and deployment company founded in 2015. It develops large-scale AI models such as GPT, DALL-E, and Codex, providing cloud APIs and consumer applications like ChatGPT. Originally established as a non-profit, it later created a capped-profit subsidiary to attract capital while maintaining its mission to ensure artificial general intelligence benefits all of humanity.
