OpenAI

Data gathering in progress

AI research and deployment company. Developer of GPT-4, ChatGPT, and DALL-E.

HQ: US
Est: 2015
Size: 1001-5000
EU AI Act: GPAI Systemic Risk
openai.com

Score: 59.1 / 100
Evidence: 5 items

Strong safety posture with established governance frameworks and active risk management.

Strengths: Governance Maturity, Technical Safety, Risk Assessment, Regulatory Readiness, External Engagement
Focus Areas
foundation models, enterprise AI, consumer AI, reasoning

Security Assessment

Security-relevant indicators for vendor evaluation

Security Posture: 59

  • TS-01 (dim 65): Red Teaming & Pre-deployment Testing (adversarial testing before deployment)
  • TS-05 (dim 65): Robustness & Adversarial Resilience (resistance to adversarial attacks)
  • RA-01 (dim 52): Sector-Specific Risk Assessment (risk analysis for deployment context)
  • RA-03 (dim 52): Dual-Use & Misuse Risk (dangerous capability awareness)
  • RA-07 (dim 52): Incident History & Track Record (past incidents and response quality)
  • EE-04 (dim 60): Vulnerability Disclosure Program (bug bounty or CVE reporting process)
Incident History
OpenAI incident records sourced from AIAAIC Repository and public reporting.
Integration: AIAAIC, OECD AI Incidents Monitor
Third-Party Audits
External audit reports, SOC 2 attestations, and ISO certifications verified where published.
Sources: Company filings, registry lookups
CVE & Disclosures
Known vulnerabilities and security advisories from NVD, GitHub Security Advisories, and vendor pages.
Sources: NVD, GHSA, vendor disclosure pages
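The scorecard does not specify how these lookups are run. A minimal sketch of the kind of query involved, using the public NVD CVE API 2.0 keyword search; the function name `nvd_search_url` is illustrative, not part of any published tooling:

```python
from urllib.parse import urlencode

# NVD CVE API 2.0 endpoint (public; an API key is only needed for higher rate limits).
NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def nvd_search_url(keyword: str, results_per_page: int = 20) -> str:
    """Build an NVD 2.0 keyword-search URL for a vendor CVE lookup."""
    params = urlencode({
        "keywordSearch": keyword,
        "resultsPerPage": results_per_page,
    })
    return f"{NVD_API}?{params}"

# Example: look up published CVEs mentioning the vendor.
url = nvd_search_url("openai")
```

GHSA and vendor advisory pages would need their own queries; the GitHub Security Advisory database, for instance, is exposed through the GitHub REST and GraphQL APIs rather than NVD.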

Dimension Breakdown

GM: Governance Maturity (medium), score 62
Published policies, corporate structure, safety mandate, whistleblowing, executive commitment.
2 evidence items (GM-01, GM-03)

TS: Technical Safety (medium), score 65
Benchmarks, adversarial robustness, fine-tuning safety, watermarking, model cards, research output.
2 evidence items (TS-01, TS-07)

RA: Risk Assessment (low), score 52
Dangerous capability evaluations, thresholds, external testing, bug bounty, halt conditions.

RR: Regulatory Readiness (low), score 55
ISO 42001, EU AI Act compliance, GPAI obligations, international commitments, incident reporting.

EE: External Engagement (medium), score 60
Survey participation, research support, transparency, behavior specs, open-source contributions.
1 evidence item (EE-06)

Social Impact & Safety Profile

Limited

OpenAI dissolved its Superalignment team and lost key safety researchers including Jan Leike and Ilya Sutskever. The nonprofit-to-profit restructuring raised fundamental questions about governance accountability. The Preparedness Framework has been weakened in practice, and commercial pressures increasingly override safety commitments. System cards and usage policies exist but lack independent verification, and transparency has declined significantly.

governance deterioration, safety team dissolution, commercial pressure, reduced transparency
Why it matters for safety

OpenAI's governance controversies (board crisis, safety team departures) make it the most important case study in AI safety governance. Whether OpenAI's remaining safety structures are sufficient under extreme commercial pressure is the central question.

Civilizational Risk Awareness

1/3

Charter references catastrophic risk but organisational behaviour has diverged significantly from stated risk awareness. The gap between rhetoric and action is the widest in the frontier lab category. Board crisis, safety team departures, and the for-profit transition collectively demonstrate that risk awareness is not structurally embedded.

Responsible Scaling Policy

Informal

Preparedness Framework (2023): defines risk categories and evaluation criteria for frontier model deployment, with risk levels (Low/Medium/High/Critical) tied to deployment thresholds.

Framework exists on paper but enforcement credibility has been severely undermined by senior safety team departures, the dissolution of the superalignment team, and governance instability. Downgraded from 'published' to 'informal' because a published policy without credible enforcement is functionally informal.
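The framework's gating logic, as publicly described, can be sketched as a simple ordered enum. This is an illustration of the stated rule (deploy only at Medium or below post-mitigation; develop only at High or below), not OpenAI's actual implementation:

```python
from enum import IntEnum

class RiskLevel(IntEnum):
    """Preparedness Framework risk categories, ordered Low < Medium < High < Critical."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

def deployable(post_mitigation: RiskLevel) -> bool:
    # Stated rule: only models with post-mitigation risk of Medium
    # or below may be deployed.
    return post_mitigation <= RiskLevel.MEDIUM

def developable(post_mitigation: RiskLevel) -> bool:
    # Stated rule: models up to High may be developed further;
    # Critical halts further development.
    return post_mitigation <= RiskLevel.HIGH
```

The scorecard's point is precisely that this gate exists on paper: whether the thresholds would be enforced under commercial pressure is the open question.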

Mission Drift Protection

1/3
  • Mission statement in charter (AGI that benefits all of humanity)
  • Safety Advisory Group
  • Preparedness Framework gates
  • Capped-profit structure being restructured - mission protection unclear in new corporate form
  • Board crisis demonstrated governance failure under commercial pressure
  • No PBC status - transition to for-profit removes structural mission protection
  • Multiple senior safety researchers departed
  • No independent external safety board with binding authority

Vulnerability Disclosure

External

Public bug bounty programme via Bugcrowd. Covers traditional security vulnerabilities and some AI-specific issues (jailbreaks, safety bypasses).

Bug bounty exists, but its scope for AI-specific safety vulnerabilities is narrower than Anthropic's programme. Downgraded from 'public_bug_bounty' to 'external_programme' because AI-safety-specific vulnerability coverage is limited compared to the breadth of its traditional security bounty.

Safety Reporting

◇ Irregular

System cards: per major release
Safety research publications: irregular
Preparedness evaluations: per major release

Reporting is tied to product releases rather than on a regular cadence. Publication frequency of safety research has decreased compared to 2022-2023. System cards are informative but not comprehensive safety assessments. No structured transparency report.

Dual-Use Risk

Significant: AI×Cyber Offensive, AI×Bio, AI×Information Manipulation

Dual-use mitigation structures exist but institutional commitment has been questioned. The gap between formal policy and organisational culture is the concern, not the absence of policies.

Mitigation details
  • Preparedness Framework with risk categories
  • Red-teaming programme
  • Usage policies and monitoring
  • Government engagement on dual-use risk
  • No PBC or structural commitment preventing deprioritisation of safety
  • Senior safety team departures raise questions about institutional commitment
  • No independent dual-use review board

Need a detailed report for OpenAI?

Subscribe to express interest in indicator-level evidence, peer benchmarking, and regulatory gap analysis - or reach out to request a full company overview brief.