Patronus AI

Data gathering in process

AI evaluation and safety testing platform. Automated red teaming and scoring.

HQ: US

Est.: 2023

Size: 11-50

EU AI Act: Limited Risk

patronus.ai

Score: 47.6 / 100

Evidence: 3 items

Developing safety practices - core foundations in place with room for improvement.

Strengths: Technical Safety, External Engagement

Weaknesses: Governance Maturity, Risk Assessment, Regulatory Readiness

Focus Areas

ai safety tooling, evaluation, red teaming

Safety Profile

Security Assessment

Security-relevant indicators for vendor evaluation

Security Posture

TS-01 (dim: 60)

Red Teaming & Pre-deployment Testing

Adversarial testing before deployment

TS-05 (dim: 60)

Robustness & Adversarial Resilience

Resistance to adversarial attacks

RA-01 (dim: 45)

Sector-Specific Risk Assessment

Risk analysis for deployment context

RA-03 (dim: 45)

Dual-Use & Misuse Risk

Dangerous capability awareness

RA-07 (dim: 45)

Incident History & Track Record

Past incidents and response quality

EE-04 (dim: 55)

Vulnerability Disclosure Program

Bug bounty or CVE reporting process

Incident History

Patronus AI incident records are sourced from the AIAAIC Repository and public reporting.

Integration: AIAAIC, OECD AI Incidents Monitor

Third-Party Audits

External audit reports, SOC 2 attestations, and ISO certifications verified where published.

Sources: Company filings, registry lookups

CVE & Disclosures

Known vulnerabilities and security advisories from NVD, GitHub Security Advisories, and vendor pages.

Sources: NVD, GHSA, vendor disclosure pages

Dimension Breakdown

Governance Maturity: medium

Published policies, corporate structure, safety mandate, whistleblowing, executive commitment.

Technical Safety: medium

Benchmarks, adversarial robustness, fine-tuning safety, watermarking, model cards, research output.

2 evidence items

TS-01, TS-02

Risk Assessment: low

Dangerous capability evaluations, thresholds, external testing, bug bounty, halt conditions.

Regulatory Readiness: low

ISO 42001, EU AI Act compliance, GPAI obligations, international commitments, incident reporting.

External Engagement: medium

Survey participation, research support, transparency, behavior specs, open-source contributions.

1 evidence item

EE-06

Social Impact & Safety Profile

Moderate

Patronus AI builds evaluation and benchmarking tools that help organisations measure AI safety before deployment. Their hallucination detection and safety scoring tools directly reduce the risk of harmful AI outputs reaching users. Social impact is embedded in product design, though formal social impact policies are not yet published.

safety evaluation, hallucination detection, deployment safety

Why it matters for safety

Without rigorous evaluation, safety claims are aspirational. Patronus provides the testing infrastructure that makes AI safety measurable and verifiable for enterprise deployments.

Civilizational Risk Awareness

1/3

Practical safety orientation through evaluation tooling. Commercial motivation rather than existential risk framing. The work is highly relevant to safety infrastructure but not explicitly motivated by catastrophic risk.

Responsible Scaling Policy

None

No RSP. Because Patronus AI is an evaluation tooling company, an RSP is not directly applicable. The equivalent is governance of how evaluation results are used and whether they can be gamed or misrepresented.

Mission Drift Protection

1/3

✓ Safety-adjacent positioning in AI evaluation market

○ No PBC status
○ No structural governance mechanisms
○ Commercial evaluation focus could drift toward capability benchmarking over safety

Vulnerability Disclosure

None

No formal coordinated vulnerability disclosure (CVD) programme. Relevant vulnerabilities would include evaluation metrics that give false safety confidence and bypasses that allow unsafe models to pass evaluation.

Safety Reporting

◇ Irregular

Research publications: irregular

Blog posts on evaluation methodology: irregular

Irregular research publications. No structured safety report. For an evaluation company, publishing aggregate data on AI failure patterns across evaluations would be highly valuable to the safety ecosystem.

Dual-Use Risk

Not applicable - this company does not develop dual-use AI systems.

Scoring methodology v0.1