Patronus AI

Data gathering in progress

AI evaluation and safety testing platform. Automated red teaming and scoring.

HQ: US
Est.: 2023
Size: 11-50
EU AI Act: Limited Risk
patronus.ai
Score: 47.6 / 100
Evidence: 3 items

Developing safety practices - core foundations in place with room for improvement.

Strengths: Technical Safety, External Engagement
Weaknesses: Governance Maturity, Risk Assessment, Regulatory Readiness
Focus Areas: AI safety tooling, evaluation, red teaming

Security Assessment

Security-relevant indicators for vendor evaluation

Security Posture: 53

TS-01 (dim: 60) Red Teaming & Pre-deployment Testing: Adversarial testing before deployment
TS-05 (dim: 60) Robustness & Adversarial Resilience: Resistance to adversarial attacks
RA-01 (dim: 45) Sector-Specific Risk Assessment: Risk analysis for deployment context
RA-03 (dim: 45) Dual-Use & Misuse Risk: Dangerous capability awareness
RA-07 (dim: 45) Incident History & Track Record: Past incidents and response quality
EE-04 (dim: 55) Vulnerability Disclosure Program: Bug bounty or CVE reporting process

Incident History
Patronus AI incident records sourced from AIAAIC Repository and public reporting.
Integration: AIAAIC, OECD AI Incidents Monitor
Third-Party Audits
External audit reports, SOC 2 attestations, and ISO certifications verified where published.
Sources: Company filings, registry lookups
CVE & Disclosures
Known vulnerabilities and security advisories from NVD, GitHub Security Advisories, and vendor pages.
Sources: NVD, GHSA, vendor disclosure pages

Dimension Breakdown

GM: Governance Maturity (medium), score 42
Published policies, corporate structure, safety mandate, whistleblowing, executive commitment.

TS: Technical Safety (medium), score 60
Benchmarks, adversarial robustness, fine-tuning safety, watermarking, model cards, research output.
2 evidence items: TS-01, TS-02

RA: Risk Assessment (low), score 45
Dangerous capability evaluations, thresholds, external testing, bug bounty, halt conditions.

RR: Regulatory Readiness (low), score 38
ISO 42001, EU AI Act compliance, GPAI obligations, international commitments, incident reporting.

EE: External Engagement (medium), score 55
Survey participation, research support, transparency, behavior specs, open-source contributions.
1 evidence item: EE-06
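
The scorecard does not state how the five dimension scores roll up into the overall 47.6 / 100. As a rough sanity check only, the sketch below computes a weighted mean of the dimension scores listed above. The function name `weighted_score` and the idea of per-dimension weights are assumptions for illustration, not the scorecard's documented method. With equal weights the result is 48.0, close to but not equal to the published figure, which suggests some weighting or indicator-level aggregation is applied.

```python
# Dimension scores copied from the breakdown above.
DIMENSION_SCORES = {
    "GM": 42,  # Governance Maturity
    "TS": 60,  # Technical Safety
    "RA": 45,  # Risk Assessment
    "RR": 38,  # Regulatory Readiness
    "EE": 55,  # External Engagement
}


def weighted_score(scores, weights=None):
    """Return the weighted mean of `scores`.

    `weights` is a hypothetical mapping of dimension code to weight;
    the scorecard does not publish its weights. Equal weights are
    used when none are given.
    """
    if weights is None:
        weights = {code: 1.0 for code in scores}
    total_weight = sum(weights[code] for code in scores)
    return sum(scores[code] * weights[code] for code in scores) / total_weight


# Equal-weight mean of the five dimensions: (42 + 60 + 45 + 38 + 55) / 5
print(f"{weighted_score(DIMENSION_SCORES):.1f}")  # prints 48.0
```

Any weights passed in are guesses until the scoring methodology is published, so the output should be read as a plausibility check rather than a reproduction of the score.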

Social Impact & Safety Profile

Moderate

Patronus AI builds evaluation and benchmarking tools that help organisations measure AI safety before deployment. Their hallucination detection and safety scoring tools directly reduce the risk of harmful AI outputs reaching users. Social impact is embedded in product design, though formal social impact policies are not yet published.

safety evaluation, hallucination detection, deployment safety
Why it matters for safety

Without rigorous evaluation, safety claims are aspirational. Patronus provides the testing infrastructure that makes AI safety measurable and verifiable for enterprise deployments.

Civilizational Risk Awareness

1/3

Practical safety orientation through evaluation tooling. Commercial motivation rather than existential risk framing. The work is highly relevant to safety infrastructure but not explicitly motivated by catastrophic risk.

Responsible Scaling Policy

None

No RSP. As an evaluation tooling company, an RSP is not directly applicable. The equivalent is governance of how evaluation results are used and whether they can be gamed or misrepresented.

Mission Drift Protection

1/3
  • Safety-adjacent positioning in AI evaluation market
  • No PBC status
  • No structural governance mechanisms
  • Commercial evaluation focus could drift toward capability benchmarking over safety

Vulnerability Disclosure

None

No formal CVD programme. Relevant vulnerabilities would include: evaluation metrics that give false safety confidence, or bypasses that allow unsafe models to pass evaluation.

Safety Reporting

Irregular
Research publications: irregular
Blog posts on evaluation methodology: irregular

Irregular research publications. No structured safety report. For an evaluation company, publishing aggregate data on AI failure patterns across evaluations would be highly valuable to the safety ecosystem.

Dual-Use Risk

Not applicable - this company does not develop dual-use AI systems.
