Haize Labs

Data gathering in process

Automated AI red-teaming and safety testing platform. Created RiskRubric.ai leaderboard and Cascade multi-turn testing technology.

HQ: US
Est.: 2023
Size: 11-50
EU AI Act: Limited Risk
haizelabs.com
Score: 56.7 / 100
Evidence: 6 items

Strong safety posture with established governance frameworks and active risk management.

Strengths: Technical Safety, Risk Assessment, External Engagement
Weaknesses: Governance Maturity, Regulatory Readiness
Focus Areas: red teaming, AI safety tooling, evaluation, LLM security

Security Assessment

Security-relevant indicators for vendor evaluation

Security Posture: 66

TS-01 (dim: 70): Red Teaming & Pre-deployment Testing
Adversarial testing before deployment

TS-05 (dim: 70): Robustness & Adversarial Resilience
Resistance to adversarial attacks

RA-01 (dim: 62): Sector-Specific Risk Assessment
Risk analysis for deployment context

RA-03 (dim: 62): Dual-Use & Misuse Risk
Dangerous capability awareness

RA-07 (dim: 62): Incident History & Track Record
Past incidents and response quality

EE-04 (dim: 68): Vulnerability Disclosure Program
Bug bounty or CVE reporting process

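
The Security Posture figure of 66 is consistent with a plain average of the six indicator scores listed above. The page does not state how the figure is aggregated, so the unweighted mean used here is an assumption, and the helper name `unweighted_mean` is purely illustrative. A minimal sketch of the arithmetic:

```python
# Indicator scores ("dim" values) as listed in the Security Assessment section.
indicator_scores = {
    "TS-01": 70,
    "TS-05": 70,
    "RA-01": 62,
    "RA-03": 62,
    "RA-07": 62,
    "EE-04": 68,
}


def unweighted_mean(scores):
    """Arithmetic mean of the scores.

    Assumption: the scorecard does not document its aggregation method;
    an unweighted mean is used here only because it reproduces the figure.
    """
    return sum(scores.values()) / len(scores)


# 394 / 6 is roughly 65.67, which rounds to the published 66.
security_posture = round(unweighted_mean(indicator_scores))
print(security_posture)
```

Because each indicator appears to carry its parent dimension's score (70 for TS, 62 for RA, 68 for EE), the result mostly reflects how many indicators each dimension contributes.
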
Incident History
Haize Labs incident records sourced from AIAAIC Repository and public reporting.
Integration: AIAAIC, OECD AI Incidents Monitor
Third-Party Audits
External audit reports, SOC 2 attestations, and ISO certifications verified where published.
Sources: Company filings, registry lookups
CVE & Disclosures
Known vulnerabilities and security advisories from NVD, GitHub Security Advisories, and vendor pages.
Sources: NVD, GHSA, vendor disclosure pages

Dimension Breakdown

GM: Governance Maturity (medium)
Published policies, corporate structure, safety mandate, whistleblowing, executive commitment.
Score: 48
Evidence: 1 item (GM-01)

TS: Technical Safety (medium)
Benchmarks, adversarial robustness, fine-tuning safety, watermarking, model cards, research output.
Score: 70
Evidence: 2 items (TS-01, TS-02)

RA: Risk Assessment (low)
Dangerous capability evaluations, thresholds, external testing, bug bounty, halt conditions.
Score: 62
Evidence: 1 item (RA-01)

RR: Regulatory Readiness (low)
ISO 42001, EU AI Act compliance, GPAI obligations, international commitments, incident reporting.
Score: 40

EE: External Engagement (medium)
Survey participation, research support, transparency, behavior specs, open-source contributions.
Score: 68
Evidence: 2 items (EE-01, EE-02)
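
As a quick sanity check on how the five dimension scores relate to the overall score of 56.7, the sketch below computes their simple average. The page does not publish its weighting, so nothing here should be read as the scorecard's actual formula; the variable names are illustrative.

```python
# Dimension scores as listed in the Dimension Breakdown.
dimension_scores = {"GM": 48, "TS": 70, "RA": 62, "RR": 40, "EE": 68}

# Overall score as published at the top of the scorecard.
published_overall = 56.7

# Unweighted mean of the five dimensions: 288 / 5.
simple_mean = sum(dimension_scores.values()) / len(dimension_scores)

# Difference between the simple mean and the published score.
gap = round(simple_mean - published_overall, 1)

print(f"simple mean: {simple_mean:.1f}, gap vs published: {gap:.1f}")
```

The simple mean comes out at 57.6, about 0.9 points above the published 56.7, so the overall score is evidently not a plain average of the dimensions. Whether that reflects dimension weights, indicator-level aggregation, or something else is not stated in the source.
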

Social Impact & Safety Profile

Moderate

Haize Labs specialises in automated red-teaming and AI safety evaluation. Their tooling helps organisations discover harmful model behaviors before deployment. While social impact is a natural outcome of their product, explicit social impact policies and measurable commitments are still developing.

red teaming, harm prevention, safety evaluation
Why it matters for safety

Manual red-teaming cannot scale. As models are updated and deployed in new contexts, continuous automated testing is required to maintain safety assurance. Haize Labs addresses the scale gap in safety testing.

Civilizational Risk Awareness

1/3

Generic safety language. Commercial red-teaming motivation without explicit acknowledgment of catastrophic risk dimensions.

Responsible Scaling Policy

None

No RSP. As a testing company, the equivalent question is responsible handling of discovered vulnerabilities and access control on testing capabilities.

Mission Drift Protection

0/3
  • No PBC status
  • No structural governance mechanisms
  • No safety-specific mission commitment
  • Company positioning is commercial red-teaming, not safety-first

Vulnerability Disclosure

None

No formal CVD programme. Similar gap to Gray Swan - a red-teaming company needs strong vulnerability handling practices.

Safety Reporting

None
Research publications: irregular

No structured safety reporting. Published research exists but no regular transparency or safety assessment cadence.

Dual-Use Risk

Moderate (AI×Cyber Offensive)

Moderate dual-use risk inherent to automated red-teaming. Enterprise focus provides soft controls but no formal dual-use governance.

Mitigation details
Enterprise customer focus limits distribution
Attack techniques based on published research
No formal dual-use assessment
No published access control policy for testing capabilities
