Refusal Rate Analysis: Griffin AI vs Mythos
A security AI that refuses too often is useless. One that refuses too rarely is dangerous. Griffin AI publishes calibrated refusal benchmarks; Mythos does not.
SEvenLLM set out to measure how well LLMs handle security event analysis, the unglamorous day-to-day work of SOCs and IR teams. A design review of what the benchmark covers, how it was built, and where its coverage does, and does not, map to real operations.
An AI security tool that cites the wrong advisory is worse than one that says nothing. Griffin AI benchmarks citation accuracy at 0.89 similarity; Mythos does not.
SecBench positioned itself as a comprehensive cybersecurity knowledge and reasoning benchmark for LLMs. A methodology review of its construction, scoring, and the gaps that separate the advertised coverage from what the benchmark actually exercises.
Griffin AI reports a 98-100% hold rate against adversarial probes. Most Mythos-class tools have never published an adversarial number at all.
The datasets you use to evaluate model safety are themselves a supply chain, and almost nobody is treating them that way. A senior engineer's audit of how eval corpora are poisoned, contaminated, and left to silently drift.
SWE-bench became the default benchmark for measuring AI coding agents, but the security extensions that were bolted on afterwards deserve their own scrutiny. A field review of what they measure, where they break, and whether you should trust the numbers.
A benchmark number is only as good as the methodology that produced it. Here is how Griffin AI builds its harness and why most Mythos-class tools cannot be audited.
A working engineer's review of CyberSecEval, the Meta-originated benchmark that has quietly become the default sniff test for AI-for-security claims. What it actually measures, what it misses, and how to read its scores without fooling yourself.