Black Hat USA 2026 lands at the Mandalay Bay Convention Center in Las Vegas from August 1 through 6, with Trainings running August 1–4, Briefings on August 5–6, and Arsenal — the open-source tool showcase that lives in the Business Hall — running August 4–6. If you only have one afternoon on the show floor, Arsenal is where I would spend it.
Arsenal is not a vendor pavilion. It is a demo area where researchers and maintainers stand next to a laptop and run their tool live while you ask hard questions. There are no sales engineers between you and the person who wrote the code. That format is the reason Arsenal consistently surfaces tools that matter twelve months before they show up in anyone's procurement deck.
This is a preview, written in June 2026. The schedule is not final, no tools have been demoed yet, and I am not going to invent winners or pretend to recap talks that have not happened. What I can do is tell you which categories are heating up based on what shipped at the 2025 events and the CFP signals, and how to separate a genuinely useful tool from a clever demo. Treat the specific tool names below as historical reference points from 2025, not predictions for 2026.
Agentic AI Security Is the Story This Year
If 2024 was about prompt injection against single-shot chatbots, the energy now is squarely on agentic AI — systems that plan, call tools, and act with minimal human approval. Arsenal Europe in late 2025 already leaned heavily into this, and the trend will only intensify in Las Vegas.
Two things drive it. First, the Model Context Protocol has become the default plumbing for connecting agents to tools and data, and that plumbing is a fresh, under-audited attack surface. At the 2025 USA Arsenal, a tool called AI Infrastructure Guard demonstrated agent-based scanning of MCP server source and remote MCP endpoints across categories including tool poisoning, remote code execution, and indirect prompt injection. Expect a wave of MCP-focused tooling in 2026 — scanners, proxies that sit between the agent and its tools, and runtime monitors that watch for an agent being steered off-task.
Second, LLM red-teaming is maturing from a research curiosity into a repeatable discipline. PromptFoo, shown at the 2025 Arsenal, framed itself around finding weaknesses in LLM systems, including jailbreaks and agentic AI failure modes. The category to watch in 2026 is automated adversarial testing for multi-step agents — not "can I jailbreak the model" but "can I poison one tool result and make the agent exfiltrate data three steps later." That is a meaningfully harder problem, and the tools that take it seriously are the ones to flag.
When you walk a demo here, ask one question: does the tool reason about the agent's full trajectory, or does it just score a single prompt-response pair? Single-turn scoring is a solved-ish problem. Trajectory-level analysis is where the real risk lives.
Software Supply Chain: Beyond the SBOM Checkbox
Supply-chain tooling has been an Arsenal staple for years — dependency confusion, code-signing abuse, SBOM gaps, secure-SDLC helpers. The bar has moved. Generating an SBOM is table stakes now; nobody gets a demo slot for that alone.
What is interesting in 2026 is the shift toward provenance and the AI supply chain specifically. Models, datasets, and fine-tuning pipelines are dependencies too, and most organizations have no inventory of them. The emerging acronym to know is AIBOM — an AI bill of materials that captures models, training data lineage, and the components wired around them. Watch for Arsenal tools that try to generate or verify AIBOMs, and be appropriately skeptical: a list of model files is not provenance. Real provenance means cryptographic attestation tying an artifact back to the build that produced it.
The other supply-chain thread is malicious-package detection on npm and PyPI. Typosquats and install-time scripts that phone home are a daily occurrence, and several 2025 tools attacked this with behavioral analysis rather than signature matching. The honest test for any of these: how does it handle a package that is benign at install time but pulls a malicious payload only under specific runtime conditions? Static analysis alone will miss that, and a good maintainer will tell you so plainly.
Defensive Tooling and the AI-Powered SOC
The blue-team side of Arsenal rarely gets the headlines, but it is often where the most immediately useful tools sit. The 2025 showcase included incident-response and forensics tooling — SigmaOptimizer, Hayabusa, and Suzaku among them — chosen in part for how they fold large language models into detection engineering and log triage.
The 2026 version of this is the LLM-assisted analyst. Tools that turn raw telemetry into Sigma rules, summarize an attack chain across thousands of events, or draft a detection from a threat-intel report. This is genuinely useful work and a sane place to apply an LLM, because a human analyst stays in the loop and the cost of a wrong suggestion is low.
The trap is the same one I flag every year: a tool that uses an LLM to generate detections without any verification will happily produce confident, plausible, and wrong Sigma rules. A rule that looks right but never fires is worse than no rule, because it creates a false sense of coverage. If the demo cannot show you how it validates a generated detection against real or replayed data, treat the output as a draft, not a control.
Offensive Tooling, OSINT, and the Reliable Middle
Offensive and OSINT tools remain the dependable core of Arsenal — novel exploitation techniques, EDR-evasion research and its countermeasures, reconnaissance utilities, and reverse-engineering helpers. These categories tend to produce the tools that quietly end up in everyone's kit a year later precisely because they are narrow, sharp, and do one thing well.
There will be AI-assisted offensive tooling too, and some of it will be impressive. Apply more skepticism here, not less. An LLM that "finds vulnerabilities" in a live demo is showing you its best case on code it may well have seen in training. Ask what its false-positive rate looks like on an unfamiliar codebase, and watch whether the presenter has a real answer or a deflection.
How to Vet an Arsenal Tool in Five Minutes
A practical filter for the floor. First, check the commit history and issue tracker on the spot — a tool with one author and a flurry of commits the week before the conference is a demo, not a project you can depend on. Second, ask what it gets wrong; an honest maintainer will tell you immediately, and a hand-wave is a red flag. Third, for anything AI-powered, separate the model from the system around it — the durable value is almost never the model itself but the verification, context, and orchestration wrapped around it. Tools that are all model and no scaffolding tend not to survive contact with a messy production environment.
How Safeguard Helps
Most of what is exciting at Arsenal is a strong component, not a finished control — a model, a scanner, a clever heuristic — and the gap between "impressive demo" and "thing you can trust in CI" is the verification and orchestration layer above it. That is exactly where Safeguard lives. Our Multi-Agent TAOR Deep Think AI Engine treats models as swappable components and runs multi-agent verification on top to cut false positives, while our AIBOM/ML-BOM, provenance and attestation, policy gates, and vendor scorecard turn raw findings into supply-chain decisions you can defend. We are model-agnostic by design: if a tool from the show floor is genuinely good, it can plug in as a component rather than become a new silo. If you want to pressure-test a tool you saw at Arsenal against your real dependency tree, reach out and we will run it side by side.