In July 2022, a threat actor using the handle "devil" advertised a dataset of 5.4 million Twitter user records on a hacking forum for $30,000. The data linked Twitter accounts to phone numbers and email addresses, information that could be used to de-anonymize users, including those operating pseudonymous accounts in high-risk environments.
The vulnerability that enabled this data collection had been reported to Twitter through its bug bounty program in January 2022 and patched. But by then, at least one attacker had already exploited it at scale.
The Vulnerability
The flaw existed in an API endpoint used during the login flow. When a user attempted to log in or reset their password, they could submit a phone number or email address. Twitter's API would respond differently depending on whether that identifier was associated with a Twitter account and, according to Twitter's own disclosure, would reveal which account it belonged to. The endpoint therefore did more than confirm or deny that an account existed: it tied the submitted identifier to a specific account.
This is an enumeration vulnerability: it allows an attacker to systematically query the API with lists of phone numbers or email addresses and determine which ones have associated Twitter accounts.
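To make the pattern concrete, here is a minimal Python sketch of an endpoint that leaks account existence, run against a toy in-memory table. Everything in it is a hypothetical assumption for illustration: the `ACCOUNTS` data, the `lookup_vulnerable` and `enumerate_identifiers` functions, and the response shapes are not Twitter's actual API.

```python
# Toy, in-memory illustration of an enumeration flaw.
# All names and data here are hypothetical; this is not Twitter's API.

ACCOUNTS = {
    "+15550100": "user_a",
    "alice@example.com": "user_b",
}


def lookup_vulnerable(identifier: str) -> dict:
    """Respond differently depending on whether the identifier is registered."""
    if identifier in ACCOUNTS:
        # Distinct success response: confirms the identifier and names the account.
        return {"status": 200, "account": ACCOUNTS[identifier]}
    # Distinct failure response: confirms the identifier is NOT registered.
    return {"status": 404, "error": "no account found"}


def enumerate_identifiers(candidates: list) -> dict:
    """Show how the differing responses let a caller map identifiers to accounts."""
    found = {}
    for candidate in candidates:
        response = lookup_vulnerable(candidate)
        if response["status"] == 200:
            found[candidate] = response["account"]
    return found


if __name__ == "__main__":
    guesses = ["+15550100", "+15550101", "alice@example.com", "bob@example.com"]
    print(enumerate_identifiers(guesses))
```

The caller needs no credentials and no knowledge of the accounts in advance. The difference between the two response branches is the entire leak.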
The vulnerability was introduced by a June 2021 code update and reported via HackerOne in January 2022. Twitter patched it and paid a bug bounty. By then, however, the attacker had already queried the API with millions of phone numbers and email addresses, building a database that mapped identifiers to accounts.
What Was Exposed
The 5.4 million record dataset included:
- Twitter user IDs
- Names
- Screen names (handles)
- Locations
- Verified status
- Phone numbers (where associated with accounts)
- Email addresses (where associated with accounts)
In January 2023, a separate, much larger dataset surfaced containing approximately 200 million Twitter email-account associations. While this larger dataset did not include phone numbers, the sheer volume — covering a significant fraction of Twitter's user base — made it one of the most comprehensive social media data leaks to date.
The De-Anonymization Risk
The most dangerous aspect of this breach was its potential to de-anonymize pseudonymous Twitter accounts. Many Twitter users — journalists, activists, whistleblowers, political dissidents — operate under pseudonyms specifically to protect their identity.
The API vulnerability allowed attackers to:
- Start with a known phone number or email address
- Determine the Twitter account associated with that identifier
- Link a real-world identity to a pseudonymous online presence
For a journalist operating a pseudonymous account to report on government corruption, this de-anonymization could be life-threatening. For a whistleblower using a burner Twitter account to share information, it could expose them to retaliation.
The reverse lookup was also possible:
- Start with a known Twitter account
- Try various phone numbers or emails until one matches
- Discover the real-world identity behind the account
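A rough calculation shows why unthrottled lookups make this kind of guessing practical, and why rate limits matter. The sketch below is a back-of-the-envelope estimate only. The function name `sweep_hours` and both input figures are illustrative assumptions, not measurements of Twitter's endpoint.

```python
# Back-of-the-envelope estimate; every number below is an illustrative assumption.


def sweep_hours(unknown_digits: int, requests_per_second: float) -> float:
    """Hours needed to try every phone number with `unknown_digits` free digits."""
    candidates = 10 ** unknown_digits
    return candidates / requests_per_second / 3600


if __name__ == "__main__":
    # Assumption: country and area code are known, leaving 7 digits to guess,
    # and the endpoint tolerates 100 requests per second without intervention.
    print(f"{sweep_hours(7, 100):.1f} hours")
```

Under those assumptions the full sweep takes about 28 hours. Capping the same client at one request per second would stretch it to roughly 116 days, which is the practical argument for throttling any endpoint that accepts identifiers.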
Twitter's Response Failures
Twitter's handling of this breach drew criticism on multiple fronts:
Delayed disclosure. The vulnerability was patched in January 2022, but Twitter did not publicly disclose the breach until August 2022, roughly seven months later and only after the data had appeared for sale on a hacking forum.
Incomplete assessment. Twitter initially focused on the 5.4 million record dataset. The existence of the larger dataset of approximately 200 million records was not acknowledged until it surfaced publicly in January 2023.
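Limited individual notification. In its August 2022 disclosure, Twitter said it would directly notify the account owners it could confirm were affected, but it also acknowledged that it could not confirm every potentially impacted account. Many users therefore learned of the exposure only through the blog post and press coverage.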
Ongoing platform chaos. The breach disclosure occurred during a period of significant internal turmoil at Twitter, with Elon Musk's then-pending acquisition creating uncertainty about the company's direction, staffing, and security posture. Reports of mass departures from Twitter's security team further eroded confidence in the platform's ability to protect user data.
The API Enumeration Pattern
Twitter's vulnerability is part of a broader pattern of API enumeration flaws that have affected major platforms:
- Facebook (2019-2021) — Contact import API exploited to scrape 533 million user records
- LinkedIn (2021) — API enumeration used to scrape 700 million user profiles
- Clubhouse (2021) — API enumeration exposed user data
- Twitter (2021-2022) — Login API enumeration linked identifiers to accounts
The pattern is consistent: platforms build APIs that respond differently based on whether a queried identifier exists in their system. Attackers automate queries against these APIs to build comprehensive databases.
Preventing API enumeration requires:
- Consistent responses regardless of whether an identifier is found (the same status code, body, and response time for valid and invalid identifiers)
- Aggressive rate limiting on any endpoint that accepts user identifiers
- CAPTCHA or proof-of-work challenges that scale with request volume
- Monitoring for bulk enumeration patterns even within rate limits
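The first two measures in this list can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `SlidingWindowLimiter` class, the `request_password_reset` handler, the `send_reset_instructions` placeholder, and the limit of 5 requests per 60 seconds are all hypothetical. CAPTCHA challenges and bulk-pattern monitoring are not shown.

```python
# Sketch of two mitigations: uniform responses and per-client rate limiting.
# All names, limits, and data are hypothetical assumptions for illustration.
import time
from collections import defaultdict, deque
from typing import Optional

ACCOUNTS = {"alice@example.com": "user_b"}

GENERIC_RESPONSE = {
    "status": 200,
    "message": "If an account matches, instructions have been sent.",
}


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=5, window_seconds=60)


def send_reset_instructions(identifier: str) -> None:
    """Placeholder for out-of-band delivery (email or SMS); hypothetical."""


def request_password_reset(
    client_id: str, identifier: str, now: Optional[float] = None
) -> dict:
    # The throttle applies before any lookup, so a 429 reveals nothing
    # about whether the identifier exists.
    if not limiter.allow(client_id, now):
        return {"status": 429, "message": "Too many requests."}
    if identifier in ACCOUNTS:
        send_reset_instructions(identifier)
    # Same status and body whether or not the identifier matched.
    return dict(GENERIC_RESPONSE)
```

A caller sees the same response for `alice@example.com` and for an unregistered address, and the sixth request from one client inside a minute is rejected regardless of what it asks about. In a real service the matched branch would likely hand delivery off to a queue so that response time does not become a side channel. That detail is an assumption about deployment and is not modeled here.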
How Safeguard.sh Helps
Safeguard.sh helps organizations identify and monitor API endpoints that are vulnerable to enumeration attacks — the kind of vulnerability that enabled the Twitter data exposure. Our platform analyzes your attack surface, tracks which APIs accept user identifiers, and enforces security policies requiring consistent API responses and rate limiting. For organizations whose employees or customers use platforms like Twitter, Safeguard.sh monitors for data exposure in known breach datasets, alerting you when your people's information surfaces so you can take protective action before attackers use it for phishing or de-anonymization.