Is Elon right? We labeled 500 Twitter users to measure the amount of Spam

How would you measure the prevalence of Spam/Fake accounts on Twitter?

Imagine you're the CEO of Twitter, going through a wild few weeks. One night, at 2:44 AM, you wake up to hear that Elon Musk has paused the acquisition until he confirms that Spam represents <5% of users.

Nightmare or the cold sweat of reality?

https://twitter.com/elonmusk/status/1525049369552048129

I used to lead a Trust and Safety team at Airbnb, and many of our customers at Surge AI use us to train their content moderation systems. So how would you go about measuring Spam and seeing whether Elon's right?

First, let’s make our definitions precise: what are “users” and what constitutes a “spam/fake account”?

Users. Twitter defines monetizable daily users as users who log in and access Twitter.com or any Twitter applications that show ads.
Spam/Fake accounts. Twitter has internal policies that define spam. While their general spam policies are shared publicly (see https://help.twitter.com/en/rules-and-policies/platform-manipulation), their more detailed, internal definitions are not.

So how could Twitter measure the amount of Spam on the platform? The standard approach:

1. Take a random sample of N users.
2. Ask a set of human evaluators to review each user, and label whether or not it’s a Spammy account. Labeling typically involves scrolling through tweets and engagements, and checking private signals (IP address, email, phone number, etc.).
3. Let S be the number of users labeled as Spam. Then P = S/N is the point estimate of the proportion of Spammy users, and an approximate 95% confidence interval is P +/- 1.96 * sqrt(P * (1 - P) / N).

For example, suppose we randomly sample 1000 users, and our human evaluators find that 22 of them are Spam. Then P = 22 / 1000 = 2.2%, and our 95% confidence interval is 1.3% - 3.1%.
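To make the arithmetic concrete, here is a minimal Python sketch of the interval formula above, applied to the 22-out-of-1000 example. The function name `wald_interval` and the clamping to [0, 1] are our own additions for illustration, not part of Twitter's methodology.

```python
import math

def wald_interval(spam_count, sample_size, z=1.96):
    """Return (point estimate, lower bound, upper bound) for a proportion.

    Uses the normal approximation P +/- z * sqrt(P * (1 - P) / N).
    """
    p = spam_count / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    # Clamp to [0, 1]: a proportion cannot fall outside that range.
    return p, max(0.0, p - margin), min(1.0, p + margin)

p, low, high = wald_interval(22, 1000)
print(f"{p:.1%} (95% CI: {low:.1%} to {high:.1%})")
# prints: 2.2% (95% CI: 1.3% to 3.1%)
```

Keep in mind that this normal approximation becomes less reliable when the number of Spam labels is very small, so intervals from small samples should be read as rough guides.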

Subtleties

There are several subtleties involved in this calculation.

First, what are you trying to capture with your Spam policies? Suppose, for example, I create a Twitter account purely to market Taylor Swift’s latest album. Whenever I see a trending topic, I jump aboard and spam it in order to gain views:

Taylor Swift is the best! #red #taylorswift #fightforukraine #roevswade #joebiden #bitcoin #nfts #ethereum

I also jump into random users’ DMs shilling Taylor whenever I can:

Hey @sundarpichai, listen to Taylor!

Even though I’m a real human, should I be considered a Spam/Fake account?

What about the converse, where bots tweet the top news articles of the day, or simulate Big Ben bonging on the hour? These aren’t “real” people, but they’re not pretending to be either.

https://twitter.com/big_ben_clock/status/1526654050602831872

Do Twitter’s Spam definitions align with Elon's own?

Second, in order to measure Spam accurately, we need three things that external parties don’t have access to:

1. A way to randomly sample (monetizable daily) users
2. Twitter’s internal Spam/Fake Account policies
3. Private signals about each user

On #1: while it may be an interesting case study, a random sample of 100 followers of @twitter won't be representative of the monetizable daily users that Twitter's own numbers are based on.

https://twitter.com/elonmusk/status/1525291586669531137

For example, perhaps all the bots – knowing Elon’s eyes are now on them – unfollow @twitter, in order to avoid suspension.

How do you even know which accounts are active on Twitter? Limiting to accounts with public activity will likely overestimate Spam since many legitimate users merely lurk, while bots serve no purpose until they publicly engage.
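A quick back-of-the-envelope sketch shows how large this overestimate can be. Every number below is invented purely for illustration (a 5% true Spam share, 90% of Spam accounts posting publicly, 30% of legitimate accounts posting publicly); none of them come from Twitter or from our labeling.

```python
def observed_spam_share(true_spam_share, spam_public_rate, legit_public_rate):
    """Share of Spam among accounts that show public activity."""
    visible_spam = true_spam_share * spam_public_rate
    visible_legit = (1 - true_spam_share) * legit_public_rate
    return visible_spam / (visible_spam + visible_legit)

# All three inputs are made-up illustrative values, not measurements.
true_share = 0.05
observed = observed_spam_share(true_share, spam_public_rate=0.9, legit_public_rate=0.3)
print(f"true: {true_share:.1%}, observed among public accounts: {observed:.1%}")
# prints: true: 5.0%, observed among public accounts: 13.6%
```

Under these assumed rates, a sample restricted to publicly active accounts would show nearly three times the true Spam share, simply because lurkers drop out of the denominator.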

On #2: again, our own definitions of Spam may not align with Twitter’s. Are Swift stans and newspaper bots spammy? Whose definitions are right?

On #3: the importance of private signals is a trickier question. In theory, it’s true that a user may appear Spammy only until you have access to extra information. Perhaps that Nigerian prince scammer really is a Nigerian prince, once you see his email ends in nigeria.gov!

Conversely, a user may appear Real until you see that it shares the same IP address and phone number as 1,000 other accounts.

In practice, however, depending on the platform and the sophistication of spammers, it's possible to do a decent job of Spam detection based on public signals alone. Just look at your email Spam folder!

Two Case Studies

The billion-dollar question: what percentage of Twitter users are Spam/Fake? Given the caveats above, it’s impossible for us to measure this in a way that aligns with Twitter's methodology. As external users, we don’t have a way of randomly sampling monetizable daily active users, nor do we know what Twitter and Elon truly consider to be Spam.

That said, is it important to align? One reason detecting Spam matters is that Twitter wants to provide a high-quality experience for users. If users see a low-quality, Spam-like account filling their Timelines, it’s still a bad experience (even if it doesn’t match Twitter’s own definitions, or if it’s a real person simply engaging in low-quality behavior).

Similarly, non-random samples can still update our priors. If we find that 90% of @twitter’s followers are Elon Musk impersonators actively claiming to give away free bitcoin, perhaps we should investigate further…

With these caveats in mind, we ran two small studies:

1. We sampled 250 recent followers of the @twitter account, and asked our Surger spam moderation team to review them.
2. We sampled 250 active tweeters in a particular hour, and also labeled them for spam.

What did we find?

Under method #1 (sampling passive, recent @twitter followers), we found 6 accounts that seemed Spam-like (2.4% of the sample), for a 95% confidence interval estimate of 0.5% - 4.3% Spammy users.
Under sampling method #2 (active tweeters), we found 5 accounts that seemed Spam-like (2.0% of the sample), for a 95% confidence interval of 0.2% - 3.8% Spammy users.

What were some of the Spam-like accounts that we detected?

Example #1

One account we found was promoting and engaging on financial websites – likely paid to do so.

This account tweets about Online Check Writer once on Oct 30, twice on Oct 28, and once on Jul 29

For example, let’s look at this tweet that they favorited:

He also favorites Online Check Writer on Jul 29, despite not following the account

The comments on that tweet all seem quite spam-like:

Many of the favoriters follow a common username pattern of their name followed by a series of numbers:
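A pattern like this is easy to check mechanically. The sketch below is a rough heuristic of our own, not Twitter's detection logic: the regular expression, the four-digit threshold, and the first two handles are all hypothetical. Plenty of legitimate users also have handles shaped like this, so treat a match as a weak signal that merits a closer look, not as a verdict.

```python
import re

# Letters or underscores, then a trailing run of at least four digits.
# The four-digit threshold is an arbitrary assumption.
NAME_THEN_DIGITS = re.compile(r"[A-Za-z_]+\d{4,}")

def looks_autogenerated(handle):
    """Return True if the handle is a name followed by a series of numbers."""
    return NAME_THEN_DIGITS.fullmatch(handle.lstrip("@")) is not None

# The first two handles are hypothetical examples.
handles = ["@jane84720193", "@sam_r5521907", "@big_ben_clock", "@elonmusk"]
flagged = [h for h in handles if looks_autogenerated(h)]
print(flagged)
# prints: ['@jane84720193', '@sam_r5521907']
```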

What else are those favoriters also liking?

Rimi, the first account on that list, really likes Markus Wischenbart and Online Check Writer! Here are her Likes:

Coincidentally, Joseph, the second favoriter in the list (and whose profile photo looks like a deepfake – check out the ears and perfectly straight profile!) also really likes Markus Wischenbart and Online Check Writer…

Joseph ❤️ Markus Wischenbart and Online Check Writer too

Guess what Nikhil, the third favoriter, also likes?

Who knew Online Check Writer and Markus Wischenbart would have so many common fans?

You get the picture.

Example #2

This is another example we found – which Twitter appears to have already detected!

This account is repeatedly tweeting a stock image template.

In short: Spam exists on Twitter. And some of it may not be so hard to find! But, overall – while our own estimates may not be representative – they do align with Twitter's own.

Need help labeling Spammy users, to train and measure your ML algorithms? We work with the world's largest companies to keep their platforms safe.

If you're interested in a free Spam dataset and seeing more examples of Spammy Twitter users, sign up for early access here!

Edwin Chen

Edwin oversees Surge AI's Engineering and Research teams — whether it's helping customers train large language models on human feedback, building content moderation algorithms to detect hate speech and spam, or scaling up an elite data labeling workforce. He previously led AI, Data Science, and Human Computation teams at Google, Facebook, and Twitter, and studied mathematics and linguistics at MIT.

