10 Egregious Failures in Gmail Spam Detection

Edwin Chen
May 24, 2022
Are the spammers winning? Lately, it seems that more and more spam is slipping past Gmail's filters.

Ask HN: Has Gmail spam blocking taken a sudden nosedive?

Even in Search, the rise of spammy SEO content might be one of the reasons the quality of Google Search is falling.

Spam is a very human problem, where spammers are constantly adapting social engineering strategies to trick their victims. So what kind of Spam is being missed? We asked Surgers, the data labelers on our platform, to collect examples whenever they come across spammy messages that Gmail doesn’t detect.

Here are 10 egregious Gmail Spam failures they gathered.

Gmail Fail #1

“This is an obviously spammy email, offering a free flashlight (lol), with a link to retrieve it. It also uses very random characters in the

  • Subject line: ///Y0urFREEFlashlight(NeedYourAddress)///1851
  • Username: F l a s h l i g h t’
  • Sender: u8zjwFb-lyZSC3-noReply@gwhsi.lairpro.com

WTF this wasn’t caught!”

Gmail Fail #2

“This is obvious spam for a few reasons:

1. It appears to be a dating site that I've never signed up for or heard of (I certainly don't know any "Sylvia").

2. The poor English ("Hej ❤️ Sylvia want to meet you") and sketchy link.

3. It's in Swedish!”

Gmail Fail #3

“The email is obviously not from Best Buy. It’s from some weird email uBPrL1A-1DubrR-noReply@ozm40.lairpro.com with a bogus name of _Congrat_’ and a subject line of MessageF0rY0u7294, saying “You’ve been selecte.d85920”.

Pretty obvious it's spam, I’m not sure why Gmail couldn’t catch this. Do the weird, obfuscated characters actually fool its filters?”

Gmail Fail #4

“This is very obvious spam. It’s a clickbait link, sent from a low-quality Hotmail account, with a sketchy name (“ArthritisDiet”).”

Gmail Fail #5

“I’ve certainly never been to such a shop, so the email is unwanted, and I can’t find any information on a shop of this name at these addresses. What’s more, there are THREE DIFFERENT addresses in the email that seem to be virtual mailboxes.

The subject line is ForYourDreamBathroom without any spaces. (A lot of these spam messages seem to do that. Why?)”

Gmail Fail #6

“This was sent to many people, from a suspicious-looking Hotmail address. The subject and user’s name are also missing spacing, which is a good sign of Spam.”

Gmail Fail #7

“This is a sex therapist app that I’ve never signed up for. Pretty angry this couldn’t be caught!”

Gmail Fail #8

“This is a dating webpage, written in Russian. I don’t speak Russian.”

Gmail Fail #9

“This is an email impersonating Lowe’s, pretending to give away high-value gift cards for completing a short survey.

See also the obviously spammy “-*Lowe’s*-” username and “confirmation_Receipt!.” subject line.”

Gmail Fail #10

“This is clear Spam. The email sender’s name is “Home” (I see this a lot in Spam emails, I’m not sure why they don’t pick better names). It’s from an ordinary, non-company Gmail account. The English is completely trash, with random capitalization. Even the way there are multiple empty lines between “valid feedback!” and “Take the survey” seems like a spam indicator!”

In short, a lot of these Spam emails seem quite detectable. Witness the

  • Odd characters (-*Lowe’s*-)
  • Random spacing (///Y0urFREEFlashlight(NeedYourAddress)///1851)
  • Sketchy addresses (uBPrL1A-1DubrR-noReply@ozm40.lairpro.com)
  • Low-quality names (ArthritisDiet)
  • And clear impersonation of companies like CVS and Lowe's (free gift cards sent from ordinary Hotmail accounts).
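
To make "quite detectable" concrete, here is a minimal Python sketch of rule-based checks for the signals listed above. It is not how Gmail's filters work, and it is no substitute for a trained model. The function name `spam_signals`, the `BRAND_DOMAINS` map, and every threshold are assumptions made up for illustration, and the Hotmail address in the second call is hypothetical.

```python
import re

# Illustrative brand -> legitimate domain map (an assumption for this sketch).
BRAND_DOMAINS = {"lowes": "lowes.com", "bestbuy": "bestbuy.com", "cvs": "cvs.com"}


def spam_signals(display_name: str, sender: str, subject: str) -> list[str]:
    """Return the sorted names of the heuristic signals that fire for one message."""
    signals = set()
    local, _, domain = sender.rpartition("@")
    domain = domain.lower()

    for field in (display_name, subject):
        stripped = field.strip()
        # Runs of two or more symbols, e.g. "///" or "-*".
        if re.search(r"[^\w\s]{2,}", field):
            signals.add("odd_characters")
        # A digit wedged between letters, e.g. "Y0ur".
        if re.search(r"[A-Za-z][0-9][A-Za-z]", field):
            signals.add("digit_for_letter")
        # Letters spaced out one by one, e.g. "F l a s h l i g h t".
        if sum(len(tok) == 1 for tok in field.split()) >= 4:
            signals.add("spaced_out_letters")
        # Long text with no spaces but CamelCase joins, e.g. "ForYourDreamBathroom".
        if (len(stripped) >= 12 and " " not in stripped
                and re.search(r"[a-z][A-Z]", stripped)):
            signals.add("missing_spaces")

    # Local part that mixes digits with both upper- and lower-case letters.
    if (re.search(r"\d", local) and re.search(r"[a-z]", local)
            and re.search(r"[A-Z]", local)):
        signals.add("random_looking_sender")

    # Display name mentions a brand, but the mail comes from another domain.
    name_key = re.sub(r"[^a-z0-9]", "", display_name.lower())
    for brand, real_domain in BRAND_DOMAINS.items():
        on_brand = domain == real_domain or domain.endswith("." + real_domain)
        if brand in name_key and not on_brand:
            signals.add("brand_impersonation")
            break

    return sorted(signals)


# Gmail Fail #1, using the fields quoted in the post.
print(spam_signals(
    "F l a s h l i g h t",
    "u8zjwFb-lyZSC3-noReply@gwhsi.lairpro.com",
    "///Y0urFREEFlashlight(NeedYourAddress)///1851",
))

# Gmail Fail #9; the sender address here is hypothetical.
print(spam_signals("-*Lowe’s*-", "someone@hotmail.com", "confirmation_Receipt!."))
```

Rules this blunt will also flag some legitimate mail (a subject ending in "..." trips the odd-characters check, for example), so they are better treated as features for a model or as flags for human review than as a filter on their own.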

Are the spammers winning? What do you think is going wrong?

Have you experienced frustrations getting good data to train your Spam models? Want to work with a data labeling platform that treats data as a first-class citizen, and gives it the loving attention and care it deserves? Check out our other posts on datasets and spam, and follow us on Twitter at @HelloSurgeAI!

Edwin Chen

Edwin oversees Surge AI's Engineering and Research teams — whether it's helping customers train large language models on human feedback, building content moderation algorithms to detect hate speech and spam, or scaling up an elite data labeling workforce. He previously led AI, Data Science, and Human Computation teams at Google, Facebook, and Twitter, and studied mathematics and linguistics at MIT.
