Free Datasets

Lovingly hand-labeled by the Surge AI team, for your wildest AI needs
Profanity Dataset
Need a list of profanities, and can't dream up enough on your own? We have you covered. Get the world's best profanity dataset for free now.
Toxicity Dataset
The world's largest dataset of social media toxicity — hateful speech across Twitter, Facebook, YouTube, Reddit, and more.
Hate Speech Dataset
A collection of hate speech from across the internet.
Crypto Sentiment Dataset
1000 Reddit comments about crypto, labeled with Positive or Negative sentiment.
Credit Card Transactions Dataset
A collection of credit card transactions, classified by intent and financial category.
Brand Sentiment Dataset
Ditch NPS for good; understand real user sentiment with this dataset of 1000 labeled online conversations.
Facebook Misinformation Dataset
A collection of Facebook posts containing misinformation.
Stock Sentiment Dataset
1000 stock market tweets, labeled with their sentiment towards a publicly traded stock.
Abortion Tweets Dataset
A collection of tweets, labeled with their stance on abortion and Roe v. Wade.

Other Resources

Manifold
Have you ever wondered how your data is shaped? Explore your datasets in their embedding space with our interactive visualizations.

Brought to you by Surge AI

The world's highest-quality data labeling platform. We unify sophisticated labelers with the powerful tools you need to build next-gen artificial intelligence and machine learning models.

Data Labeling for the Richness of AI

Build human-powered datasets using our global labeling workforce and platform.