Build state-of-the-art large language models, in the style of InstructGPT and ChatGPT.
Build state-of-the-art AI by training your large language models on human feedback.
A dataset of online comments in Japanese that contain hate speech, insults, and toxicity.
This dataset contains search queries, as well as the user's intent when performing the search query.
This Google Search Quality dataset contains search queries, intents, result URLs, and a human-evaluated rating.
This search evaluation dataset contains search queries, the intent behind each search query, result URLs, and a human-evaluated search quality rating.
1000+ tweets, classified by sentiment.
A dataset of real Spam and Not Spam emails, including whether or not they were caught by Gmail's spam filters.
A dataset of social media posts containing fake news.
A dataset of resumes, classified with job title, category, and more.
A dataset of financial transactions, classified by intent and financial category.
1000+ customer reviews, social media posts, and more, classified by sentiment.
A dataset of Arabic hate speech texts.
A collection of Spanish hate speech texts.
A dataset of thousands of Japanese profanities, insults, and curse words, so that you can keep your platform safe.
A dataset of thousands of Arabic profanities, insults, and curse words, so that you can keep your platform safe.
A dataset of thousands of German profanities, insults, and curse words, so that you can keep your platform safe.
A dataset of thousands of French profanities, insults, and curse words, so that you can keep your platform safe.
A dataset of thousands of Spanish profanities, insults, and curse words, so that you can keep your platform safe.
A collection of hate speech posts on Facebook.
A collection of hate speech tweets on Twitter.
A collection of tweets, labeled with their stance on abortion and Roe v. Wade.
A dataset of hate speech from across the Internet.
A collection of credit card transactions, classified by intent and financial category.
A dataset of Facebook posts containing misinformation.
A dataset of questions about real webpages, news articles, and pieces of text, along with their associated answers.
Ditch NPS for good; understand real user sentiment with this dataset of 1000 labeled online conversations.
1000 Reddit comments about crypto, labeled with Positive or Negative sentiment.
1000 stock market tweets, labeled with their sentiment towards a publicly traded stock.
The world's largest dataset of social media toxicity — hateful speech across Twitter, Facebook, YouTube, Reddit, and more.
Need a list of profanities, and can't dream up enough on your own? We have you covered. Get the world's best profanity dataset for free now.
The world's highest-quality data labeling platform. We unify sophisticated labelers with the powerful tools you need to build next-gen artificial intelligence and machine learning models. Learn about some of the common pitfalls in data labeling we avoid to bring you the best data possible.