Last year, Google released their “GoEmotions” dataset: a human-labeled dataset of 58K Reddit comments categorized according to 27 emotions. The problem? A whopping 30% of the dataset is mislabeled! Check out some of the egregious errors, and learn how to build better datasets.
The latest in AI, language, and data labeling. Subscribe below!