Edwin Chen
Edwin is the founder and CEO of Surge AI.
How Anthropic uses Surge AI to Train and Evaluate Claude
Edwin Chen
Learn how Anthropic partnered with Surge AI to gather high-quality human feedback at scale using the RLHF platform, resulting in one of the safest and most advanced large language models on the planet.
HellaSwag or HellaBad? 36% of this popular LLM benchmark contains errors
Edwin Chen
We analyzed HellaSwag, a popular LLM benchmark, and found errors in 36% of its rows.
30% of Google's Emotions Dataset is Mislabeled
Edwin Chen
Last year, Google released their “GoEmotions” dataset: a human-labeled dataset of 58K Reddit comments categorized according to 27 emotions. The problem? A whopping 30% of the dataset is mislabeled! Check out some of the egregious errors, and learn how to build better datasets.
How Surge AI Built OpenAI's GSM8K Dataset of 8,500 Math Problems
Edwin Chen
We built a dataset of 8,500 Grade School Math Problems for OpenAI. The goal of the dataset: to train language models like GPT-3 to solve natural language math problems and measure their reasoning ability. Learn about our process in this blog post!