Edwin Chen

Data that Speaks for Itself
Edwin Chen
AI
DALL·E 3 and Midjourney Fail Astral Codex Ten's Image Generation Bet
Edwin Chen
AI
How RLHF Shifts LLMs from Autocompletion to Conversational Understanding
Edwin Chen
AI
Introduction to Reinforcement Learning with Human Feedback
Edwin Chen
Large Language Models
2022 Blog Recap: Trends in AI, Language, & Data
Edwin Chen
AI
We Evaluated ChatGPT vs. Google on 500 Search Queries
Edwin Chen
Large Language Models
AI Red Teams for Adversarial Training: How to Make ChatGPT and LLMs Adversarially Robust
Edwin Chen
Large Language Models
HellaSwag or HellaBad? 36% of this popular LLM benchmark contains errors
Edwin Chen
Large Language Models
How TikTok is Evolving the Next Generation of Search
Edwin Chen
Social Media
Evaluating Generative AI: Did Astral Codex Ten Win His Bet on AI Progress?
Edwin Chen
AI
The $250K Inverse Scaling Prize and Human-AI Alignment
Edwin Chen
Large Language Models
Human Evaluation of Large Language Models: How Good is Hugging Face's BLOOM?
Edwin Chen
Large Language Models
30% of Google's Emotions Dataset is Mislabeled
Edwin Chen
AI
Search Behind-the-Scenes: How Neeva Uses Human Evaluation to Measure Search Quality
Edwin Chen
Case Studies
Humans vs. Gary Marcus vs. Slate Star Codex: When is an AI failure actually a failure?
Edwin Chen
AI
How Surge AI Built OpenAI's GSM8K Dataset of 8,500 Math Problems
Edwin Chen
AI
10 Egregious Failures in Gmail Spam Detection
Edwin Chen
Content Moderation
We asked 100 humans to draw the DALL·E prompts
Edwin Chen
AI
Google Search is Falling Behind
Edwin Chen
Engineering
Holy $#!t: Are popular toxicity models simply profanity detectors?
Edwin Chen
Content Moderation
Is Google Search Deteriorating? Measuring Google's Search Quality in 2022
Edwin Chen
Human Evaluation
The AI Bottleneck: High-Quality, Human-Powered Data
Edwin Chen
Product
2025 © Surge AI. All Rights Reserved