New Data Labeling Tools for Training Chatbots and Conversational Assistants

Andrew Mauboussin
Jun 8, 2022

Recent ML breakthroughs have led to a new wave of powerful chatbots — conversational AIs that can create data visualizations and even write code.

A major source of their intelligence? Sophisticated human feedback. Research from top AI companies like Anthropic and OpenAI has shown that training models on human feedback instead of static data alone – i.e., having humans chat with these bots and teach them when their responses are good or bad – improves their performance on NLP benchmarks across the board.

Unfortunately, the standard workflow for collecting human feedback data is painful and complex. If you’re starting from scratch, you'll need to:

  1. Create a custom web interface to label chatbot responses.
  2. Hire a team of contractors to perform the labeling.
  3. Build infrastructure to measure throughput and label quality.
  4. Remove and retrain contractors when they do a poor job.

That’s why language model companies around the world turn to us for their human feedback and data labeling needs, and we've been partnering with them to build new conversational labeling interfaces.

Today, we’re releasing these chatbot labeling tools so that you can use them too. These tools integrate directly with our data labeling workforce: hook up your conversational models, and Surgers can begin chatting with them and labeling their replies within seconds.

Collecting Human Feedback on the Surge AI Platform

We support two different workflows for collecting human feedback to train your chatbots.

1. Live chat/annotation. Surgers talk to your chatbot and label responses live.

If you want Surgers to interact with your chatbot live, connect it to our labeling platform by providing an API endpoint. Surgers can then talk to your model, rate its responses, reply, and continue rating some more. Our customers use this functionality to measure the quality of new models, and create training data to fine-tune them.
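Surge's exact endpoint contract isn't published in this post, but a live-chat integration generally reduces to an HTTP endpoint that accepts the conversation so far and returns your model's next reply. Here's a minimal, hypothetical sketch of that handler logic. The payload fields (`messages`, `role`, `text`, `reply`) and the `generate_reply` stub are illustrative assumptions, not Surge's actual API spec:

```python
import json

def generate_reply(messages):
    # Stub model: echo the most recent user turn.
    # In a real integration, this would call your chatbot.
    last_user = next(m["text"] for m in reversed(messages) if m["role"] == "user")
    return f"You said: {last_user}"

def handle_chat_request(body: str) -> str:
    """Handle one POST to a hypothetical /chat endpoint.

    Expects JSON like {"messages": [{"role": "user", "text": "..."}, ...]}
    and returns JSON like {"reply": "..."}.
    """
    payload = json.loads(body)
    reply = generate_reply(payload["messages"])
    return json.dumps({"reply": reply})
```

The key design point is that the endpoint is stateless per request: the full conversation history arrives with each call, so the labeling platform can replay or branch conversations without your server tracking sessions.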

Chatting with a bot in Surge AI's conversational labeling interface

2. Asynchronous chat/annotation. Surgers label transcripts from previous dialog collections.

If you don't want Surgers to interact with your chatbot and create new conversations, you can also upload existing dialogs to be labeled. Surgers can read through each transcript turn-by-turn and label every turn with the features you need.
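For the asynchronous workflow, the main preparation step is serializing your existing dialogs into a turn-by-turn format for upload. The schema below (JSONL with a `turns` array of `role`/`text` pairs) is a hypothetical example of such a format, not Surge's documented upload spec:

```python
import json

# Illustrative transcripts: each dialog is an ordered list of turns.
dialogs = [
    [
        {"role": "user", "text": "What's the capital of France?"},
        {"role": "bot", "text": "The capital of France is Paris."},
    ],
]

def to_jsonl(dialogs):
    """Serialize conversations to JSONL, one dialog per line.

    Keeping one dialog per line makes it easy to stream, shard,
    and attach labels back to individual turns later.
    """
    return "\n".join(json.dumps({"turns": turns}) for turns in dialogs)
```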

Labeling uploaded conversations in Surge AI's chatbot tools

If you’re interested in trying our chatbot and conversation tools, sign up for the Surge platform here! You can also reach out to us if you have any questions or want help with your use case.

Andrew Mauboussin

Andrew oversees Surge AI's Engineering and Machine Learning teams. He previously led Twitter's Spam and Integrity efforts, and studied Computer Science at Harvard.
