Case Study: Content Moderation for a Leading Social Media Company

One of the world’s largest social media platforms needed to improve its ML models for filtering hateful speech, misinformation, and spam. The platform was inundated with new data every day and needed a data labeling solution that could:

  • Generate millions of nuanced judgements per month across multiple domains — hateful speech, misinformation, and spam.
  • Provide a workforce of highly skilled labelers with a deep understanding of cultural norms and current events in a range of locales.
  • Provide labelers fluent in English, Portuguese, Spanish, French, German, Italian, Japanese, Arabic, Turkish, and Mandarin.

With Surge AI data, the customer tripled the quality of their datasets, sped up their data pipelines by 10x, and improved the AUC of their models by 55%. Surge delivered 50 million labels over the past year, at higher quality than professional fact checkers.

Data Labeling Vendor Selection

After conducting a vendor evaluation across 10 solutions, the customer selected Surge AI as their data labeling partner. The choice was easy — the customer’s evaluation process revealed that Surge produced 62% higher accuracy. Surge AI improved quality and efficiency while reducing cost and operational overhead.

Surge AI’s Solution

To meet this customer’s needs, we built custom labeling teams, designed an intuitive labeling workflow, and dedicated a project manager to oversee the project.

Dedicated Data Labeling Teams

  • The customer had unique and nuanced criteria for assessing toxicity, misinformation, and spam, so we created custom labeling teams of Surgers exceptionally well-suited to their task. Surgers had to score 98% or higher on our training exams to qualify for the customer’s projects.
  • Across all domains and languages, we created 28 custom labeling teams in total. These teams continued to grow over time and delivered 1M+ labels per week.

Flexible Data Labeling Workflow

  • Our work with this customer spanned multiple teams, each with its own operational style. Fortunately, Surge offers both a drag-and-drop template creator suited to non-technical employees and a fully featured API designed for ML teams.
  • Our labeling interface allowed us to meet the customer’s specific data collection requirements, including: multiple choice questions, checkboxes, free response, NER tagging, file upload, and conditional logic.
  • Our custom widgets allowed content (in this case, social media posts) to be embedded directly on our platform, which improved Surger accuracy, efficiency, and output.

Dedicated Project Manager

  • To ensure success for our customer, we assigned them a dedicated project manager from our team. We met with the customer weekly to review edge cases and collaboratively improve the quality of data they received.
  • The project manager was responsible for communicating with both the customer and the Surgers working on the project — creating a two-way communication channel for questions, feedback, and iteration.
  • The project manager also monitored data quality in real time, providing an additional layer of quality assurance before the customer received data.
  • The customer often remarked that, thanks to this hands-on involvement, our project manager understood the nuances of the projects better than the customer did.


After switching to Surge AI, the customer doubled the quality of their datasets, as measured by precision, recall, and F1 score on internal golden sets. As a result, they were able to boost the AUC of their ML models by 55%. In a world where even a 1-2% AUC improvement is celebrated, these results are monumental, and a vivid example of how fundamental data quality is to model performance.
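
For readers less familiar with these metrics, here is a minimal sketch of how precision, recall, and F1 can be computed for a batch of labels against a golden set. It is an illustration only, not the customer's actual evaluation code: the function name, the post IDs, and the boolean "is this post a violation" labels are all assumptions made for the example.

```python
from typing import Dict, Hashable, Tuple


def precision_recall_f1(
    golden: Dict[Hashable, bool],
    predicted: Dict[Hashable, bool],
) -> Tuple[float, float, float]:
    """Score predicted labels against a golden set (hypothetical helper).

    Both arguments map an item ID to True (violation) or False (benign).
    Only items present in both mappings are scored.
    """
    shared = golden.keys() & predicted.keys()
    tp = sum(1 for k in shared if predicted[k] and golden[k])      # correctly flagged
    fp = sum(1 for k in shared if predicted[k] and not golden[k])  # wrongly flagged
    fn = sum(1 for k in shared if not predicted[k] and golden[k])  # missed violations

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Made-up example data: four posts with trusted golden labels.
golden = {"post_1": True, "post_2": False, "post_3": True, "post_4": True}
labels = {"post_1": True, "post_2": True, "post_3": False, "post_4": True}

precision, recall, f1 = precision_recall_f1(golden, labels)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision penalizes benign posts that were wrongly flagged, recall penalizes violations that were missed, and F1 is the harmonic mean of the two, so a dataset has to do well on both to score highly.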

As a further testament to the data quality that Surge produces, the customer told us that our labelers identified misinformation more effectively than professional fact checkers.

In the last year, we’ve delivered over 50 million high-quality labels to the customer across hateful speech, misinformation, spam, and a variety of AI and NLP use cases.

Want to Chat?

Need high-quality data labeling? Let’s chat.

Jefferson Lee

Jefferson leads Surge AI's data labeling and NLP products — whether it's helping customers label their large language models, gather data to train Spam and Hate Speech classifiers, or run large-scale search evaluations. He was previously an early engineer on Airbnb's Trust and Safety ML team, and studied computer science at Harvard.
