Toxicity, Misinformation, and Spam

Leading Social Media Company


Social Media


Content Moderation, Quality Controls, Dedicated Project Management, Linguistic Expertise, Multiple Languages, APIs, On-Demand Scaling

One of the world’s largest social media platforms needed to improve their ML models for filtering hateful speech, misinformation, and spam.

They were inundated with user-generated data and needed a data labeling solution that could:

  • Generate millions of nuanced judgements per month across multiple domains — hateful speech, misinformation, and spam.

  • Provide a workforce of highly skilled labelers with a deep understanding of cultural norms and current events in a range of locales.

  • Provide labelers fluent in English, Portuguese, Spanish, French, German, Italian, Japanese, Arabic, Turkish, and Mandarin.

With Surge AI data, the customer doubled the quality of their datasets, sped up their data pipelines by 10x, and improved the AUC of their models by 55%.
Surge AI delivered 50 million labels over the past year at higher quality than professional fact-checkers.

Data Labeling Vendor Selection

After conducting a vendor evaluation across 10 solutions, the customer selected Surge AI as their data labeling partner. The choice was easy — the customer’s evaluation process revealed that Surge produced 62% higher accuracy. Surge improved quality and efficiency while reducing cost and operational overhead.

To meet the customer's needs, we built custom labeling teams, designed an intuitive labeling workflow, and dedicated a project manager to oversee the partnership.

Custom Data Labeling Teams

The customer had unique and nuanced criteria for assessing toxicity, misinformation, and spam, so we created custom labeling teams of Surgers exceptionally well-suited to their task. Surgers had to score a 98% or higher on our training exams to qualify for the customer’s projects.
We created one custom labeling team for each combination of domain and language, 30 teams in total. These teams continued to grow over time and delivered 1M+ labels per week.

Flexible Data Labeling Workflow

Our work with this customer spanned multiple teams, each with its own operational style. Fortunately, Surge offers both a drag-and-drop template creator suited to non-technical employees and a fully featured API designed for ML teams.
Our labeling interface allowed us to meet the customer’s specific data collection requirements, including:
  • Multiple choice questions
  • Checkboxes
  • Free response
  • NER tagging
  • File upload (for screenshots)
  • Conditional logic
Our custom widgets allowed content (in this case, social media posts) to be embedded on our platform, which improved Surger accuracy, efficiency, and output.
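To make the idea of conditional logic concrete, here is a minimal sketch of how a labeling template with follow-up questions might be modeled. The `Question` class, the `show_if` field, and the example prompts are all hypothetical names chosen for illustration; they are not Surge's template format or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Question:
    """One question in a hypothetical labeling template."""
    key: str
    kind: str  # e.g. "multiple_choice", "checkboxes", "free_response", "file_upload"
    prompt: str
    options: list = field(default_factory=list)
    # Conditional logic (assumed design): show this question only when an
    # earlier answer equals the given value, expressed as (key, value).
    show_if: Optional[tuple] = None


def visible_questions(questions, answers):
    """Return the questions a labeler should see, given the answers so far."""
    visible = []
    for q in questions:
        if q.show_if is None:
            visible.append(q)
            continue
        key, value = q.show_if
        if answers.get(key) == value:
            visible.append(q)
    return visible


# Illustrative template: follow-ups appear only when misinformation is flagged.
template = [
    Question("is_misinfo", "multiple_choice",
             "Does this post contain misinformation?", ["yes", "no", "unsure"]),
    Question("evidence", "free_response",
             "Which claim is false, and why?", show_if=("is_misinfo", "yes")),
    Question("screenshot", "file_upload",
             "Upload a screenshot of a supporting source.",
             show_if=("is_misinfo", "yes")),
]

keys_no = [q.key for q in visible_questions(template, {"is_misinfo": "no"})]
keys_yes = [q.key for q in visible_questions(template, {"is_misinfo": "yes"})]
print(keys_no)
print(keys_yes)
```

In this sketch, a labeler who answers "no" sees only the first question, while a "yes" answer reveals the free-response and file-upload follow-ups.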

Dedicated Project Manager

To ensure success for our customer, we assigned them a dedicated project manager from our team. We met with the customer weekly to review edge cases and collaboratively improve the quality of data they received.
The project manager was responsible for communicating with both the customer and the Surgers working on the project — creating a two-way communication channel for questions, feedback, and iteration.
The project manager also monitored data quality in real time, providing an additional layer of quality assurance before the customer received data.
The customer often remarked that, thanks to this hands-on involvement, our project manager understood the nuances of the projects better than the customer did.

Double the Data Quality

After switching to Surge AI, the customer doubled the quality of their datasets, as measured by precision, recall, and F1 score on internal golden sets.
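For readers unfamiliar with these metrics, the following is a minimal sketch of how labels can be scored against a golden set. The function name, the "toxic"/"ok" label values, and the toy data are assumptions made for illustration; this is not the customer's evaluation code.

```python
def precision_recall_f1(gold, predicted, positive="toxic"):
    """Score predicted labels against golden-set labels for one positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up golden-set labels and labeler output, for illustration only.
gold = ["toxic", "ok", "toxic", "ok", "toxic", "ok"]
predicted = ["toxic", "ok", "ok", "toxic", "toxic", "ok"]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Precision penalizes false alarms, recall penalizes misses, and F1 is their harmonic mean, so a dataset has to do well on both to score highly.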

As a result, they were able to boost the AUC of their ML models by 55%. In a world where even a 1-2% AUC improvement is celebrated, these results are monumental, and a vivid example of how fundamental data quality is to model performance.
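As background on the metric, AUC can be read as the probability that a model scores a randomly chosen positive example above a randomly chosen negative one. The sketch below computes it directly from that definition; the function name and the toy labels and scores are invented for illustration and say nothing about the customer's models.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n^2), which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = violating post, 0 = benign post; scores are model outputs.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A model that ranks at random scores about 0.5 and a perfect ranker scores 1.0, which is why gains of even a point or two are usually considered meaningful.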
As a further testament to the data quality that Surge produces, the customer told us that our labelers identified misinformation more effectively than professional fact-checkers.
In the last year, we’ve delivered over 50 million high-quality labels to the customer across hateful speech, misinformation, spam, and a variety of AI and NLP use cases.
