Start Your First Data Labeling Project

In this post, we walk step by step through creating and launching a labeling project on the Surge AI platform. We'll be using Surge's toy dataset, "Sample Dataset: Movie Reviews," so if you want to follow along with us, create an account here. It's free!

On your project dashboard, you can start a new project by clicking either the "Create Project" tab or the "New Project" tab.

You will be taken to the Project Properties Page. This is where you define these six elements of your project: 

1. Project name - Name your project!

We'll call ours "Movie Reviews."

Note: This is a public name, and qualified workers will see its title.

2. Data - Upload your CSV!

Each row on your spreadsheet will be displayed as a task for workers. We're going to use the toy dataset, "Sample Dataset: Movie Reviews," which you can select from the drop-down.

Note: If you need help formatting your CSV, let us know: [support@surgehq.ai]
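To make the "one row, one task" idea concrete, here is a minimal sketch of what such a CSV could look like and how its rows map to tasks. The column names mirror the {{movie}}, {{review}}, and {{image}} variables used later in this post; the rows themselves and the example.com URLs are made up for illustration and are not the actual contents of the sample dataset.

```python
import csv
import io

# Hypothetical sample data. The column names are assumed to match the
# {{movie}}, {{review}}, and {{image}} template variables; the rows are invented.
SAMPLE_CSV = """movie,review,image
Example Movie A,"Loved it, would watch again.",https://example.com/a.png
Example Movie B,Too long and not very funny.,https://example.com/b.png
"""


def load_tasks(csv_text):
    """Parse CSV text into a list of dicts, one dict per row (task)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


tasks = load_tasks(SAMPLE_CSV)
for number, task in enumerate(tasks, start=1):
    print(f"Task {number}: {task['movie']} -> {task['review']}")
```

Note that a field containing a comma, like the first review, needs to be wrapped in double quotes so it stays in a single column.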

3. Who will be working on this project - Use your own workers or our in-house workforce.

4. Payment per response - We recommend calculating payment on a time-to-complete basis.

We've calculated it will take about a minute to complete a task in our movie review dataset, and we'd like to pay $15/hr, so we'll pay $0.25 per response.
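If you'd like to run the same time-to-complete calculation for your own project, a small sketch like the one below may help. The function name and its parameters are our own illustration, not part of the Surge platform.

```python
from decimal import Decimal, ROUND_HALF_UP


def payment_per_response(hourly_rate, seconds_per_task):
    """Convert a target hourly rate and a time estimate into a per-response payment.

    Hypothetical helper for illustration; the result is rounded to the nearest cent.
    """
    rate = Decimal(str(hourly_rate))
    seconds = Decimal(str(seconds_per_task))
    payment = rate * seconds / Decimal(3600)  # 3600 seconds in an hour
    return payment.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# $15/hr at about one minute per task
print(payment_per_response(15, 60))  # 0.25
```

Using Decimal rather than floats keeps the result exact to the cent, which matters once the amount is multiplied across many responses.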

5. How many workers should label each row of data? - Showing each task to multiple workers unlocks additional quality tools, such as inter-rater agreement. Let's go with 100 workers per row; our dataset has 15 rows, so that's 15 × 100 = 1,500 labels in total.
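The sketch below works through the numbers for this project (15 rows, 100 workers per row, $0.25 per response) and shows one very simple way to think about agreement on a single row. Both helpers are hypothetical: the total covers worker payments only, and the agreement measure (the share of responses matching the most common label) is a simplification, not necessarily the metric the platform reports.

```python
from collections import Counter
from decimal import Decimal


def project_totals(rows, workers_per_row, payment_per_response):
    """Return (total labels, total worker payment) for a project."""
    total_labels = rows * workers_per_row
    total_payment = Decimal(str(payment_per_response)) * total_labels
    return total_labels, total_payment


def agreement_rate(labels):
    """Share of responses for one row that match the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)


total_labels, total_payment = project_totals(
    rows=15, workers_per_row=100, payment_per_response="0.25"
)
print(total_labels, total_payment)  # 1500 375.00

# Four made-up responses for a single row
print(agreement_rate(["positive", "positive", "negative", "positive"]))  # 0.75
```

Running the totals before launch is a quick way to sanity-check the budget: changing the number of workers per row scales both the label count and the payment linearly.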

6. Qualifications - Workers on the Surge platform are given qualifications according to their specific skills and knowledge. When you add a qualification to your project, only workers with that qualification can work on your data.

These movie reviews are in English, so let's add the "English - Fluent" qualification and "require" it for our workers (we could also exclude workers with this qualification by clicking "forbid").

Note: As tasks increase in complexity or nuance, apply additional qualifications to ensure you have the right workforce for your project. See our tutorial on building a Custom Qualification here.
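Conceptually, "require" and "forbid" act as two filters on the worker pool. Here is a minimal sketch of that logic, assuming each worker has a set of qualification names. The worker names, the second qualification, and the function itself are invented for illustration and do not reflect how the platform is implemented.

```python
def eligible_workers(workers, required=(), forbidden=()):
    """Return the names of workers who hold every required qualification
    and none of the forbidden ones.

    `workers` maps a worker name to a set of qualification names.
    """
    required, forbidden = set(required), set(forbidden)
    return [
        name
        for name, quals in workers.items()
        if required <= set(quals) and not (forbidden & set(quals))
    ]


# Hypothetical worker pool
workers = {
    "worker_a": {"English - Fluent"},
    "worker_b": {"English - Fluent", "Hypothetical Qualification"},
    "worker_c": set(),
}

print(eligible_workers(workers, required=["English - Fluent"]))
# ['worker_a', 'worker_b']
print(eligible_workers(workers, forbidden=["English - Fluent"]))
# ['worker_c']
```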

Now we're ready to move on and format our instructions, data, and questions. Let's click continue!

Template Editor

You'll be taken to the template editor. This is where you format the instructions, data, and questions shown to the workers.

The Template Editor has three components: 

  • Instructions
  • Display Task Data
  • Questions

Instructions

Use our UI to create instructions, or insert a link to instructions hosted somewhere else, like a public Notion page or a Google Doc.

These buttons -- {{movie}}, {{review}}, and {{image}} -- allow us to easily insert variables from our CSV anywhere in the task. You will see different options depending on the variables in your CSV.

Tip: Creating clear, concise, and easily readable instructions is essential to the success of a project. Click here for our guide to creating instructions.

Display Task Data

This is where you format how the data is represented to the workers.

For each task, we want our workers to see a movie review and the movie's title, so we'll inject those variables from our CSV with {{movie}} and {{review}}.
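If it helps to picture what happens to those variables, the sketch below fills {{name}} placeholders from a single CSV row. It is only an illustration of the idea, assuming plain text substitution; the platform's actual rendering may differ, and the row data is made up.

```python
import re

# Matches placeholders such as {{movie}} or {{ review }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, row):
    """Replace each {{column}} placeholder with that column's value from `row`.

    Placeholders with no matching column are left untouched.
    """
    def substitute(match):
        key = match.group(1)
        return str(row.get(key, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical row from the CSV
row = {"movie": "Example Movie A", "review": "Loved it, would watch again."}
print(render("Movie: {{movie}}\nReview: {{review}}", row))
```

Each task is the same template rendered with a different row, which is why the variable names have to match the CSV column headers.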

Questions

This is where you create and format the questions you will pose to your workers about your data.

Questions can take one of these seven formats:

  • Multiple Choice - Standard multiple-choice.
  • Checkboxes - Select one or several answers from a list that you define.
  • Free Response - A text box for workers to write a response.
  • Text Area - This is not a question. Selecting this option creates a text area where you can add additional instructions.
  • Bounding Boxes - Label objects using bounding boxes.
  • Image Annotation - Label images using more specific polygonal structures.
  • Named Entity Recognition - Text annotation.

For our movie review dataset, we'll have our workers answer a Multiple Choice question on the review's sentiment, then ask them to label positive and negative words in the review with the NER tool.

Notice that we've inserted the review into the NER box again with {{review}}.

We want our workers to identify faces in the screenshot we have in our dataset, so we'll add a Bounding Box question and insert our image with {{image}}. Let's also add a text area above the image to remind them what we're asking here.

Finally, we'll add a free response question asking them to write a review of their own. Let's make this question optional by toggling the slider at the bottom left of the box.

Now that we're done, let's see what a task will look like by clicking "preview how the task will appear to workers" at the top of the screen.

Looks pretty good! Now we can navigate back to the project page and review our project before adding funds and launching.

If we decide we want to build quality controls into our project, we should create Gold Standards and custom qualifications before we launch; see the links below for step-by-step guides.


  • Creating Custom Qualifications
  • Creating Gold Standards

...

If you have any questions about starting a project, reach out to us: [support@surgehq.ai]