Imagine you’re a dying app. The cool kids used to adore you, but they’ve grown up and the new generation has moved on. The only ones left are your parents and boomers.
Luckily, you still have piles of cash and a crack team of data scientists. Surely you can A/B test your way to coolness and the hearts of teenagers again?
So you pick a metric to optimize – video watch time – and set your teams to work. Your engineers lock themselves in a war room, and emerge with 10 experiments designed to increase it. Time passes. Your exec team sees more and more time spent watching videos! They clap their hands and treat themselves to a day of golf.
And yet each time you launch a change that boosts all your metrics, your 16-year-old daughter only rolls her eyes more and more…
The problem: your app is only used by grandmothers. Your user base loves videos full of dad humor and posts for Metamucil – the exact opposite of teenage cool. So experiments that boost cringe content and annoying baby videos increase watch time; clips of high schoolers lip syncing to Dua Lipa decrease it. You’re stuck in the short-term optimization valley of A/B tests.
What’s a desperate social media CEO to do?
Meta’s Metrics Paradox
Meta is facing existential challenges on all fronts – its first ever quarterly revenue decline, a shift to an unknown Metaverse, and the biggest threat of all: TikTok.
Evidence of the thrash in Menlo Park: a few weeks ago, it launched its standard weapon – a copycat redesign to make Instagram more TikTok-like – only to walk it back after user backlash.
Meta has always been incredibly analytical. A long-time Facebooker once told me that Zuckerberg doesn't trust his intuitions around human preferences, so he puts all his faith in data instead.
And yet this data-driven focus on metrics may be the problem. Are Meta’s metrics measuring the right thing? If not, what does it mean when their metrics go up?
- Imagine a search engine that optimizes for clicks. Surely, users clicking on search results is a sign of success! But do you really want Buzzfeed-style clickbait to rise to the top of the SERP? What do you think users will click on more when they search for "joe biden": conspiracy theories from the Daily Sun, or the facts of the day?
- In 2018, Facebook started optimizing its News Feed for likes and comments between your friends and family – in theory, a way to boost healthy engagement. In practice, people love engaging with toxic and misinformative content, and the change ended up boosting that content more. After all, what do you associate with Facebook today: posts that make you a better, happier person, or political rants from your uncle?
Measuring video views and watch time is easy; understanding human preferences is hard. Just because an experiment increases video watch time, does it mean that it was successful at showing you delightful videos that make you laugh and teach you new things? Or simply that it was better at showing low-grade videobait?
Let’s take a lesson from Search. Modern search engines don’t decide which experiments to launch by picking the ones that generate the most clicks. They’d degenerate into The Daily Mail if they did. Instead, they send tens of thousands of query-result pairs to large-scale teams of human raters, who score their relevance and quality directly.
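To make the contrast concrete, here is a minimal sketch of the two launch rules: pick the experiment arm with the most clicks, or pick the arm that human raters score highest. The arm names, field names, and every number below are invented for illustration; this is not any search engine's actual pipeline.

```python
from statistics import mean

def pick_by_metric(arms, metric):
    """Return the name of the arm with the highest average value of `metric`."""
    return max(arms, key=lambda name: mean(row[metric] for row in arms[name]))

# Hypothetical per-session data: clicks logged by the product, and a 1-5
# quality rating assigned by a human rater. All values are made up.
arms = {
    "control": [
        {"clicks": 3, "rating": 4},
        {"clicks": 2, "rating": 5},
        {"clicks": 4, "rating": 4},
    ],
    "clickbait": [
        {"clicks": 6, "rating": 2},
        {"clicks": 5, "rating": 1},
        {"clicks": 7, "rating": 2},
    ],
}

click_winner = pick_by_metric(arms, "clicks")   # what an engagement metric would launch
rating_winner = pick_by_metric(arms, "rating")  # what human raters would launch

print(f"Launch by clicks:  {click_winner}")
print(f"Launch by ratings: {rating_winner}")
```

In this toy data the two rules disagree: the clickbait arm wins on clicks while the control arm wins on rated quality, which is exactly the gap that human evaluation is meant to expose.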
What if News Feeds and discovery engines did the same? If Instagram could easily ask 1000 human raters to personally evaluate two versions of their feed – one with the TikTok inspiration, and one without – would it have launched its TikTok clone?
We decided to run this experiment ourselves! Human evaluation of AI systems is a flagship application of our platform, and we asked 100 Surgers to compare the content that TikTok and Instagram Reels delivered into their feeds.
Which app won in the battle of short-form video content, and why? Let’s dive in.
Injecting A Human Touch: Evaluation of TikTok vs. Instagram Reels
We asked 100 Gen Z Surgers to open up TikTok and Reels. They then rated the first 10 videos on each app along several dimensions (e.g., interestingness, humor, serendipity, and diversity) and scored their overall experience.
The result: 9.6x as many users preferred TikTok to Instagram Reels!
- 61% found TikTok much better
- 16% found TikTok slightly better
- 15% found them about the same
- 3% found Instagram Reels slightly better
- 5% found Instagram Reels much better
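The 9.6x headline figure follows directly from the breakdown above. The short sketch below reproduces the arithmetic; the dictionary keys are just labels chosen for this example, and the percentages are the ones reported in the list.

```python
# Reported preference breakdown, in percent of raters.
preferences = {
    "tiktok_much_better": 61,
    "tiktok_slightly_better": 16,
    "about_the_same": 15,
    "reels_slightly_better": 3,
    "reels_much_better": 5,
}

# Sanity check: the five buckets should account for every rater.
total = sum(preferences.values())

preferred_tiktok = preferences["tiktok_much_better"] + preferences["tiktok_slightly_better"]
preferred_reels = preferences["reels_slightly_better"] + preferences["reels_much_better"]

# How many raters preferred TikTok for each rater who preferred Reels.
ratio = preferred_tiktok / preferred_reels

print(f"Preferred TikTok: {preferred_tiktok}%")
print(f"Preferred Reels:  {preferred_reels}%")
print(f"Ratio: {ratio:.1f}x")
```

Note that the ratio compares only raters who expressed a preference; the 15% who found the two apps about the same are excluded from both sides.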
Surgers Explain TikTok’s Popularity: “I laughed and cried and smiled practically non-stop”
Why was TikTok so much better? Here are Surgers explaining in their own words.
Surger: Jesus F.
I feel like TikTok's algorithm understands what I like a lot more than Instagram reels. Instagram reels show me a lot of 'garbage' content. Lots of old comedy skits with humor that was relevant on Tiktok months or even years ago.
As a part of Gen Z, I relate a lot more with TikTok creators in general. Even the ads on Tiktok don't bother me as much.
Instagram Reels is slowly becoming an inside joke for people my age. When I just want to relax and find funny and genuinely enjoyable content, Tiktok is my first choice.
Surger: Emily S.
TikTok showed me content that was related to my interests, humor, and culture/location. Instagram showed me generic popular videos with thousands of likes.
Several Instagram videos were performative and annoying - including, for example, ridiculous content of a woman spinning on a pottery wheel or one woman's monthly expenses living in the Bay Area ($10,340) with the caption "being alive is expensive" (which is insensitive to people who struggle to make a living).
While I have not been on TikTok in about a year, my "for you" page still gave me content that was related to me and my interests. In a ~5-minute span, I liked plenty of videos on TikTok, while I did not like any videos on Instagram.
I also laughed out loud at one of the TikToks, which surprised me!
Surger: Nicholas H.
While both offer quick videos, Instagram Reels lacks substance. There were many reels that were short, but with no plot - no release of tension, explanation, or punchline. They were simply something to watch. Tiktok, in contrast, offers both short and medium-ish format videos that take you on a journey.
Instagram reels seem like bad cropped/bad edited versions of tiktoks. Physically scrolling through reels is more difficult and the sensitivity there isn’t as smooth as tiktok. Reels feels like a cheap copy, in all ways.
Content creators also have more of an incentive to post to tiktok than reels, where tiktoks go to die.
Surger: Sharon H.
The videos shown to me on Instagram Reels had a boring sameness to them. TikTok videos were more varied.
Instagram Reels seem more interested in showing me videos that it perceives as "cool", which did not interest me at all and felt off-putting. In contrast, TikTok seems more interested in showing me videos that I would actually like.
Surger: Elizabeth M.
Tik Tok has a much better sense of what I'm into! I got a lot of comedy, beauty, dog content, 'day in my life', etc. Those are all the things I would normally look for while on these kinds of apps!
Instagram's content was definitely more generic. There was a lot of celebrity content, etc that seemed to just be content for my demographic (20-30 year old females) instead of 'for you'. There were also lots of mom blogs etc, but I'm not a mother.
Tik Tok felt more 'easygoing', like the content was light and there to make you laugh, although I did get some news stories.
I also felt like the quality (like video quality itself) was higher on Tik Tok. Several of the Insta Reels I got were grainy/blurry.
I also saw a lot of reused content on Instagram Reels that I saw on Tik Tok a long time ago.
Surger: Cynthia M.
I spent almost 20 minutes trying to find one video that made me laugh on reels. While on TikTok I found many that were interesting and funny.
Reels had a lot of the same unfunny content that I did not enjoy over and over. Most of them were tiktoks from months ago that people were just reposting as reels. I finally found one reel that made me smile just a little before I gave up.
Reels does not seem to have the quality videos that TikTok has. I was very bored on it. Just swiping and swiping trying to find something funny. On the other hand TikTok has figured me out and has a plethora of videos that I love, from funny animals to conspiracy and mundane facts to dark humor.
Surger: Georgeia M.
I find that most of the videos on TikTok are funny, or they move me in some way. I don't laugh all that easily, but TikTok always has something to make me laugh.
I find that Instagram is much more commercial. Most of the reels are ads in disguise or influencers promoting themselves. Instagram is simply not as genuine as TikTok.
It seems like TikTok now is what Instagram used to be, before it was monetized to the point where very little is genuine on the platform.
Clearly TikTok has its share of influencers as well, but most of the videos are from regular people creating content and playing about with it rather than celebrities and companies like Instagram is. I find TikTok more relatable and less fake.
Surger: Vicki S.
TikTok was so much more entertaining and just enjoyable entirely as a whole. I laughed and cried and smiled practically non-stop while on TikTok. I actually even lost track of time on it.
On Instagram reels I kept trying to see if I had spent enough time yet. It was actually pretty boring, probably because I've used TikTok more and so it was more populated with things I'd look at, whereas Instagram seemed to be all travel destination focused or things that looked like ads. There was nothing funny or emotional. Really it was all just basically informational clips. I prefer to use TikTok much more if I have any time to kill.
Breaking down TikTok’s advantages
Let’s break down why TikTok is winning. When users rated TikTok as slightly better or much better:
- 55% of the time, they called out TikTok’s better recommendation algorithms, compared to the generic content that Instagram surfaced.
- 38% of the time, they called out the way TikTok showed them a greater freshness and diversity of content, and seemed to have a much greater catalog of high-quality videos.
- 32% of the time, they loved how TikTok was funnier and more lighthearted, and described the emotions it elicited.
- 31% of the time, they enjoyed the way TikTok introduced them to new creators and new content they otherwise wouldn’t have discovered.
- 28% of the time, they preferred TikTok’s UI.
- 18% of the time, they found TikTok videos more substantive.
- 15% of the time, they found Instagram content too commercial and influencer-heavy, and disliked Instagram’s ads.
- 10% of the time, they disliked the old, recycled TikToks they found on Instagram.
- 7% of the time, they enjoyed TikTok’s comments more than Instagram’s.
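The percentages above add up to more than 100% because a single explanation can mention several reasons at once. A minimal sketch of that kind of multi-label tally is shown below; the tag names and the four sample responses are hypothetical and are not the actual survey data.

```python
from collections import Counter

def reason_shares(tagged_responses):
    """Percentage of responses that mention each reason tag.

    Each response is a collection of tags, so one response can count
    toward several reasons and the shares can sum to more than 100.
    """
    counts = Counter(tag for tags in tagged_responses for tag in set(tags))
    total = len(tagged_responses)
    return {tag: 100 * n / total for tag, n in counts.most_common()}

# Hypothetical tagged explanations from raters who preferred TikTok.
responses = [
    {"recommendations", "humor"},
    {"recommendations"},
    {"freshness", "ui"},
    {"recommendations", "freshness"},
]

shares = reason_shares(responses)
for tag, pct in shares.items():
    print(f"{pct:.0f}% of the time: {tag}")
```

Each share is computed against the number of responses, not the number of tags, which is why the reasons overlap rather than partition the raters.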
How could Instagram fix these issues and catch up?
Part of Reels' problem is its limited content. Yet an even bigger problem is the way it optimizes its algorithms. Instead of depending on low-quality clicks, what if the Reels team regularly asked sets of millennial Instagram raters to label which algorithms lead to fresher and more delightful sessions, which videos make them laugh, and which feeds serendipitously surface new creators that inspire them – and then trained its discovery systems on these rich, human-powered datasets? Could Instagram capture TikTok's magic then?
These techniques are possible – just consider how YouTube made parts of its site wholesome and funny too.
A Visceral Look at Amazing and Terrible Videos
Let’s look at example videos to get a more visceral understanding of what’s going on. Here are a few TikToks that Surgers loved and rated at the top of the scale.
I really enjoyed this video because I love dogs. I love that it was a cute, heartwarming video about a dog patiently waiting for its owner and getting all excited when its owner came down the bus. I like that it is a really simple video without any embellishments. It just made me happy to watch a video like this.
I follow a lot of cooking content creators because cooking is one of my main hobbies. This guy is a creator that I really like who appeared on my For You page one day, and he did a nice video making Swedish meatballs from a series he is doing on comfort food from around the world. I love this because it introduces me to new types of food that I’d never discover otherwise!
This is just an adorable video (and yes there is a celebrity in it, but this is not promotional). The little boy is spot on in his dancing, it is wholesome and it makes you smile. The video does appear to be genuine and spontaneous rather than staged. It is one of those videos that are good to start your morning with, because it makes you feel good, and not think too hard. Just good fun.
She talks about a controversial topic, showing numbers and facts (I don’t think I’ve ever seen that on Instagram). But also she talks about that topic from her own experience as an insider. Overall good content to think, reflect and form an opinion, not just scroll endlessly.
It was very quick and somewhat predictable (in a good way). But it wasn't a random outburst of "comedy" – it had setup, tension, and a release of tension, like it was intuitively following the best comedians, rather than cheap, lowbrow humor. It's like TikTok is Game of Thrones and Instagram is Here Comes Honey Boo Boo.
And here are a few Instagram Reels that Surgers disliked.
This video was annoying because she was mouthing the words to a song, and it felt performative and attention-seeking. She's also on the floor in a nice outfit with hair and makeup done, while she is supposedly about to paint the floor. I feel like on TikTok, if I were to see this video, it would be someone ACTUALLY about to paint the floor (and be very good at it), not some silly influencer.
I didn't understand the purpose of the video, why she was painting the floor, and why it's such a big deal to defy her husband and realtors. And then I looked at the caption and apparently she was kidding! I don't understand the thought process behind the video.
I did not like this video because parenting videos are not something that I enjoy watching. I hate baby videos, and I did not find this video entertaining or informative at all. I am not a parent and do not plan to be one, so I personally do not understand the supposed humor in this video. The “Edit to add” caption also seems pretentious and annoying.
A canonical Instagram Reel. The video specs don’t even fit the screen and the resolution of the video was terrible. Also it has text all over the content. Looks like the junk you see on Facebook Watch.
I don't like that this video is spreading sensational lies. Propaganda on social media is rampant now. I don't like how falsehoods are spread so easily on the internet, but the truth doesn't seem to be as contagious. I don't mind people making controversial points as long as there's evidence to back them up. This showed a clip of Fox News criticizing the FBI while the FBI was simply doing their job. The video is far too biased for me.
This is the exact type of content that makes me avoid Instagram Reels more and more these days.
The girl in this video is claiming the gym she attends is superior in a way, because Sandra Bullock goes there. But in the next part we see the girl who walks behind her is just a girl who looks like Sandra Bullock, so it's not actually the actress.
Which makes the whole video completely pointless and absurd. Not funny or informative in any way. May be entertaining for the kind of people who like obvious humor and laugh tracks, but to me it just feels fake, especially since the whole goal is to get you to “follow @bodysmartfitness”.
It's not even that this video is bad quality, it's just that it's so old. This video itself is a few years old, but it's also been a 'sound trend' on Tik Tok for a while, so I'm just tired of hearing it. I think that's one big complaint most people have about Instagram Reels - it's 'recycled' content!
TikTok: wholesome and full of laughter. Reels: performative and recycled.
Let's think about these examples. On TikTok, Surgers found wholesome videos that make them smile, cooking videos that introduce them to new cuisines around the world, and videos with actual movement and plot. On Instagram, they found attention-seeking influencers, recycled content, baby videos, and propaganda.
The problem: these are exactly the videos that Instagram's remaining users come for!
Imagine, for example, running across an Instagram video introducing new cuisine from around the world. Is it more likely to be genuine content that teaches you something new about that food and culture – or a beauty influencer photoshopping herself into exotic locales? Which do you think Instagram users would click on more?
It's time to move beyond low-grade engagement-bait metrics. If Instagram wants to capture TikTok's magic, it needs to jump outside its short-term optimization valleys, and optimize for the human values that make us laugh, and cry, and fall in love again every time.
Where's the human touch?
At Surge AI, we're a data labeling platform for the richness of AI – whether it’s training large language models to align with human values, evaluating Search algorithms for relevance and quality, or building content moderation datasets that tease out the nuances of language.
Interested in learning more about human evaluation and social media? Follow us on Twitter @HelloSurgeAI, and check out our other blog posts!