On his personal blog, Remains of the Day, former Amazon and Oculus product executive Eugene Wei wrote Seeing Like an Algorithm, an in-depth analysis of the TikTok algorithm explaining exactly what makes it so special (and performant). To help those of you who have been trying to understand TikTok’s recommendation engine, we rounded up the key takeaways from Wei’s analysis. Here’s how the design of TikTok helps its algorithm work as well as it does.
Although many link TikTok’s impressive success to its For You page, which isn’t wrong in itself, the video-sharing app relies first and foremost on its algorithm. After all, without the perfect algorithm, even the For You page would collapse. “Understanding how the algorithm achieves its accuracy matters even if you’re not interested in TikTok or the short video space because more and more, companies in all industries will be running up against a competitor whose advantage centres around a machine learning algorithm,” explains Wei.
There, we’ve said it: TikTok’s actual machine learning (ML) recommendation algorithm isn’t out of the ordinary. As Wei explains, “Most experts in the field doubt that TikTok has made some hitherto unknown advance in machine learning recommendations algorithms. In fact, most of them would say that TikTok is likely building off of the same standard approaches to the problem that others are.”
While this statement might throw you off a bit, keep in mind that the effectiveness of an ML algorithm isn’t a function of the algorithm alone but of the algorithm after it is trained on some dataset. Once trained on an enormous volume of data with a massive number of parameters, its output is often mind-blowing.
The TikTok For You page algorithm, trained on its dataset, is remarkably accurate and efficient at matching videos with those who will find them entertaining. It is also just as good at suppressing the distribution of videos to those who won’t find them entertaining.
TikTok needed an algorithm that would excel at recommending short videos to viewers, and when it first launched, “no such massive publicly available training dataset existed.” Even though you could find short videos of memes, kids lip-synching, dancing, cute pets—and on and on—you weren’t able to find comparable data on how the general population felt about such videos. “Outside of Musical.ly’s dataset, which consisted mostly of teen girls in the US lip-synching to each other, such data didn’t exist,” writes Wei.
Knowing that the app faced the problem in which the very types of video its algorithm needed to train on weren’t easy to create without the app’s camera tools and filters, how did TikTok manage to get the most valuable inputs possible for its algorithm?
TikTok’s design offers only one video at a time with a number of indicators as to whether or not the user likes it (length of viewing, re-watches, likes, comments, song choice, video subject, shares). This closed-loop of feedback then inspires and enables the creation and viewing of videos on which its algorithm can be trained. It’s as simple as that: “for its algorithm to become as effective as it has, TikTok became its own source of training data,” clarifies Wei.
Typically, user experience (UX) design is meant to be user-friendly, and the dominant school of thought in tech has centred on removing friction for users in accomplishing whatever it is they’re trying to do. To improve its algorithm, however, TikTok has made its product a tiny bit less user-friendly: scrolling past multiple pieces of content at once on apps such as Twitter or Facebook is a more ‘frictionless’ experience than being shown a single video at a time, as on TikTok.
When you open the TikTok app, it takes you straight to the For You page and goes right into a video, which fills your entire screen. This is not a scrolling feed. It’s paginated, effectively, explains Wei. The video autoplays almost immediately (and the next few videos are loaded in the background so that they, too, can play quickly when it’s their turn on stage).
This design introduces the user to an immediate question: how do you feel about this short video and only this short video? If you watch it more than once, then you liked something about it. If you shared it with someone, then you must have felt something special there too. If you tapped on the spinning track and watched more videos using that song, this is also an indicator that something in that specific video appealed to you. All the information listed above is what TikTok’s algorithm ‘feeds’ on.
With such clear signals—whether positive or negative—TikTok can quickly understand a user’s preference and serve up more similar content. This in turn creates a tight feedback loop and kicks off the flywheel that continually improves TikTok’s recommendations and data inputs.
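To make the idea of the algorithm ‘feeding’ on these signals concrete, here is a minimal sketch of how the actions from a single view could be collapsed into one number. This is not TikTok’s actual scoring: the `ViewEvent` schema, the `engagement_score` function and the weights are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ViewEvent:
    """One user's interaction with one full-screen video (hypothetical schema)."""
    watch_seconds: float
    video_seconds: float
    liked: bool = False
    commented: bool = False
    shared: bool = False
    tapped_sound: bool = False


# Illustrative weights only; a real system would learn these rather than hand-set them.
WEIGHTS = {"liked": 1.0, "commented": 1.5, "shared": 2.0, "tapped_sound": 1.0}


def engagement_score(event: ViewEvent) -> float:
    """Collapse the signals from a single view into one number."""
    # A completion ratio above 1.0 means the video looped, i.e. a re-watch.
    completion = event.watch_seconds / event.video_seconds
    # Cap the ratio so that a phone left looping on a table does not dominate.
    score = min(completion, 3.0)
    for name, weight in WEIGHTS.items():
        if getattr(event, name):
            score += weight
    return score


rewatched_and_shared = ViewEvent(watch_seconds=30, video_seconds=15, shared=True)
skipped = ViewEvent(watch_seconds=1.5, video_seconds=15)

print(engagement_score(rewatched_and_shared))  # 4.0
print(engagement_score(skipped))               # 0.1
```

Because every event belongs to exactly one video, each score can be attached to that video’s attributes without any guesswork about what the user was actually looking at.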
But even before gaining all that information, the app’s algorithm looks at the video by itself and classifies it. It knows what the video is about, what filters and songs it uses, if there’s food in it or simply human faces, hands or gestures. The algorithm’s vision AI starts by classifying all this.
So, even before you’ve watched a specific video, TikTok’s algorithm already knows the types of videos you have previously enjoyed, the demographic or psychographic information that is known about you, where you’re watching the video, the type of device you have, and more. “Beyond that, it knows what other users are similar to you.”
After you’ve watched the video the algorithm recommended on the For You page, it can close the feedback loop: it takes every one of your actions on that video and estimates how you, with all your tastes, feel about this video, with all its attributes.
Now, how is this so different from other social media platforms such as Facebook, Twitter and Instagram?
If you compare TikTok’s UI with a traditional social feed that offers an endless scroll of content, you’ll notice that the user inputs are less clear. Instead of serving you one story at a time like TikTok, these apps display multiple items on screen at once. “As you scroll up and past many stories, the algorithm can’t ‘see’ which story your eyes rest on. Even if it could, if the user doesn’t press any of the feedback buttons like the Like button, is their sentiment towards that story positive or negative? The signal of user sentiment isn’t clean,” explains Wei.
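The attribution problem Wei describes can be sketched in a few lines. The `attribute_dwell` function below is a hypothetical illustration, not how any of these apps actually works: it assumes the only thing the app can observe is how long a screen stayed visible and which items were on it.

```python
def attribute_dwell(visible_items: list[str], dwell_seconds: float) -> dict[str, float]:
    """Spread the observed dwell time across whatever was on screen.

    Without eye tracking there is no basis for favouring one item over
    another, so the time is split evenly (an assumption for illustration).
    """
    share = dwell_seconds / len(visible_items)
    return {item: share for item in visible_items}


# A scrolling feed with four stories on screen for eight seconds.
feed = attribute_dwell(["post_a", "post_b", "post_c", "post_d"], 8.0)

# A paginated, full-screen view with one video on screen for eight seconds.
paginated = attribute_dwell(["video_x"], 8.0)

print(feed)       # {'post_a': 2.0, 'post_b': 2.0, 'post_c': 2.0, 'post_d': 2.0}
print(paginated)  # {'video_x': 8.0}
```

In the feed case the same eight seconds are diluted across four stories, any one of which might have been ignored entirely. In the paginated case all of the attention unambiguously belongs to a single video.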
In both cases, infinite scrolling feeds are ideal. However, TikTok took it one step further by allowing users to only watch one video at a time, gaining clearer data from this and then feeding it back to its algorithm, making it more competent than ever. “If you click into a text post by someone on Facebook but don’t comment or like the post, how can Facebook judge your sentiment toward that post?” asks Wei.
That’s also why some networks that are built around interest graphs, like Reddit, have incorporated downvoting mechanisms in order to serve users the most interesting content. That means weeding out uninteresting content as much as it does surfacing appealing content. Although TikTok doesn’t have a downvote button, by showing you one video at a time, it can notice your lack of interest from how quickly you swipe to the next one or which positive actions you take or don’t take.
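As a rough sketch of how a quick swipe could stand in for a downvote, the hypothetical `implicit_feedback` function below labels a view from watch time and positive actions alone. The 25% skip threshold is an arbitrary value picked for illustration, not a known TikTok parameter.

```python
def implicit_feedback(
    watch_seconds: float,
    video_seconds: float,
    positive_actions: int,
    skip_threshold: float = 0.25,  # hypothetical cut-off, not a real TikTok value
) -> str:
    """Label a single view as 'positive', 'negative' or 'neutral'."""
    completion = watch_seconds / video_seconds
    # Any like, comment or share, or watching to the end, counts as interest.
    if positive_actions > 0 or completion >= 1.0:
        return "positive"
    # Swiping away almost immediately acts as an implicit downvote.
    if completion < skip_threshold:
        return "negative"
    return "neutral"


print(implicit_feedback(2, 20, positive_actions=0))   # negative
print(implicit_feedback(10, 20, positive_actions=0))  # neutral
print(implicit_feedback(20, 20, positive_actions=0))  # positive
print(implicit_feedback(2, 20, positive_actions=1))   # positive
```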
Wei states that “Triller may pay some influencers from TikTok to come over and make videos there, Reels might try to draft off of existing Instagram traffic, but what makes TikTok work is the entire positive feedback loop connecting creators, videos, and viewers via the For You page algorithm.” You heard the man, competition will struggle before it gains the same popularity TikTok has.
TikTok, the short-form video app that reached more than 2 billion downloads in April 2020, has become the number one social network for gen Z. And while there are a lot of issues with TikTok, from its racist content moderation to its complicity in spreading hate and violence in India, the app has an even bigger problem it won’t be able to avoid for long: its ties to China. We look at the app and the data privacy problems it presents to explain why you should be wary of what you post on it.
TikTok originated as Musical.ly, a very similar app launched in 2014, on which young teens posted short videos of themselves lip-synching to sped-up pop songs. After a few years of success, Musical.ly was bought by the Chinese internet company ByteDance in 2017 and relaunched as TikTok. In August 2018, ByteDance migrated all the Musical.ly accounts over to TikTok, allowing the app to start with an already impressive number of users (around 680 million monthly active users).
Since then, TikTok just kept on growing, making ByteDance the world’s most valuable startup, estimated to be valued at $110 billion. Over the past two years, TikTok has become the defining social media app of gen Zers around the world.
Technically, TikTok is owned by the Chinese company ByteDance, not by China or its government. But questions about Chinese state influence are understandable considering the app has been accused of censoring content that mentions topics sensitive to the Communist Party of China.
It all started with an investigation published by The Guardian in September 2019, which revealed leaked documents showing TikTok instructing its moderators to censor videos that mentioned topics such as the Tiananmen Square protests, Tibetan independence, the religious group Falun Gong, and many others: in short, any topic considered sensitive by the Chinese government.
The Guardian’s investigation was conducted after the Washington Post noticed that a search for Hong Kong-related topics on TikTok showed zero content about the ongoing pro-democracy protests. This topic, which was widely covered on other social media platforms at the time, clearly seemed to be censored on the Chinese-owned app.
In October 2019, Senator Marco Rubio called for an investigation into whether TikTok poses a national security risk to the US. In a letter addressed to US Treasury Secretary Steven Mnuchin, Rubio wrote: “These Chinese-owned apps are increasingly being used to censor content and silence open discussion on topics deemed sensitive by the Chinese Government and Communist Party. The Chinese government’s nefarious efforts to censor information inside free societies around the world cannot be accepted and pose serious long-term challenges to the US and our allies.”
Shortly after this, two other US senators followed suit. Senators Chuck Schumer and Tom Cotton called for a “rigorous assessment” of the potential national security risks of TikTok by US intelligence officials. Both expressed concern that the app could be a target of foreign influence campaigns like those during the 2016 election and made the point that Chinese companies are required to adhere to Chinese law, which grants the government worrying access to the data of private companies.
The letter, addressed to acting Director of National Intelligence Joseph Maguire, read: “Without an independent judiciary to review requests made by the Chinese government for data or other actions, there is no legal mechanism for Chinese companies to appeal if they disagree with a request.”
These three demands were finally heard in November 2019, when the federal Committee on Foreign Investment in the United States (CFIUS), which investigates potential national security implications of foreign acquisitions of US companies, announced it would be launching a review of ByteDance’s acquisition of Musical.ly.
While the specifics of the investigation are kept secret, a source familiar with the matter told the New York Times that the US government had evidence of TikTok sending US user data to China. TikTok denied these allegations. In October 2019, the company published a blog post titled Statement on TikTok’s content moderation and data security practices, which stated that it keeps all US user data in the US, with a backup server in Singapore. TikTok also specified that none of this data is subject to Chinese law.
“Let us be very clear: TikTok does not remove content based on sensitivities related to China. We have never been asked by the Chinese government to remove any content and we would not do so if asked. Period. Our US moderation team, which is led out of California, reviews content for adherence to our US policies – just like other US companies in our space,” reads the statement.
TikTok has said the lack of political content on the social media app is only related to its audience’s interests and demands. According to the app, TikTokers mainly use it for positive and joyful entertainment rather than politics. But as we’ve seen in India, for example, the app’s content regulation (or lack thereof) resulted in the spread of racist and violent messages against specific groups.
TikTok’s moderation guidelines faced further scrutiny in November 2019 when it suspended student Feroza Aziz’s account for posting three videos about the Chinese oppression of its Uighur Muslim population. TikTok claimed it had not suspended Aziz’s account because of its content, saying instead that her videos were removed due to a human moderation error.
Just after that, the German publication Netzpolitik leaked some of the app’s moderation guidelines that showed moderators are instructed to label any political content as either “not recommended” or “not for feed,” meaning videos can’t show up on a user’s For You page or will be more difficult to discover in TikTok’s search.
While the app’s moderation guidelines seem to have changed slightly over the months, TikTok still presents problems in the way it regulates content. In TikTok’s early days, censorship was strong as the app attempted to keep the content ‘light and fun’. As the platform started taking off in new markets, it said it began working to empower local teams. And yet, to this day, many videos are being censored for what seem to be the wrong reasons.
Remember when everyone freaked out after it was revealed that the ageing app FaceApp was based in Russia? People were worried the user data the app collected could be used for other purposes. Perhaps it is time we all started worrying about the many apps we use on a daily basis, not just the ones based in China and Russia, but TikTok should be first on that long list.
A 2017 Chinese law requires Chinese companies to comply with government intelligence operations if asked, which means that companies based in China can do next to nothing should the government request access to their data.
Keeping this in mind, another problem appears. Should the Chinese government get access to TikTok’s user data, it remains unclear exactly what the Chinese Communist Party might do with it. As Samantha Hoffman, an analyst at the Australian Strategic Policy Institute, told The Verge: “China collects bulk data overseas and then uses it to help with things that relate to state security like propaganda and identifying public sentiment to understand how people feel about a particular issue. It’s about controlling the media environment globally. Once you have control, you can use it to influence and shape the conversation.”
It is known that China holds great control over what its citizens can (and cannot) access online. So, what if it had the chance to control other countries’ content? Could it then also influence and shape the conversation? That’s exactly why TikTok’s owner company ByteDance being China-based is a problem—for the US but also for other countries.
While this might sound like another conspiracy theory to some, history has shown us that the Chinese government doesn’t always mobilise its forces to harm freedom of expression straight away. Instead, it waits until something threatening happens to take action.
In March 2020, Reddit user Bangorlol made a comment in a TikTok thread where he claimed to have successfully reverse-engineered the app and shared what he learned about the Chinese video-sharing social network. He strongly advised people to never use the app again, warning them about TikTok’s intrusive user-tracking and other issues.
According to Bangorlol, TikTok collects information about its users including the apps installed on their phones, details of their networks, their locations and more. This recent claim could corroborate older ones made by US college student Misty Hong in December 2019.
Hong had filed a lawsuit against TikTok for allegedly transferring her private data to servers in China. Data included users’ locations, ages, private messages, phone numbers, contacts, genders, browsing histories, phone serial numbers and IP addresses. Though a previous version of TikTok’s privacy policy stated that user data could be sent to China, the suit alleges that the company did so even after that policy changed.
Hong also claimed that she downloaded the app in 2019 but never made an account. Instead, she said the app automatically created one for her by using her phone number and creating a file of videos that she never posted, including a scan of her face. According to the suit, TikTok then transferred that information to two servers in China, bugly.qq.com and umeng.com. The former is owned by Tencent, owner of the Chinese social network WeChat, and the latter by Chinese e-commerce giant Alibaba.
Back to 2020 and ByteDance has slowly started shifting TikTok’s power away from China.
At the end of May 2020, ByteDance announced that Disney’s Kevin Mayer would become TikTok’s new CEO in a move to shift its centre of power away from China. The company also transferred global decision-making and research capabilities out of its home country and, according to Reuters, expanded TikTok’s engineering and research and development operations in Mountain View, California, by hiring more than 150 engineers there.
These changes came at a time of heightened tension between the US and China over trade, technology and the COVID-19 pandemic, as well as the intense scrutiny of TikTok.
Although ByteDance has recently made a series of moves to transfer TikTok’s centre of power away from China, it could still be forced to sell off the video-sharing app. The dating app Grindr was sold to the Chinese company Kunlun in 2016, and in March 2019 CFIUS determined that this ownership posed a national security risk. In March 2020, Kunlun sold Grindr for about $608.5 million to San Vicente Acquisition.
As the world awaits further changes to TikTok’s content moderation approach and its Chinese ownership, there’s only one thing you can do as a user to protect your personal data: review the app’s privacy policy. Giving up the app completely might also help, but we won’t ask that of you.