April 24, 2024
Q&A: How TikTok’s ‘black box’ algorithm and design shape user behavior
TikTok’s swift ascension to the upper echelons of social media is often attributed to its recommendation algorithm, which predicts viewer preferences so acutely it’s spawned a maxim: “The TikTok algorithm knows me better than I know myself.” The platform’s success has been so pronounced that it seems to have spurred other social media platforms to shift their designs. When users scroll through X or Instagram, they now see more recommended posts from accounts they don’t follow.
Yet for all that influence, the public knows little about how TikTok’s algorithm functions. So Franziska Roesner, a University of Washington associate professor in the Paul G. Allen School of Computer Science & Engineering, set about researching both how that algorithm is personalized and how TikTok users engage with the platform based on those recommendations.
Roesner and collaborators will present two papers in May that mine real-world data to help understand the “black box” of TikTok’s recommendation algorithm and its impact.
Researchers first recruited 347 TikTok users, who downloaded their data from the app and donated 9.2 million video recommendations. Using that data, the team initially looked at how TikTok personalized its recommendations. In the first 1,000 videos TikTok showed users, the team found that a third to half of the videos were shown based on TikTok’s predictions of what those users like. The researchers will publish the first paper May 13 in the Proceedings of the ACM Web Conference 2024.
The second study, which the team will present May 14 at the ACM CHI Conference on Human Factors in Computing Systems in Honolulu, explored engagement trends. Researchers discovered that over the users’ first 120 days, average daily time on the platform increased from about 29 minutes on the first day to 50 minutes on the last.
UW News spoke with Roesner about how TikTok recommends videos; the impact that has on users; and the ways tech companies, regulators and the public might mitigate unwanted effects.
What is important for us to understand about how TikTok’s algorithm functions?
Franziska Roesner: TikTok users often have questions like: “Why was I shown this content? What does TikTok know about me? How is it using what it knows about me? And is it?” So we looked at what TikTok shows people and by what criteria. If we better understand how the algorithm functions, then we can ask whether we like how it works.
For example, if the algorithm is exploiting people’s weaknesses around certain types of content, if it predicts that I’m more likely to be susceptible to a certain type of misinformation, it could be pushing me down certain rabbit holes that might be dangerous to me. Maybe they mislead me, or they exacerbate mental health challenges or eating disorders. The algorithm is such a black box, to the public and to regulators. And to some extent, it probably is to TikTok itself. It’s not like someone is writing code that’s targeting a person who’s vulnerable to an eating disorder. The algorithm is just making predictions from a bunch of data. So we as researchers are interested in the features that it is using to predict, because we can’t really understand if and why a prediction is problematic without understanding those.
We also looked at how people engage with TikTok’s algorithm as we understand it. These considerations go hand in hand. As a security and privacy person, I’m always really interested in how people interact with technologies and how their designs shape what we read and believe and share. So researching the human experience helps to understand the impact of the algorithm and the platform design.
What did you learn from these studies?
FR: One thing that surprised me a little was that those of us who use TikTok — and I do use TikTok — probably spend more time on it than we wish to admit. I was also a little surprised that people watch only about 55% of videos to the end. We debated whether this was high or low. Is this part of the platform’s design, that once you’ve got whatever you wanted to get out of this video you move on? Or is it a sign that even this highly tuned recommendation algorithm is not doing that well? I don’t know which it is. But it’s useful to at least have a baseline to compare future findings against.
Another important takeaway was looking at what features influence what videos the algorithm shows you. How much agency is TikTok potentially taking from us? How good is it at predicting what we’re likely to want to watch? How rabbit hole-y do those things get? In the study, we labeled each video within a user’s timeline as an “exploration video” or an “exploitation video.” An exploration video is not linked to videos that the user has seen before — for instance, there are no similar hashtags or creators. The idea is that there’s some value in the algorithm showing you new stuff. Maybe there’s societal value to not putting you down a rabbit hole. There’s also probably value for TikTok, because the more you see the same stuff, the more bored you get. They want to throw some spaghetti at the wall and see what sticks.
The exploitation videos are the ones that are more like, “We know what you like, we’re going to show you more videos that are related to these.” In the study, we looked at what fraction of the videos are explorative versus exploitative. We found that in the first 1,000 videos users saw, TikTok exploited users’ interests between 30% and 50% of the time. We then looked at how the videos differed and how TikTok treated them. For example, if you’re following someone, you’re significantly more likely to see videos from them. That’s probably not surprising. However, based on our data, scrolling past a video faster does not seem to have as much effect on what the algorithm does.
We also found that people were less likely to finish videos from accounts they followed, but they engaged with those videos more. We hypothesized that if someone sees a video from their friend, maybe they’re not that interested and don’t want to watch, but they still want to show support, so they engage.
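For readers who want a concrete picture of the exploration/exploitation labeling Roesner describes, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the researchers' actual method: it treats a video as "exploitation" if its creator or any of its hashtags has already appeared earlier in the same user's timeline, and as "exploration" otherwise. The `Video` record, its field names, and both function names are hypothetical and do not reflect TikTok's data export format or the papers' code.

```python
from dataclasses import dataclass, field


@dataclass
class Video:
    # Hypothetical record; field names are illustrative,
    # not TikTok's data export schema.
    creator: str
    hashtags: set[str] = field(default_factory=set)


def label_timeline(videos: list[Video]) -> list[str]:
    """Label each video, in viewing order, as 'exploration' or 'exploitation'.

    Assumption: a video counts as exploitation if it shares a creator or
    at least one hashtag with any video seen earlier in the timeline.
    """
    seen_creators: set[str] = set()
    seen_hashtags: set[str] = set()
    labels: list[str] = []
    for video in videos:
        linked = (
            video.creator in seen_creators
            or bool(video.hashtags & seen_hashtags)
        )
        labels.append("exploitation" if linked else "exploration")
        # Record this video so later videos can be compared against it.
        seen_creators.add(video.creator)
        seen_hashtags |= video.hashtags
    return labels


def exploitation_share(labels: list[str]) -> float:
    """Fraction of labeled videos that are exploitation (0.0 if empty)."""
    if not labels:
        return 0.0
    return labels.count("exploitation") / len(labels)
```

Run over a user's first 1,000 recommended videos, a share computed this way would be the kind of number behind the 30% to 50% range mentioned above, though the published analysis may define "linked" differently.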
In these papers you make several suggestions to mitigate the potential negative effects of TikTok’s design. Could you explain a few of those?
FR: We found that the data donations were not complete enough for us to be able to answer all the questions that we had. So there’s some lack of transparency in the data users could download and about the algorithm overall. We’ve seen this in other studies. People have looked at Facebook’s ad-targeting disclosures. If you ask why you’re seeing this ad, it usually offers the broadest criteria that were included — that you’re over 18 and in the United States, for instance. Yes, but also because you visited this product website yesterday. But the company isn’t sharing that. I’d like to see more transparency about how people’s data is used. Whether that would change what an individual would do is a different question. But I see it as the duty of the platform to help us understand that.
That also connects to regulation. Even if that information doesn’t change an individual’s behavior, it’s vital to be able to do studies that show, for example, how a vulnerable population is being disproportionately targeted with a certain type of content. That kind of targeting is not necessarily intentional, but if you don’t know that’s happening, you can’t stop it. We don’t know how these platforms are auditing internally, but there’s always a value in having external auditors with different incentives.
Before we had these platforms, we understood more about how certain content got to certain people because it came in newspapers or on billboards. Now we have a situation where everybody’s got their own little reality. So it’s hard to reason about what people are seeing and why and how that all fits together — let alone what to do about it — if we can’t even see it.
What is important for people to know about TikTok?
FR: Awareness is helpful. Remember that the platform and the algorithm kind of shape how you view the world and how you interact with the content. That’s not always bad, that can be good. But the platform designs are not neutral, and they influence how long you watch and what you watch, and what you’re getting angry or concerned about. Just remember that the algorithm shows you stuff in large part because it’s predicting what you might want to see. And there are other things you’re not seeing.
Additional co-authors on the papers included Karan Vombatkere of Boston University; Sepehr Mousavi, Olivia Nemes-Nemeth, Angelica Goetzen and Krishna P. Gummadi of Max Planck Institute for Software Systems; Oshrat Ayalon of University of Haifa and Max Planck Institute for Software Systems; Savvas Zannettou of TU Delft; and Elissa M. Redmiles of Georgetown University.
For more information, contact Roesner at franzi@cs.washington.edu.