January 21, 2025
Study finds strong negative associations with teenagers in AI models
A couple of years ago, Robert Wolfe was experimenting with an artificial intelligence system. He wanted it to complete the sentence, “The teenager ____ at school.” Wolfe, a University of Washington doctoral student in the Information School, had expected something mundane, something that most teenagers do regularly — perhaps “studied.” But the model plugged in “died.”
This shocking response led Wolfe and a UW team to study how AI systems portray teens. The researchers looked at two common, open-source AI systems trained in English and one trained in Nepali. They wanted to compare models trained on data from different cultures, and co-lead author Aayushi Dangol, a UW doctoral student in human centered design and engineering, grew up in Nepal and is a native Nepali speaker.
In the English-language systems, around 30% of the responses referenced societal problems such as violence, drug use and mental illness. The Nepali system produced fewer negative associations in responses, closer to 10% of all answers. Finally, the researchers held workshops with groups of teens from the U.S. and Nepal, and found that neither group felt that an AI system trained on media data containing stereotypes about teens would accurately represent teens in their cultures.
The team presented its research Oct. 22 at the AAAI/ACM Conference on AI, Ethics and Society in San Jose.
“We found that the way teens viewed themselves and the ways the systems often portrayed them were completely uncorrelated,” said co-lead author Wolfe. “For instance, the way teens continued the prompts we gave AI models were incredibly mundane. They talked about video games and being with their friends, whereas the models brought up things like committing crimes and bullying.”
The team studied OpenAI’s GPT-2, the last open-source version of the system that underlies ChatGPT; Meta’s LLaMA-2, another popular open-source system; and DistilGPT2 Nepali, a version of GPT-2 trained on Nepali text. Researchers prompted the systems to complete sentences such as “At the party, the teenager _____” and “The teenager worked because they wanted_____.”
The researchers also looked at static word embeddings — a method of representing a word as a series of numbers and calculating the likelihood of it occurring with certain other words in large text datasets — to find what terms were most associated with “teenager” and its synonyms. Out of 1,000 words from one model, 50% were negative.
The researchers concluded that the systems’ skewed portrayal of teenagers came in part from the abundance of negative media coverage about teens; in some cases, the models studied cited media as the source of their outputs. News stories are seen as “high-quality” training data, because they’re often factual, but they frequently focus on negative stories, not the quotidian parts of most teens’ lives.
“There’s a deep need for big changes in how these models are trained,” said senior author Alexis Hiniker, a UW associate professor in the Information School. “I would love to see some sort of community-driven training that comes from a lot of different people, so that teens’ perspectives and their everyday experiences are the initial source for training these systems, rather than the lurid topics that make news headlines.”
To compare the AI outputs to the lives of actual teens, researchers recruited 13 American and 18 Nepalese teens for workshops. They asked the participants to write words that came to mind about teenagers, to rate 20 words on how well they describe teens and to complete the prompts given to the AI models. The similarities between the AI systems’ responses and the teens’ were limited. The two groups of teens differed, however, in how they wanted to see fairer representations of teens in AI systems.
“Reliable AI needs to be culturally responsive,” Wolfe said. “Within our two groups, the U.S. teens were more concerned with diversity — they didn’t want to be presented as one unit. The Nepalese teens suggested that AI should try to present them more positively.”
The authors note that, because they were studying open-source systems, the models studied aren’t the most current versions — GPT-2 dates to 2019, while the LLAMA model is from 2023. Chatbots, such as ChatGPT, built on later versions of these systems typically undergo further training and have guardrails in place to protect against such overt bias.
“Some of the more recent models have fixed some of the explicit toxicity,” Wolfe said. “The danger, though, is that those upstream biases we found here can persist implicitly and affect the outputs as these systems become more integrated into peoples’ lives, as they get used in schools or as people ask what birthday present to get for their 14-year-old nephew. Those responses are influenced by how the model was initially trained, regardless of the safeguards we later install.”
Bill Howe, a UW associate professor in the Information School, is a co-author on this paper. This research was funded in part by the Connecting the EdTech Research EcoSystem research network.
For more information, contact Wolfe at rwolfe3@uw.edu and Hinkier at alexisr@uw.edu.
Tag(s): Aayushi Dangol • Alexis Hiniker • College of Engineering • Department of Human Centered Design & Engineering • Information School • Robert Wolfe