America's racial framework of superiority and Americanness embedded in natural language.
Messi H J Lee, Jacob M Montgomery, Calvin K Lai. Published in: PNAS Nexus (2024)
America's racial framework can be summarized using two distinct dimensions: superiority/inferiority and Americanness/foreignness. We investigated America's racial framework in a corpus of spoken and written language using word embeddings. Word embeddings place words in a low-dimensional space where words with similar meanings lie close together, allowing researchers to test whether the positions of group and attribute words in the semantic space reflect stereotypes. We trained a word embedding model on the Corpus of Contemporary American English, a corpus of 1 billion words spanning 30 years and 8 text categories, and compared the positions of racial/ethnic groups with respect to superiority and Americanness. We found that America's racial framework is embedded in American English. We also captured an additional nuance: Asian people were stereotyped as more American than Hispanic people. These results are empirical evidence that America's racial framework is embedded in American English.
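The idea of comparing group-word positions against an attribute dimension can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify. It assumes one common approach: score a group word by its mean cosine similarity to words at one pole of a dimension minus its mean cosine similarity to words at the opposite pole. The vectors, the labels `group_a` and `group_b`, the pole word lists, and the helper `axis_score` are all hypothetical placeholders, not values from a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def axis_score(group_vec, pole_pos, pole_neg):
    """Hypothetical helper: mean similarity to the positive-pole words
    minus mean similarity to the negative-pole words."""
    pos = sum(cosine(group_vec, p) for p in pole_pos) / len(pole_pos)
    neg = sum(cosine(group_vec, q) for q in pole_neg) / len(pole_neg)
    return pos - neg


# Toy 3-dimensional vectors standing in for learned embeddings (assumed values).
toy = {
    "group_a": [0.9, 0.1, 0.2],
    "group_b": [0.1, 0.9, 0.2],
    "pole_pos_1": [1.0, 0.0, 0.1],
    "pole_pos_2": [0.8, 0.2, 0.0],
    "pole_neg_1": [0.0, 1.0, 0.1],
    "pole_neg_2": [0.2, 0.8, 0.0],
}

pole_pos = [toy["pole_pos_1"], toy["pole_pos_2"]]
pole_neg = [toy["pole_neg_1"], toy["pole_neg_2"]]

# A positive score means the word sits nearer the positive pole of the dimension.
score_a = axis_score(toy["group_a"], pole_pos, pole_neg)
score_b = axis_score(toy["group_b"], pole_pos, pole_neg)
print(f"group_a: {score_a:+.3f}, group_b: {score_b:+.3f}")
```

In a real analysis the vectors would come from a model trained on the corpus, each pole would be represented by a vetted word list, and differences between groups would be tested statistically rather than read off single scores.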