Extracting intersectional stereotypes from embeddings: Developing and validating the Flexible Intersectional Stereotype Extraction procedure.

Tessa E S Charlesworth, Kshitish Ghate, Aylin Caliskan, Mahzarin R Banaji
Published in: PNAS Nexus (2024)
Social group-based identities intersect. The meaning of "woman" is modulated by adding social class as in "rich woman" or "poor woman." How does such intersectionality operate at scale in everyday language? Which intersections dominate (are most frequent)? What qualities (positivity, competence, warmth) are ascribed to each intersection? In this study, we make it possible to address such questions by developing a stepwise procedure, Flexible Intersectional Stereotype Extraction (FISE), applied to word embeddings (GloVe; BERT) trained on billions of words of English Internet text, revealing insights into intersectional stereotypes. First, applying FISE to occupation stereotypes across intersections of gender, race, and class showed alignment with ground-truth data on occupation demographics, providing initial validation. Second, applying FISE to trait adjectives showed strong androcentrism (Men) and ethnocentrism (White) in dominating everyday English language (e.g. White + Men are associated with 59% of traits; Black + Women with 5%). Associated traits also revealed intersectional differences: advantaged intersectional groups, especially intersections involving Rich, had more common, positive, warm, competent, and dominant trait associates. Together, the empirical insights from FISE illustrate its utility for transparently and efficiently quantifying intersectional stereotypes in existing large text corpora, with potential to expand intersectionality research across unprecedented time and place. This project further sets up the infrastructure necessary to pursue new research on the emergent properties of intersectional identities.
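
The abstract does not spell out the individual steps of FISE, so the following is only a rough sketch of the general idea it describes: scoring a trait word against two group dimensions in an embedding space and reading off the intersection it falls into. Everything here is an assumption made for illustration: the three-dimensional toy vectors, the word lists, and the helper names `cosine`, `association`, and `assign_quadrant` are hypothetical and are not taken from the paper. Real use would load GloVe or BERT vectors instead.

```python
import math

# Hypothetical toy embeddings (NOT real GloVe/BERT vectors).
# Axis 0 loosely encodes gender, axis 1 loosely encodes class.
TOY = {
    "he": [1.0, 0.0, 0.2], "man": [0.9, 0.1, 0.1],
    "she": [-1.0, 0.0, 0.2], "woman": [-0.9, 0.1, 0.1],
    "rich": [0.0, 1.0, 0.1], "wealthy": [0.1, 0.9, 0.2],
    "poor": [0.0, -1.0, 0.1], "needy": [0.1, -0.9, 0.2],
    "trait_a": [0.8, 0.7, 0.1],
    "trait_b": [-0.6, -0.5, 0.3],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word, group_a, group_b, emb=TOY):
    """Mean cosine of `word` to group A words minus mean cosine to group B words."""
    sim_a = sum(cosine(emb[word], emb[w]) for w in group_a) / len(group_a)
    sim_b = sum(cosine(emb[word], emb[w]) for w in group_b) / len(group_b)
    return sim_a - sim_b

def assign_quadrant(word, dim1, dim2, emb=TOY):
    """Label `word` with one pole from each dimension (an assumed rule:
    a positive score picks the first pole, otherwise the second)."""
    labels = []
    for (name_a, words_a), (name_b, words_b) in (dim1, dim2):
        score = association(word, words_a, words_b, emb)
        labels.append(name_a if score > 0 else name_b)
    return tuple(labels)

# Each dimension is a pair of (label, word list) poles; lists are illustrative.
GENDER = (("men", ["he", "man"]), ("women", ["she", "woman"]))
CLASS = (("rich", ["rich", "wealthy"]), ("poor", ["poor", "needy"]))

for trait in ("trait_a", "trait_b"):
    print(trait, assign_quadrant(trait, GENDER, CLASS))
```

Counting how many traits land in each intersection would give a frequency breakdown of the kind the abstract reports, though the thresholds, word lists, and statistics used in the actual procedure may differ from this sketch.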