Development of a COVID-19-Related Anti-Asian Tweet Data Set: Quantitative Study.
Maryam Mokhberi, Ahana Biswas, Zarif Masud, Roula Kteily-Hawa, Abby L Goldstein, Joseph Roy Gillis, Shebuti Rayana, Syed Ishtiaque Ahmed
Published in: JMIR Formative Research (2023)
Our data set can serve as a benchmark for further qualitative and quantitative research on COVID-19-related anti-Asian stigma. First, it reaffirms the existence and significance of widespread discrimination and stigma toward the Asian population worldwide. Moreover, the data set and the arguments built on it should help researchers from various domains, including psychologists, public policy authorities, and sociologists, analyze the complex economic, political, historical, and cultural roots underlying anti-Asian stigmatization and hateful behaviors. A manually annotated data set is essential for developing algorithms that detect stigma or problematic text, particularly on social media. We believe this contribution will help researchers predict such behavior and design interventions that significantly reduce stigma, hate, and discrimination against marginalized populations during future crises like COVID-19.