Sampling Twitter users for social science research: evidence from a systematic review of the literature.

Paula Vicente
Published in: Quality & Quantity (2023)
All social media platforms can be used to conduct social science research, but Twitter is the most popular because it provides its data via several Application Programming Interfaces, which allows both qualitative and quantitative research to be conducted with its members. As Twitter is a huge universe, both in number of users and in amount of data, sampling is generally required when using it for research purposes. Researchers have only recently begun to question whether tweet-level sampling (in which the tweet is the sampling unit) should be replaced by user-level sampling (in which the user is the sampling unit). The major rationale for this shift is that tweet-level sampling ignores the fact that some core discussants on Twitter tweet far more often than other users, which biases the sample towards the more active users. Knowledge of how to select representative samples of users in the Twitterverse is still insufficient, despite its relevance for reliable and valid research outcomes. This paper contributes to this topic by presenting a systematic quantitative literature review of sampling plans designed and executed in the context of social science research on Twitter, covering: (1) the definition of the target populations, (2) the sampling frames used to support sample selection, (3) the sampling methods used to obtain samples of Twitter users, (4) how data are collected from Twitter users, (5) the size of the samples, and (6) how research validity is addressed. This review can serve as a methodological guide for professionals and academics who want to conduct social science research involving Twitter users and the Twitterverse.
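The bias described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy corpus, the user names, and the helper functions `tweet_level_probs`, `user_level_probs`, and `sample_users` are all hypothetical, and no Twitter API is called. The sketch only compares the chance that a single draw lands on each user when the sampling unit is the tweet versus when it is the user.

```python
import random
from collections import Counter

# Hypothetical toy corpus: each entry is the author of one tweet.
# "ana" is a very active user; "ben" and "cy" tweeted once each.
tweet_authors = ["ana"] * 8 + ["ben"] + ["cy"]


def tweet_level_probs(authors):
    """Probability that one uniformly drawn tweet was written by each user."""
    counts = Counter(authors)
    total = len(authors)
    return {user: n / total for user, n in counts.items()}


def user_level_probs(authors):
    """Probability that one uniformly drawn user is each user."""
    users = set(authors)
    return {user: 1 / len(users) for user in users}


def sample_users(authors, k, seed=0):
    """Draw k distinct users uniformly from the de-duplicated user list."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    frame = sorted(set(authors))  # assumed sampling frame: users seen in the corpus
    return rng.sample(frame, k)


if __name__ == "__main__":
    print("tweet-level:", tweet_level_probs(tweet_authors))
    print("user-level: ", user_level_probs(tweet_authors))
    print("user sample:", sample_users(tweet_authors, k=2))
```

In this toy corpus a tweet-level draw reaches the most active user eight times out of ten, while a user-level draw treats all three users equally. Real studies would also have to decide how the user frame itself is built, which is one of the questions the review examines.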
Keyphrases
  • social media
  • health information
  • healthcare
  • public health
  • electronic health record
  • case report
  • high resolution
  • metabolic syndrome
  • clinical trial
  • mass spectrometry
  • adipose tissue
  • machine learning
  • deep learning