A natural language processing framework to analyse the opinions on HPV vaccination reflected in twitter over 10 years (2008 - 2017).

Xiao Luo, Gregory D Zimet, Setu Shah
Published in: Human vaccines & immunotherapeutics (2019)
In this research, we developed a natural language processing (NLP) framework to investigate the opinions on HPV vaccination reflected on Twitter over a 10-year period (2008-2017). The NLP framework includes sentiment analysis, entity analysis, and artificial intelligence (AI)-based phrase association mining. The sentiment analysis shows how sentiment fluctuated over those 10 years: there were more negative tweets from 2008 to 2011 and from 2015 to 2016. The entity extraction and analysis help to identify the organization, geographical location, and event entities associated with the negative and positive tweets. The results show that organization entities such as FDA, CDC, and Merck occur in both negative and positive tweets in almost every year, whereas the geographical location entities mentioned in both negative and positive tweets change from year to year. This variation reflects specific events that happened in those different locations. The objective of the AI-based phrase association mining is to identify the main topics reflected in both negative and positive tweets, along with the detailed tweet content. Through the phrase association mining, we found that the main negative topics on Twitter include "injuries", "deaths", "scandal", "safety concerns", and "adverse/side effects", whereas the main positive topics include "cervical cancers", "cervical screens", "prevents", and "vaccination campaigns". We believe the results of this research can help public health researchers better understand the nature of social media influence on HPV vaccination attitudes and develop strategies to counter the proliferation of misinformation.
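The per-year sentiment tally described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the sentiment model, so the example below assumes a toy keyword lexicon seeded with a few of the topic words quoted above. The names `NEGATIVE_TERMS`, `POSITIVE_TERMS`, `label_tweet`, and `sentiment_by_year` are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicons (an assumption for illustration only);
# the paper's actual sentiment classifier is not described in the abstract.
NEGATIVE_TERMS = {"injuries", "deaths", "scandal", "adverse"}
POSITIVE_TERMS = {"prevents", "screens", "campaigns"}


def label_tweet(text):
    """Label a tweet by counting lexicon hits; ties and no hits are neutral."""
    tokens = [t.strip('.,!?"\'') for t in text.lower().split()]
    neg = sum(t in NEGATIVE_TERMS for t in tokens)
    pos = sum(t in POSITIVE_TERMS for t in tokens)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"


def sentiment_by_year(tweets):
    """Aggregate sentiment labels per year from (year, text) pairs."""
    counts = defaultdict(Counter)
    for year, text in tweets:
        counts[year][label_tweet(text)] += 1
    return counts
```

Calling `sentiment_by_year` on a list of `(year, text)` pairs returns one `Counter` per year, which is the shape needed to compare negative and positive tweet volumes across years. A real analysis would replace the keyword lexicon with a trained or validated sentiment model.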
Keyphrases
  • social media
  • artificial intelligence
  • public health
  • machine learning
  • health information
  • big data
  • high grade
  • deep learning
  • signaling pathway
  • gene expression
  • mental health
  • genome wide
  • young adults