TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations.

Nestor Alvaro, Yusuke Miyao, Nigel Collier
Published in: JMIR Public Health and Surveillance (2017)
We present the first pharmacovigilance corpus curated from Twitter messages and PubMed sentences using the same data selection and annotation strategies. We believe this corpus will be of particular interest to researchers who want to compare results from pharmacovigilance systems (eg, classifiers and named entity recognition systems) on Twitter and PubMed data. We hope that, given its comprehensive set of drug names and its annotated entities and relations, this corpus will become a standard resource for comparing results across pharmacovigilance studies in the area of NLP.
Keyphrases
  • adverse drug
  • social media
  • drug induced
  • electronic health record
  • big data
  • emergency department
  • machine learning
  • artificial intelligence
  • data analysis
  • single cell
  • sleep quality