Automatic classification of histopathological diagnoses for building a large scale tissue catalogue.

Robert Reihs, Heimo Müller, Stefan Sauer, Kurt Zatloukal
Published in: Health and technology (2016)
In this paper we present an automatic classification system for pathological findings. The starting point was a pathological tissue collection of about 1.4 million tissue samples, described by free-text records spanning 23 years. Extracting knowledge from this "big data" pool is challenging, especially because the data are unstructured and span many years. The classification combines ontology-based term extraction with decision trees built with a manually curated classification system. The information extraction system is based on regular expressions and a text substitution system. We describe how medical experts generate the decision trees using a visual editor, and how the classification process is evaluated against a reference data set. We achieved an F-score of 89.7% for ICD-10 and an F-score of 94.7% for ICD-O classification. For the information extraction of tumor staging and receptors, we achieved F-scores ranging from 81.8% to 96.8%.
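The extraction and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the substitution rules, the simplified TNM pattern, and the function names `normalise`, `extract_tnm`, and `f_score` are all assumptions made for illustration, and the F-score shown is the standard F1 measure, which may differ in detail from the evaluation protocol used in the paper.

```python
import re

# Hypothetical substitution rules applied before matching (not from the paper).
SUBSTITUTIONS = [
    (re.compile(r"\s+"), " "),                    # collapse runs of whitespace
    (re.compile(r"(?<=\d)[,;/]\s*(?=[NM])"), " "),  # "pT2, N1" -> "pT2 N1"
]

# Simplified, assumed pattern for a TNM code such as "pT2 N1 M0".
TNM_PATTERN = re.compile(
    r"\bp?T([0-4]|is|x)\s*N([0-3]|x)\s*M([01]|x)\b", re.IGNORECASE
)


def normalise(text):
    """Apply each substitution rule in order to the free-text record."""
    for pattern, replacement in SUBSTITUTIONS:
        text = pattern.sub(replacement, text)
    return text.strip()


def extract_tnm(text):
    """Return the T, N and M components as a dict, or None if no code is found."""
    match = TNM_PATTERN.search(normalise(text))
    if match is None:
        return None
    t, n, m = (group.lower() for group in match.groups())
    return {"T": t, "N": n, "M": m}


def f_score(predicted, reference):
    """Standard F1 over paired lists; None means 'no value extracted/present'."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p is not None and p == r)
    fp = sum(1 for p, r in pairs if p is not None and p != r)
    fn = sum(1 for p, r in pairs if r is not None and p != r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, `extract_tnm("Invasive carcinoma, pT2, N1;M0")` first normalises the separators and then matches the staging code, while a record without a staging code yields `None`. Comparing such outputs against a manually annotated reference list with `f_score` gives a single number of the kind reported above.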
Keyphrases
  • machine learning
  • big data
  • deep learning
  • artificial intelligence
  • healthcare
  • electronic health record
  • lymph node
  • health information
  • preterm infants
  • neoadjuvant chemotherapy
  • gestational age
  • data analysis