Rethinking domain adaptation for machine learning over clinical language.

Egoitz Laparra, Steven Bethard, Timothy A Miller
Published in: JAMIA Open (2020)
Building clinical natural language processing (NLP) systems that work on widely varying data is an absolute necessity because of the expense of obtaining new training data. While domain adaptation research can have a positive impact on this problem, the most widely studied paradigms do not take into account the realities of clinical data sharing. To address this issue, we lay out a taxonomy of domain adaptation, parameterizing by what data is shareable. We show that the most realistic settings for clinical use cases are seriously under-studied. To support research in these important directions, we make a series of recommendations, not just for domain adaptation but for clinical NLP in general, that ensure that data, shared tasks, and released models are broadly useful, and that initiate research directions where the clinical NLP community can lead the broader NLP and machine learning fields.
Keyphrases
  • machine learning
  • big data
  • electronic health record
  • artificial intelligence
  • autism spectrum disorder