Learning from biomedical linked data to suggest valid pharmacogenes.
Kevin Dalleau, Yassine Marzougui, Sébastien Da Silva, Patrice Ringot, Ndeye Coumba Ndiaye, Adrien Coulet
Published in: Journal of biomedical semantics (2017)
We assembled a set of pharmacogenomics-related linked data comprising 2,610,793 triples drawn from six distinct resources. Learning from these data, a random forest identifies valid pharmacogenes with an F-measure of 0.73 in 10-fold cross-validation, whereas a graph kernel achieves an F-measure of 0.81. A list of the top candidates proposed by both approaches is provided, and how they were obtained is discussed.
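
The evaluation protocol mentioned above (an F-measure averaged over 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the function names (`f_measure`, `k_fold_indices`, `cross_validated_f_measure`, `fit_threshold`) are hypothetical, the one-dimensional threshold learner merely stands in for the random forest or graph kernel, and the data are synthetic toy values. The abstract does not say how folds were built or whether scores were averaged per fold, so both choices here are assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (F1): harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision/recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; fold i holds indices i, i+k, i+2k, ..."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def cross_validated_f_measure(
    X: Sequence[float],
    y: Sequence[int],
    fit: Callable[[List[float], List[int]], Callable[[float], int]],
    k: int = 10,
) -> float:
    """Train on k-1 folds, score the held-out fold, and average the k scores."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(
            f_measure([y[i] for i in test], [predict(X[i]) for i in test])
        )
    return sum(scores) / len(scores)


def fit_threshold(X_train: List[float], y_train: List[int]) -> Callable[[float], int]:
    """Placeholder learner (assumption): threshold halfway between class means."""
    pos = [x for x, t in zip(X_train, y_train) if t == 1]
    neg = [x for x, t in zip(X_train, y_train) if t == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0


# Synthetic toy data (not from the paper): 20 negatives, then 20 positives.
X_toy = [float(v) for v in list(range(20)) + list(range(30, 50))]
y_toy = [0] * 20 + [1] * 20
mean_f = cross_validated_f_measure(X_toy, y_toy, fit_threshold, k=10)
```

Replacing `fit_threshold` with any function that returns a per-example predictor leaves the cross-validation loop unchanged, which is why the learner is passed in as an argument.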