HARVESTMAN: a framework for hierarchical feature learning and selection from whole genome sequencing data.
Trevor S Frisby, Shawn J Baker, Guillaume Marçais, Quang Minh Hoang, Carl Kingsford, Christopher James Langmead
Published in: BMC Bioinformatics (2021)
HARVESTMAN is a hierarchical feature selection approach for supervised model building from variant call data. By building a knowledge graph over genomic variants and solving an integer linear program, HARVESTMAN automatically finds an optimal encoding for genomic variants. Compared to other hierarchical feature selection methods, HARVESTMAN is faster and selects features more parsimoniously.
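To make the idea of selecting features from a hierarchy concrete, here is a minimal sketch in Python. It is not the HARVESTMAN formulation: the toy hierarchy, the relevance scores, the per-feature penalty, and the rule that no selected node may be an ancestor of another selected node are all assumptions made for illustration, and brute-force enumeration stands in for an integer linear program solver.

```python
from itertools import combinations

# Hypothetical toy hierarchy, child -> parent (None marks the root).
# Levels: pathway > gene > variant.
PARENT = {
    "pathway": None,
    "geneA": "pathway",
    "geneB": "pathway",
    "A:v1": "geneA",
    "A:v2": "geneA",
    "B:v1": "geneB",
}

# Made-up relevance scores (e.g. association with the label); illustrative only.
SCORE = {
    "pathway": 0.5,
    "geneA": 0.8,
    "geneB": 0.1,
    "A:v1": 0.4,
    "A:v2": 0.3,
    "B:v1": 0.35,
}


def ancestors(node):
    """Return every node strictly above `node` in the hierarchy."""
    found = set()
    parent = PARENT[node]
    while parent is not None:
        found.add(parent)
        parent = PARENT[parent]
    return found


def is_feasible(selection):
    """Assumed constraint: no selected node is an ancestor of another."""
    chosen = set(selection)
    return all(not (ancestors(node) & chosen) for node in chosen)


def select_features(penalty=0.05):
    """Enumerate all subsets and keep the feasible one with the best objective.

    Objective (assumed): total relevance minus `penalty` per selected feature.
    An ILP solver would optimize the same objective with one binary variable
    per node; enumeration is only practical for a toy graph like this one.
    """
    nodes = sorted(PARENT)
    best, best_value = (), 0.0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            if not is_feasible(subset):
                continue
            value = sum(SCORE[n] for n in subset) - penalty * len(subset)
            if value > best_value:
                best, best_value = subset, value
    return set(best), best_value


if __name__ == "__main__":
    chosen, value = select_features()
    print(sorted(chosen), round(value, 2))
```

With these made-up scores the sketch selects geneA at the gene level and B:v1 at the variant level, which illustrates how a hierarchical method can encode different regions of the genome at different resolutions while penalizing larger feature sets.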