Improving random forest predictions in small datasets from two-phase sampling designs.
Sunwoo Han, Brian D Williamson, Youyi Fong
Published in: BMC Medical Informatics and Decision Making (2021)
In small datasets from two-phase sampling designs, variable screening and inverse sampling probability weighting are important for achieving good prediction performance with random forests. In addition, stacking random forests with simple linear models can improve on random forests alone.
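
As a rough illustration of the inverse sampling probability weighting mentioned above, the following minimal sketch computes one weight per phase-two observation as the ratio of the phase-one stratum size to the phase-two stratum size. The design (40 cases and 960 controls in phase one, with all cases and 120 controls sampled in phase two), the function name `inverse_probability_weights`, and all variable names are hypothetical and are not taken from the paper. The example only shows how the weights undo the oversampling of cases; it does not fit a random forest.

```python
from collections import Counter


def inverse_probability_weights(strata, phase1_counts, phase2_counts):
    """Return one weight per phase-two observation.

    The weight is N_s / n_s, the inverse of the estimated probability that
    a phase-one subject in stratum s was selected into phase two.
    """
    return [phase1_counts[s] / phase2_counts[s] for s in strata]


# Hypothetical phase-one cohort: 40 cases and 960 controls.
phase1_counts = {"case": 40, "control": 960}

# Hypothetical phase-two sample: all 40 cases, but only 120 controls.
strata = ["case"] * 40 + ["control"] * 120
phase2_counts = Counter(strata)

weights = inverse_probability_weights(strata, phase1_counts, phase2_counts)
outcome = [1 if s == "case" else 0 for s in strata]

# Unweighted prevalence reflects the oversampling of cases in phase two.
naive_prevalence = sum(outcome) / len(outcome)

# Weighted prevalence recovers the phase-one cohort prevalence.
weighted_prevalence = sum(w * y for w, y in zip(weights, outcome)) / sum(weights)

print(f"naive: {naive_prevalence:.2f}, weighted: {weighted_prevalence:.2f}")
```

In this toy design each case gets weight 1 and each control gets weight 8, so the weighted prevalence matches the cohort value of 40/1000 rather than the phase-two value of 40/160. Weights of this kind could plausibly be supplied to a learner that accepts per-observation weights, for example through the `sample_weight` argument of scikit-learn's `RandomForestClassifier.fit`; whether that matches the authors' implementation is an assumption.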