Interrater Reliability of mHealth App Rating Measures: Analysis of Top Depression and Smoking Cessation Apps.
Adam C Powell, John B Torous, Steven Richard Chan, Geoffrey Stephen Raynor, Erik Shwarts, Meghan Shanahan, Adam B Landman. Published in: JMIR mHealth and uHealth (2016)
We found wide variation in the interrater reliability of measures used to evaluate apps, and some measures were more robust across categories of apps than others. The measures with the highest degree of interrater reliability tended to be those that required the least rater discretion. Clinical quality measures, such as effectiveness, ease of use, and performance, had relatively poor interrater reliability. Subsequent research is needed to establish consistent means of evaluating the performance of apps. Patients and clinicians should consider conducting their own assessments of apps, in conjunction with evaluating information from reviews.
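This excerpt does not state which statistic the authors used to quantify interrater reliability; purely as an illustration of the concept, the sketch below computes Cohen's kappa for two hypothetical raters scoring a set of apps on a single categorical measure. The rater names, scores, and the choice of kappa are assumptions for the example, not data or methods from the study.

```python
# Minimal sketch: Cohen's kappa for two hypothetical raters scoring apps
# on a 1-5 categorical item. Raters, scores, and the choice of kappa are
# illustrative assumptions, not data or methods from the study.
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters over the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree exactly.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    # Agreement expected by chance if each rater assigned categories independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Example: scores for ten apps on a hypothetical "ease of use" item (1 = poor, 5 = excellent).
rater_1 = [4, 3, 5, 2, 4, 3, 5, 1, 2, 4]
rater_2 = [4, 2, 5, 2, 3, 3, 4, 1, 2, 5]
print(f"Cohen's kappa: {cohens_kappa(rater_1, rater_2):.2f}")
```

A kappa near 1 indicates agreement well beyond chance, while values near 0 indicate agreement no better than chance; computing such a statistic separately for each rating measure is what allows comparisons of robustness across measures and app categories.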