The Use of Multivariate Generalizability Theory to Evaluate the Quality of Subscores.
Zhehan Jiang, Mark Raymond
Published in: Applied Psychological Measurement (2018)
Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means. Three pilot studies evaluated G in the context of a single group of examinees. Results of the pilots indicated that G indices were typically low; across the 108 experimental conditions, G ranged from .23 to .86, with an overall mean of .63. The findings were consistent with previous research, indicating that subscores often do not have interpretive value. Importantly, there were many conditions for which the correlation-based method known as proportion reduction in mean-square error (PRMSE; Haberman, 2006) indicated that subscores were worth reporting, but for which values of G fell into the .50s, .60s, and .70s. The main study investigated G within the context of score profiles for examinee subgroups. Again, G indices were generally low; in addition, G was found to be sensitive to subgroup differences in cases where PRMSE was not. Analyses of real data and subsequent discussion address how G can supplement PRMSE for characterizing the quality of subscores.
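
As a rough illustration of why a profile index of this kind can respond to differences in subtest means while correlations cannot, the sketch below computes a G-like ratio: the expected within-person variance of the universe-score profile divided by the expected within-person variance of the observed-score profile. This is an assumption about the general form of the index, not code or a formula taken from the study, and it may differ in detail from Brennan's estimator. The function names (`expected_profile_variance`, `profile_g_sketch`) and the two-subtest numbers are hypothetical.

```python
import numpy as np


def expected_profile_variance(cov, means):
    """Expected within-person variance of a score profile across subtests.

    For subtest scores with covariance matrix `cov` and mean vector `means`,
    the expectation of (1/n) * sum_v (score_v - person_mean)^2 equals
    (average variance) - (average of all covariance-matrix elements)
    + (population variance of the subtest means).
    """
    cov = np.asarray(cov, dtype=float)
    means = np.asarray(means, dtype=float)
    n_subtests = cov.shape[0]
    mean_variance = np.trace(cov) / n_subtests
    mean_of_all_elements = cov.mean()
    return mean_variance - mean_of_all_elements + means.var()


def profile_g_sketch(universe_cov, error_variances, means):
    """G-like ratio under the assumption of uncorrelated errors (hypothetical helper)."""
    universe_cov = np.asarray(universe_cov, dtype=float)
    # Assumption: observed covariance = universe covariance + diagonal error variance.
    observed_cov = universe_cov + np.diag(error_variances)
    numerator = expected_profile_variance(universe_cov, means)
    denominator = expected_profile_variance(observed_cov, means)
    return numerator / denominator


# Illustrative values only: two subtests, universe-score correlation .80,
# universe-score variance 1.0 and error variance 0.25 on each subtest.
universe_cov = [[1.0, 0.8], [0.8, 1.0]]
error_variances = [0.25, 0.25]

g_equal_means = profile_g_sketch(universe_cov, error_variances, means=[0.0, 0.0])
g_unequal_means = profile_g_sketch(universe_cov, error_variances, means=[0.0, 1.0])

print(round(g_equal_means, 2))    # approximately 0.44
print(round(g_unequal_means, 2))  # approximately 0.74
```

In this toy setting the subtest reliabilities and the correlation between subtests are identical in both calls; only the spread of the subtest means changes, yet the ratio rises from about .44 to about .74. A purely correlation-based summary would be the same in both cases.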