Canonical and DLPNO-Based G4(MP2)XK-Inspired Composite Wave Function Methods Parametrized against Large and Chemically Diverse Training Sets: Are They More Accurate and/or Robust than Double-Hybrid DFT?
Emmanouil Semidalas, Jan M. L. Martin. Published in: Journal of Chemical Theory and Computation (2020)
The large and chemically diverse GMTKN55 benchmark was used as a training set for parametrizing composite wave function thermochemistry protocols akin to G4(MP2)XK theory (Chan, B.; Karton, A.; Raghavachari, K. J. Chem. Theory Comput. 2019, 15, 4478-4484). On account of their availability for elements H through Rn, Karlsruhe def2 basis sets were employed. Even after reparametrization, the GMTKN55 WTMAD2 (weighted mean absolute deviation, type 2) for G4(MP2)-XK is actually inferior to that of the best rung-4 DFT functional, ωB97M-V. By increasing the basis set for the MP2 part to def2-QZVPPD, we were able to substantially improve performance at modest cost (if an RI-MP2 approximation is made), with WTMAD2 for this G4(MP2)-XK-D method now comparable to the better rung-5 functionals (albeit at greater cost). A three-tier approach with a scaled MP3/def2-TZVPP intermediate step, however, leads to a G4(MP3)-D method that is markedly superior to even the best double hybrids ωB97M(2) and revDSD-PBEP86-D4. Evaluating the CCSD(T) component with a triple-ζ, rather than split-valence, basis set yields only a modest further improvement that is incommensurate with the drastic increase in computational cost. G4(MP3)-D and G4(MP2)-XK-D have about 40% better WTMAD2, at similar or lower computational cost, than their counterparts G4 and G4(MP2), respectively: detailed comparison reveals that the difference lies in larger molecules due to basis set incompleteness error. An E2/{T,Q} extrapolation and a CCSD(T)/def2-TZVP step provided the G4-T method of high accuracy and with just three fitted parameters. Using KS orbitals in MP2 leads to the G4(MP3|KS)-D method, which entirely eliminates the CCSD(T) step and has no steps costlier than scaled MP3; this shows a path forward to further improvements in double-hybrid density functional methods. 
None of our final selections require an empirical HLC correction; this cuts the number of empirical parameters in half and avoids discontinuities on potential energy surfaces. G4-T-DLPNO, a variant in which post-MP2 corrections are evaluated at the DLPNO-CCSD(T) level, achieves nearly the accuracy of G4-T but is applicable to much larger systems.
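The WTMAD2 statistic quoted throughout the abstract can be illustrated with a short sketch. It assumes the usual GMTKN55 definition, in which each subset's mean absolute deviation (MAD) is scaled by 56.84 kcal/mol divided by that subset's mean absolute reference energy and then weighted by the number of systems in the subset. The names `SubsetStats` and `wtmad2` are hypothetical, and the numbers in the usage example are invented purely for illustration; they are not results from the paper.

```python
from typing import Iterable, NamedTuple

# Average of the mean absolute reference energies over all GMTKN55 subsets,
# in kcal/mol (assumed here from the standard WTMAD-2 definition).
GMTKN55_MEAN_ABS_REF = 56.84


class SubsetStats(NamedTuple):
    """Per-subset statistics (hypothetical container for this sketch)."""
    n_systems: int       # number of reactions/systems in the subset
    mean_abs_ref: float  # mean absolute reference energy, kcal/mol
    mad: float           # mean absolute deviation of the method, kcal/mol


def wtmad2(subsets: Iterable[SubsetStats]) -> float:
    """Weighted mean absolute deviation, type 2, in kcal/mol."""
    subsets = list(subsets)
    total = sum(s.n_systems for s in subsets)
    if total == 0:
        raise ValueError("at least one system is required")
    # Subsets with small reference energies (e.g. noncovalent interactions)
    # are up-weighted so they are not swamped by large atomization energies.
    weighted = sum(
        s.n_systems * (GMTKN55_MEAN_ABS_REF / s.mean_abs_ref) * s.mad
        for s in subsets
    )
    return weighted / total


if __name__ == "__main__":
    # Invented illustrative data: one "large-energy" and one "small-energy" subset.
    demo = [
        SubsetStats(n_systems=10, mean_abs_ref=56.84, mad=1.0),
        SubsetStats(n_systems=30, mean_abs_ref=5.684, mad=0.2),
    ]
    print(f"WTMAD2 = {wtmad2(demo):.2f} kcal/mol")
```

Because of the scaling factor, a 0.2 kcal/mol error in a subset of weak interactions can count for more than a 1 kcal/mol error in a subset of large reaction energies, which is why WTMAD2 serves as a single figure of merit across chemically diverse data.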