Maximum Pairwise Pseudo-Likelihood Estimation of the Covariance Matrix from Left-Censored Data.

Michael P Jones, Sarah S Perry, Peter S Thorne
Published in: Journal of Agricultural, Biological, and Environmental Statistics (2014)
Toxicological studies often depend on laboratory assays that have thresholds below which environmental pollutants cannot be measured accurately. Exposure levels below this limit of detection may well be toxic, and hence it is vital to use data-analytic methods that handle such left-censored data with as little estimation bias as possible. In an ongoing study for which our methodology is developed, levels of residential exposure to polychlorinated biphenyls (PCBs) and the interrelationships of their subtypes (congeners) are characterized. In any given sample, many of the congeners may fall below the detection limit. The main problem tackled in this paper is to estimate mean exposure levels and the corresponding covariance and correlation matrices for a large number of potentially left-censored measures, using estimators that have very low bias and are computationally feasible. The proposed methods are likelihood based, using marginal likelihoods for means and variances and pairwise pseudo-likelihoods for correlations and covariances. In the simple bivariate case, head-to-head comparisons show the proposed methods to be computationally more stable than ordinary maximum likelihood estimates (MLEs) while maintaining comparable bias. When the number of variables is much larger than 2, the proposed methods are far more computationally feasible than MLE. Furthermore, they exhibit much less bias than popular imputation procedures. Analysis of the PCB data uncovered interesting correlational structures.
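The two-stage idea in the abstract (marginal likelihoods for means and variances, then a pairwise pseudo-likelihood for each correlation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each measure is normally distributed (for example after a log transformation), that censored values are recorded as `None`, and that each objective is unimodal over the search interval. All function names (`fit_marginal`, `fit_pair_rho`, `bvn_cdf`, `golden_min`, `simulate`) are hypothetical, and the golden-section optimizer and midpoint-rule integral are simple stand-ins chosen to keep the example dependency-free.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    # erfc stays accurate in the lower tail, where censored terms live.
    return 0.5 * math.erfc(-z / SQRT2)

def safe_log(p):
    return math.log(max(p, 1e-300))

def golden_min(f, lo, hi, tol=1e-5):
    """Golden-section search; assumes f is unimodal on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def fit_marginal(values, limit, sweeps=10):
    """Stage 1: censored-normal MLE of (mean, sd) for one variable.
    `values` holds floats, with None marking a value below `limit`."""
    obs = [v for v in values if v is not None]
    n_cens = len(values) - len(obs)
    mu = sum(obs) / len(obs)
    sd0 = math.sqrt(sum((v - mu) ** 2 for v in obs) / len(obs))
    log_sd0 = log_sd = math.log(sd0)
    lo_mu, hi_mu = mu - 3.0 * sd0, mu + 3.0 * sd0

    def nll(m, ls):
        s = math.exp(ls)
        # Censored rows contribute log P(X < limit).
        total = -n_cens * safe_log(norm_cdf((limit - m) / s))
        for v in obs:  # observed rows: normal log-density (constants dropped)
            z = (v - m) / s
            total += 0.5 * z * z + ls
        return total

    for _ in range(sweeps):  # coordinate descent over (mu, log sd)
        mu = golden_min(lambda m: nll(m, log_sd), lo_mu, hi_mu)
        log_sd = golden_min(lambda ls: nll(mu, ls), log_sd0 - 2.0, log_sd0 + 2.0)
    return mu, math.exp(log_sd)

def bvn_cdf(a, b, rho, panels=200):
    """P(Z1 < a, Z2 < b) for a standard bivariate normal (midpoint rule)."""
    s, lo = math.sqrt(1.0 - rho * rho), -8.0
    h = (a - lo) / panels
    xs = (lo + (k + 0.5) * h for k in range(panels))
    return h * sum(norm_pdf(x) * norm_cdf((b - rho * x) / s) for x in xs)

def fit_pair_rho(x, y, lim_x, lim_y, mx, sx, my, sy):
    """Stage 2: maximize the pairwise likelihood in rho, margins held fixed."""
    cx, cy = (lim_x - mx) / sx, (lim_y - my) / sy
    zx = [None if v is None else (v - mx) / sx for v in x]
    zy = [None if v is None else (v - my) / sy for v in y]
    pairs = list(zip(zx, zy))
    both = [(a, b) for a, b in pairs if a is not None and b is not None]
    x_only = [a for a, b in pairs if a is not None and b is None]
    y_only = [b for a, b in pairs if a is None and b is not None]
    n_none = len(pairs) - len(both) - len(x_only) - len(y_only)

    def nll(rho):
        s2 = 1.0 - rho * rho
        s = math.sqrt(s2)
        total = 0.5 * len(both) * math.log(s2)
        for a, b in both:      # both observed: bivariate normal density
            total += (a * a - 2.0 * rho * a * b + b * b) / (2.0 * s2)
        for a in x_only:       # y censored: conditional CDF given x
            total -= safe_log(norm_cdf((cy - rho * a) / s))
        for b in y_only:       # x censored: conditional CDF given y
            total -= safe_log(norm_cdf((cx - rho * b) / s))
        if n_none:             # both censored: bivariate normal CDF
            total -= n_none * safe_log(bvn_cdf(cx, cy, rho))
        return total

    return golden_min(nll, -0.99, 0.99)

def simulate(n, rho, limit, seed=0):
    """Standard bivariate normal draws, left-censored at `limit`."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        w = rho * u + math.sqrt(1.0 - rho * rho) * v
        xs.append(u if u >= limit else None)
        ys.append(w if w >= limit else None)
    return xs, ys

if __name__ == "__main__":
    x, y = simulate(500, 0.6, -0.5)
    mx, sx = fit_marginal(x, -0.5)
    my, sy = fit_marginal(y, -0.5)
    rho = fit_pair_rho(x, y, -0.5, -0.5, mx, sx, my, sy)
    print("means:", mx, my, "sds:", sx, sy, "rho:", rho, "cov:", rho * sx * sy)
```

With many variables, the same two stages would be repeated for every variable and every pair, so each optimization stays one- or two-dimensional regardless of how many measures there are. A matrix assembled pair by pair in this way is not guaranteed to be positive definite, which a fuller implementation would need to check.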
Keyphrases
  • electronic health record
  • big data
  • machine learning
  • data analysis
  • loop mediated isothermal amplification
  • heavy metals
  • risk assessment
  • human health
  • polycyclic aromatic hydrocarbons