Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are "discovered" when they are significantly differentially expressed in microarray data collected under differing conditions. In one prevalent approach, all genes are assumed to satisfy a null hypothesis, ℍ₀, of no difference in expression. A false discovery (type I error) occurs when ℍ₀ is incorrectly rejected. The quality of a detection algorithm is assessed by estimating its number of false discoveries, 𝔉. Work on second-moment modeling of the z-value histogram (which represents gene expression differentials) has shown that intergene expression correlation has significantly deleterious effects on the estimate of 𝔉. This paper suggests that nonlinear dependencies could likewise be important. With an applied emphasis, it extends the "moment framework" by including third-moment skewness corrections in an estimator of 𝔉. This estimator combines observed correlation (corrected for sampling fluctuations) with the information from easily identifiable null cases. Modeling nonlinear dependence reduces the estimation error relative to that of linear estimation. Third-moment calculations involve empirical densities of 3 × 3 covariance matrices estimated using very few samples. The principle of entropy maximization is employed to connect the estimated moments to inference about 𝔉. Model results are tested with BRCA and HIV data sets and with carefully constructed simulations.
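The effect of correlation on the false-discovery count can be illustrated with a toy simulation. This is a minimal sketch, not the estimator developed in the paper: it assumes every gene is null, that z-values are equicorrelated through a single shared Gaussian factor, and that a gene is "discovered" when |z| exceeds a fixed threshold. The function names, the one-factor model, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from math import erfc, sqrt


def simulate_false_discoveries(n_genes=2000, rho=0.0, threshold=2.5,
                               n_reps=500, seed=0):
    """Return one false-discovery count per simulated experiment.

    Assumption (not from the paper): all genes are null and their z-values
    share a pairwise correlation rho via a single common factor.
    """
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((n_reps, 1))        # common factor per experiment
    noise = rng.standard_normal((n_reps, n_genes))   # gene-specific noise
    # Each z-value is marginally N(0, 1); any two genes have correlation rho.
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise
    # Every rejection is a false discovery because all genes are null.
    return (np.abs(z) > threshold).sum(axis=1)


def sample_skewness(x):
    """Third standardized moment of a sample."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


if __name__ == "__main__":
    # Expected count under the null: n_genes * P(|Z| > threshold).
    expected = 2000 * erfc(2.5 / sqrt(2))
    print(f"expected count from N(0, 1) marginals: {expected:.1f}")
    for rho in (0.0, 0.3):
        counts = simulate_false_discoveries(rho=rho)
        print(f"rho={rho}: mean={counts.mean():.1f}, "
              f"sd={counts.std():.1f}, skewness={sample_skewness(counts):.2f}")
```

In this toy model the mean count is the same with and without correlation, because the marginal distribution of each z-value is unchanged. The spread and skewness of the count are what grow with correlation, which is why an estimate of 𝔉 that assumes independence can be far from the count realized in any single experiment.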