Time-irreversibility test for random-length time series: The matching-time approach applied to DNA.
Raúl Salgado GarcíaPublished in: Chaos (Woodbury, N.Y.) (2022)
In this work, we implement the so-called matching-time estimators for estimating the entropy rate as well as the entropy production rate for symbolic sequences. These estimators are based on recurrence properties of the system, which have been shown to be appropriate for testing irreversibility, especially when the sequences have large correlations or memory. Based on limit theorems for matching times, we derive a maximum likelihood estimator for the entropy rate by assuming that we have a set of moderately short symbolic time series of finite random duration. We show that the proposed estimator has several properties that make it adequate for estimating the entropy rate and entropy production rate (or for testing the irreversibility) when the sample sequences have different lengths, such as the coding sequences of DNA. We test our approach with controlled examples of Markov chains, non-linear chaotic maps, and linear and non-linear autoregressive processes. We also implement our estimators for genomic sequences to show that the degree of irreversibility of coding sequences in human DNA is significantly larger than that for the corresponding non-coding sequences.