\documentclass[12pt]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[unicode]{hyperref}
\usepackage{rawfonts}
\usepackage{times}
\usepackage{framed}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage[section]{placeins}
\usepackage{bm}
\usepackage[font=small]{caption}
\usepackage[raggedright]{titlesec}

\topmargin -0.5in
\hoffset -1in
\textheight 656pt
\footskip 48pt
\textwidth 7in
\oddsidemargin 0.75in
\renewcommand{\topfraction}{0.9}
\renewcommand{\textfraction}{0.1}

\usepackage{epsfig}
\usepackage{html}
\newcommand{\htmlexternallink}[2]{\htmladdnormallink{#1}{#2}}
\begin{htmlonly}
\newcommand{\htmlexternallink}[2]{%
 \htmladdnormallink{#1}{#2" target="other}}
\end{htmlonly}

\DeclareGraphicsExtensions{.pdf,.png}

\newcommand{\MSEmean}{MSE$_{\mu}$}
\newcommand{\MSEsd}{MSE$_{\sigma}$}
\newcommand{\MSEvar}{MSE$_{\sigma^2}$}
\newcommand{\MSEmad}{MSE$_{\mathit{MAD}}$}
\newcommand{\MSEMEAN}{\texorpdfstring{MSE$_{\bm\mu}$}{MSEμ}}
\newcommand{\MSESD}{\texorpdfstring{MSE$_{\bm\sigma}$}{MSEσ}}
\newcommand{\MSEVAR}{\texorpdfstring{MSE$_{\bm{\sigma^2}}$}{MSEσ2}}
\newcommand{\MSEMAD}{\texorpdfstring{MSE$_{\bm{\mathit{MAD}}}$}{MSEmad}}
\newcommand{\titlemath}[1]{\texorpdfstring{$\bm{#1}$}{#1}}
\newcommand{\apos}{'}
\newcommand{\qq}[1]{``#1''}
\newcommand{\FramedBox}[1]{%
  \begin{center}
    \begin{minipage}{0.9\textwidth}
      \begin{framed}
        #1
      \end{framed}
    \end{minipage}
  \end{center}
}

\begin{htmlonly}
\newcommand{\MSEmean}{MSE\begin{rawhtml}<sub><i>&mu;</i></sub>\end{rawhtml}}
\newcommand{\MSEsd}{MSE\begin{rawhtml}<sub><i>&sigma;</i></sub>\end{rawhtml}}
\newcommand{\MSEvar}{MSE\begin{rawhtml}<sub><i>&sigma;</i><sup>2</sup></sub>\end{rawhtml}}
\newcommand{\MSEmad}{MSE\begin{rawhtml}<sub>MAD</sub>\end{rawhtml}}
\newcommand{\MSEMEAN}{\MSEmean{}}
\newcommand{\MSESD}{\MSEsd{}}
\newcommand{\MSEVAR}{\MSEvar{}}
\newcommand{\MSEMAD}{\MSEmad{}}
\newcommand{\titlemath}[1]{$#1$}
\newcommand{\apos}{\begin{rawhtml}&rsquo;\end{rawhtml}}
\newcommand{\qq}[1]{\begin{rawhtml}&ldquo;\end{rawhtml}{}#1{}\begin{rawhtml}&rdquo;\end{rawhtml}}
\newcommand{\FramedBox}[1]{%
  \begin{rawhtml}<div style="border: 1px solid black; padding: 0.75em; margin: 1em">\end{rawhtml}
    #1
  \begin{rawhtml}</div>\end{rawhtml}
}
\newcommand{\le}{ $\begin{rawhtml}&le;\end{rawhtml}$ }
\newcommand{\ne}{ $\begin{rawhtml}&ne;\end{rawhtml}$ }
\newcommand{\approx}{ $\begin{rawhtml}&approx;\end{rawhtml}$ }
\newcommand{\vdots}{ $\begin{rawhtml}&vellip;\end{rawhtml}$ }
\newcommand{\colorbox}[2]{
  \begin{rawhtml}<span style="background-color: \end{rawhtml} #1 %
  \begin{rawhtml}">\end{rawhtml} %
  #2 %
  \begin{rawhtml}</span>\end{rawhtml}}
\end{htmlonly}

\renewcommand{\refname}{Selected Bibliography}

\begin{document}

%\preprint{}

\title{Generalized Multiscale Entropy (GMSE) Analysis: Quantifying the Structure of Time Series\apos{} Volatility}

\author{Madalena D. Costa and Ary L. Goldberger\\
Beth Israel Deaconess Medical Center, Boston, USA}

\date{}

\maketitle

\noindent
A detailed description of the generalized multiscale entropy algorithm
(GMSE) and its application can be found in:
\begin{itemize}
\item
Costa M., Goldberger A.L.
\htmladdnormallink{Generalized Multiscale Entropy Analysis: Application to Quantifying the Complex Volatility of Human Heartbeat Time Series}{https://www.mdpi.com/1099-4300/17/3/1197}. \emph{Entropy} 2015;\textbf{17}:1197-1203.
\end{itemize}
Please cite this publication when referencing this material, and also
include the standard citation for PhysioNet:
\begin{itemize}
\item
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB,
Peng C-K, Stanley HE.
\htmladdnormallink{PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals}{http://circ.ahajournals.org/content/101/23/e215.full}.
\emph{Circulation} 2000;\textbf{101}(23):e215-e220.
\end{itemize}
\begin{htmlonly}
A \htmladdnormallink{PDF}{tutorial.pdf} version of this tutorial is
also available.
\end{htmlonly}
The software described in this tutorial is available
\htmladdnormallink{here}{https://physionet.org/physiotools/gmse/gmse.c}.
\newpage

\section{Background}

The original multiscale entropy (MSE)
method~\cite{Costa2002,Costa2005} quantifies the complexity of the
temporal changes in one specific feature of a time series: the local
mean values of the fluctuations. The method comprises two steps: 1)
coarse-graining of the original time series, and 2) quantification of
the degree of irregularity of the coarse-grained (C-G) time series
using an entropy measure such as sample entropy
(SampEn)~\cite{Richman2000}.

The generalized multiscale entropy (GMSE) method~\cite{Costa2015}
quantifies the complexity of the dynamics of a set of features of the
time series related to local sample moments. The method differs from
the original MSE in the way that the C-G time series are computed. In
the original method, the mean value is used to derive a set of
representations of the original signal at different levels of
resolution. This choice implies that information encoded in features
related to higher moments is discarded. The coarse-graining procedure
in the generalized algorithm extracts statistical features such as the
variance (standard deviation [SD] or mean absolute deviation [MAD]),
skewness, kurtosis, etc., over a range of time scales. This tutorial
focuses primarily on the quantification of the information encoded in
fluctuations in standard deviation.

We use a subscript after MSE to designate the type of coarse-graining
employed. Specifically, \MSEmean{}, \MSEsd{} and \MSEvar{} refer to
MSE computed for mean, SD and variance C-G time series, respectively.
\smallskip

\FramedBox{
  \begin{center}
    \textbf{CONCEPTUAL FRAMEWORK OF MULTISCALE ENTROPY ALGORITHMS}
  \end{center}
  For a dynamical property of interest, such as mean or standard
  deviation, MSE algorithms comprise two sequential procedures:
  \begin{enumerate}
  \item
    Extracting representations of the system\apos{}s dynamics at different
    levels of resolution, i.e., deriving the coarse-grained time
    series, and
  \item
    Assessing the degree of unpredictability of the C-G time series
    using an entropy measure, e.g., sample entropy.%
  \end{enumerate}
  \vspace{-6pt}
}

As noted above, in the original MSE method (\MSEmean{}) the property of
interest is the local mean value. The C-G time series capture
fluctuations in local mean value for pre-selected time scales. In the
original application, such C-G time series were obtained by dividing
the original time series into non-overlapping segments of equal length
and calculating the mean value of each of them. However, other
approaches for extracting the same \qq{type} of information (local mean)
can also be considered, such as low-pass filtering the original time
series using Fourier analysis or applying the empirical mode
decomposition method.
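
The coarse-graining step can be sketched in a few lines of Python. This
is a minimal illustration and not the code of the GMSE program: the
function name \textsf{coarse\_grain}, the use of non-overlapping windows
for every metric, the dropping of a trailing partial window and the
$n-1$ divisor for the variance are all assumptions made for this example.

\begin{verbatim}
import math

def coarse_grain(x, scale, method="mean"):
    """Reduce each non-overlapping window of `scale` points to one value.

    Assumptions for this sketch: windows do not overlap, a trailing
    partial window is dropped, and the variance uses the n - 1 divisor.
    """
    if method != "mean" and scale < 2:
        raise ValueError("dispersion metrics need at least 2 points per window")
    out = []
    for start in range(0, len(x) - scale + 1, scale):
        window = x[start:start + scale]
        mean = sum(window) / scale
        if method == "mean":
            out.append(mean)
        elif method == "mad":   # mean absolute deviation
            out.append(sum(abs(v - mean) for v in window) / scale)
        elif method in ("var", "sd"):
            var = sum((v - mean) ** 2 for v in window) / (scale - 1)
            out.append(var if method == "var" else math.sqrt(var))
        else:
            raise ValueError("unknown method: " + method)
    return out

# Illustrative RR intervals (in seconds)
rr = [0.602, 0.609, 0.602, 0.625, 0.609, 0.625, 0.594, 0.602]
print(coarse_grain(rr, 4, "mean"))  # two local means
print(coarse_grain(rr, 4, "sd"))    # two local standard deviations
\end{verbatim}

Passing a different \textsf{method} changes only the statistic computed
in each window, which is the sense in which GMSE generalizes the
original procedure.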

The GMSE method expands the original MSE framework to other properties
of a signal. Here, we address the quantification of information
encoded in the fluctuations of the \qq{volatility} of the signal.

Figure 1 shows the interbeat interval (RR) time
series from a healthy subject, simulated 1/f noise and their SD C-G
time series for scales 5 and 20. The fluctuation patterns of the
physiologic C-G time series appear more unpredictable, \qq{less
uniform} and more \qq{bursty,} than those of simulated 1/f noise.

\begin{figure}[tbh]
  \begin{center}
  \includegraphics[width=0.9\textwidth]{Tutorial_Fig1}
  \end{center}
  \caption{Original (top panels) and standard deviation (SD)
    coarse-grained time series (TS), obtained using a moving window
    comprising five (middle panels) and 20 (bottom panels) data
    points, of the cardiac interbeat (RR) interval from a healthy
    subject (left panels) and a simulation of long-range correlated
    (1/f) noise (right panels).}
\end{figure}

Figure 2 shows \MSEsd{} (top panels) and \MSEvar{} (bottom panels)
analyses of physiologic and simulated long-range correlated (1/f)
noise time series. The physiologic time series are the RR intervals
(left panels) from healthy young to middle-aged ($\le$ 50 years) and
healthy older ($>$ 50 years) subjects and patients with chronic
(congestive) heart failure (CHF). The time series are available on
PhysioNet: i) 26 healthy young subjects and 46 healthy older subjects
(\htmladdnormallink{nsrdb}{https://physionet.org/physiobank/database/nsrdb/},
\htmladdnormallink{nsr2db}{https://physionet.org/physiobank/database/nsr2db/});
and ii) 32 patients with CHF class III and IV
(\htmladdnormallink{chfdb}{https://physionet.org/physiobank/database/chfdb/},
\htmladdnormallink{chf2db}{https://physionet.org/physiobank/database/chf2db/}).

\begin{figure}[tbh]
  \begin{center}
  \includegraphics[width=0.9\textwidth]{Tutorial_Fig2}
  \end{center}
  \caption{\MSEsd{} (top panels) and \MSEvar{} (bottom panels)
    analyses of RR interval time series (left panels) from healthy
    young to middle-aged ($\le$ 50 years), healthy older ($>$ 50
    years) subjects and patients with congestive (chronic) heart
    failure (CHF), and of simulated white and 1/f noise (right panels)
    time series. Symbols and error bars represent population mean and
    standard error values.  The $r$ value was 20\% of the SD of
    the C-G time series derived using a moving window comprising five
    data points (shortest scale). Physiologic and simulated time
    series comprised 50,000 and 131,072 ($2^{17}$) data points,
    respectively. The sample entropy parameter $m$ was 2.}
\end{figure}

Entropy over the pre-selected range of scales was higher for 1/f noise
than for white noise, both for SD and variance C-G time series.  With respect
to the RR interval time series, entropy values were on average higher
for the group of healthy young subjects than for the group of healthy
older subjects. In addition, the entropy values for the group of CHF
patients were, on average, the lowest.  The results were qualitatively
the same for SD and variance C-G time series.

These findings are consistent with those derived from traditional
(mean C-G) MSE analyses. They indicate that: 1) 1/f noise processes
are more complex than uncorrelated random ones; 2) the complexity of
heart rate dynamics degrades with aging and heart disease.

\section{Considerations regarding the selection of the parameter \titlemath{r} for the calculation of sample entropy}

In the sample entropy algorithm, the parameter $r$ is used to
determine whether two data points, $x_i$ and $x_j$, are
distinguishable or not. If $|x_i - x_j| {}\le{} r$, then $x_i$ and $x_j$ are
indistinguishable. Otherwise, they are \qq{seen} as two different data
points. In the \MSEmean{} algorithm, the $r$ value is traditionally
chosen as a percentage of the SD of the original time series,
typically a value between 15 and 20\%. An advantage of choosing $r$ in
this way is the fact that two time series with different amplitudes
but equal correlation properties are guaranteed to have the same
sample entropy and the same MSE values. In fact, calculating sample
entropy with an $r$ value that is a \emph{percentage} of the time
series\apos{} SD (e.g., 20\%) is equivalent to calculating sample entropy
with a \emph{fixed} $r$ value (0.2) of a previously normalized time
series.
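
The amplitude invariance described above can be checked numerically.
The following Python sketch is a direct, unoptimized implementation of
sample entropy written only for this illustration; the function names
and the test signal are assumptions, and it is not the code used by the
GMSE program. It compares a series with a copy of itself multiplied by
8, using $r$ = 20\% of each series\apos{} SD.

\begin{verbatim}
import math
import random

def sample_entropy(x, m, r):
    """Return (SampEn, matches of length m + 1, matches of length m)."""
    n = len(x)

    def count_matches(length):
        # Count template pairs (i < j) whose points all differ by at most r.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -math.log(a / b), a, b

def sd(x):
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(400)]   # uncorrelated test signal
y = [8.0 * v for v in x]                        # same dynamics, larger amplitude

print(sample_entropy(x, 2, 0.2 * sd(x)))
print(sample_entropy(y, 2, 0.2 * sd(y)))        # same values as the line above
\end{verbatim}

Because $r$ scales with the SD, every pairwise comparison has the same
outcome for both series, so the match counts and the entropy coincide.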

An important observation regarding the choice of $r$ as a percentage
of a time series\apos{} SD is the fact that two RR interval data points,
e.g., 625 and 633 ms, may be indistinguishable when analyzing a given
time series and distinguishable when analyzing another one. Consider
the time series A and B with SDs of 38 ms and 42 ms, respectively. If
$r$ = 20\% of the time series\apos{} SD, then $r$ = 7.6 ms and $r$ = 8.4 ms,
for A and B, respectively. Therefore, in one case, the RR intervals
625 and 633 ms are \qq{seen} as different (since $|625 - 633| = 8.0 > 7.6$),
and in the other case, as indistinguishable, i.e., below the accepted
level of noise (since $|625 - 633| < 8.4$). If one is interested in
quantifying entropy of two time series at the same level of
\qq{resolution,} then one should choose a \emph{fixed} (i.e., not
dependent on SD) $r$ value. However, in doing so, the following
consideration should be kept in mind.

Let us consider two time series $A$ and $B$ taking values from the sets
$\{{}a, b\}$ and $\{{}a, b, c\}$, respectively. Assume that both time
series are uncorrelated noise. For time series $A$, the probabilities of
$a$ and $b$ are both 1/2; and for time series $B$, the probabilities of $a$, $b$
and $c$ are 1/3. To simplify the presentation, here we use Shannon
entropy. For time series $A$, the entropy is:
\[
-p(a)\ln[p(a)] - p(b)\ln[p(b)] = -2\cdot 1/2 \cdot \ln(1/2) = -\ln(1/2)
\]
For time series $B$, the entropy is:
\[
-p(a)\ln[p(a)] - p(b)\ln[p(b)] - p(c)\ln[p(c)] = -3 \cdot 1/3 \cdot\ln(1/3)=-\ln(1/3)
\]
The entropy for time series $A$ is smaller than for time series $B$
simply because the size of the alphabet of time series $A$ is smaller
(two symbols, $a$ and $b$) than that of time series $B$ (three
symbols, $a$, $b$ and $c$).
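
As a numerical check of the two expressions above, and assuming more
generally an uncorrelated series whose $k$ symbols are equally likely
(a generalization added here for illustration), we have
\[
-\ln(1/2) = \ln 2 \approx 0.693, \qquad
-\ln(1/3) = \ln 3 \approx 1.099, \qquad
-\sum_{i=1}^{k} \frac{1}{k}\ln\frac{1}{k} = \ln k .
\]
The entropy thus grows with the logarithm of the alphabet size even
though the degree of randomness is unchanged.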

In conclusion, time series with a larger alphabet are more entropic
than those with a smaller alphabet and identical correlation
properties. Thus, when analyzing RR interval time series with a fixed
$r$ value, if one finds that a given time series is more entropic than
another, one cannot be sure what the source of the difference is.
Observed differences in entropy could be due to differences in the
degree of randomness of the time series, differences in their range of
values (larger/smaller alphabets) or a combination of the two.

Note that the two approaches discussed for choosing the $r$ value are
both justifiable since they provide complementary information.

Our first application of MSE using a fixed $r$ value was in a project
whose objective was to help forecast the need for lifesaving
interventions based solely on 15-min ECG signals: Cancio LC,
Batchinsky AI, Baker WL, et al. \htmladdnormallink{Combat casualties undergoing lifesaving interventions have decreased heart rate complexity at multiple time scales}{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4018756/}.
J Crit Care. 2013;28(6):1093-8.

In most studies employing \MSEmean{}, the $r$ value is set to a
percentage of the SD of the C-G time series for the smallest scale
included in the analysis. Typically, the smallest scale is scale
one. Thus, the $r$ value is a percentage of the original time series\apos{}
SD. This $r$ value is then used to calculate the sample entropy for
all other C-G time series. A similar approach is also recommended for
\MSEsd{}, \MSEvar{} and \MSEmad{}. However, in these cases, the first scale to be
analyzed is not scale one. Typically, one would choose to start at
scale five or above since the coarse-graining with windows with fewer
than five data points may not retain important information pertaining
to the degree of local volatility.

The results presented in Figure 2 for both SD and variance C-G time
series follow this approach: $r$ is 20\% of the SD of the C-G time
series for scale 5.

\MSEsd{}, \MSEvar{} and \MSEmad{} analyses can also be performed using
an $r$ value that is a percentage of the original time series\apos{}
SD. However, for the analysis of RR interval time series, a value
around 20\% is not adequate. Instead, values below 1\% are likely more
suitable. We illustrate the issue using the RR interval time series
shown in Fig. 1. The SD of this time series is 0.133 s. Twenty percent
of this value is 0.027 s. Two data points are distinguishable if the
difference between them is larger than 0.027 s. Consider the SD C-G
time series for scale 5 and select its median value, 0.0255 s. Only
14\% of the data points in this SD C-G time series satisfy the
condition: $|x_i - 0.0255| > 0.027$. In summary, the $r$ value derived
from the original time series and used for the analysis of SD C-G time
series is so large that 86\% of the points around the SD C-G median
value are indistinguishable.
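
The arithmetic behind this example can be laid out as follows. This is
a worked restatement of the numbers quoted above; the remark about the
lower bound is added here and relies only on the fact that an SD cannot
be negative.
\[
r = 0.20 \times 0.133~\mathrm{s} = 0.0266~\mathrm{s} \approx 0.027~\mathrm{s},
\qquad
|x_i - 0.0255| \le 0.027 \iff -0.0015 \le x_i \le 0.0525 .
\]
Since every value of an SD C-G time series is non-negative, all values
up to 0.0525 s, i.e., about twice the median, are treated as
indistinguishable from the median.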

Independent of which approach one chooses to follow ($r$ as a
percentage of the SD of the C-G time series for the first scale analyzed or as a
fixed value), an important consideration is whether or not the chosen
$r$ value is too restrictive or not restrictive enough. The GMSE
algorithm outputs the number of matches with $m$ and $m+1$
components. As a \qq{rule of thumb,} if the number of matches is less
than 50 for the largest scale analyzed, then the $r$ value should be
increased.

\section{\MSEMEAN{} and \MSEMAD{} analysis of RR interval time series from healthy young and older individuals and patients with CHF using a fixed \titlemath{r} value}

We show the MSE analysis (Fig. 3) of RR interval time series using i)
the mean and ii) the mean absolute deviation
($\mathit{MAD} = \sum |x_i-\bar{x}| / N$)
metrics for coarse-graining the time series. (The mean absolute
deviation is another measure of local variability.) The parameter $r$
is fixed at 8 ms. The results indicate that healthy young subjects
have the highest dynamical complexity, when considering both
fluctuations in the mean and the mean absolute deviation
(dispersion). The degrees of separation among the groups obtained with
\MSEmean{} and \MSEmad{} are qualitatively comparable. The area under
the curve (AUC) for younger versus older subjects is 0.85 and 0.88 for
\MSEmean{} and \MSEmad{}, respectively. The AUC for healthy older
individuals versus patients with CHF is 0.85 and 0.90 for \MSEmean{}
and \MSEmad{}, respectively.

\begin{figure}[tbh]
  \begin{center}
  \includegraphics[width=0.9\textwidth]{Tutorial_Fig3}
  \end{center}
  \caption{MSE analysis of mean (left) and mean absolute deviation
    (right) RR interval time series from healthy young and older
    subjects and patients with congestive heart failure, using a fixed
    $r$ (8 ms) value. The sample entropy parameter $m$ was 2. Time
    series length was 50,000 data points.}
\end{figure}

\section{Effects of long-term trends and outliers on \MSEMEAN{} and \MSESD{} \mbox{analyses}}

Long-term trends change the overall but not the local SD/variance of a
time series. As a consequence, trends mostly affect \MSEmean{}, not
\MSEsd{}, when the $r$ value is calculated as a percentage of the time
series\apos{} SD. \MSEmean{} and \MSEsd{} values are much less affected by
trends in implementations that use a fixed $r$ value.

Figure 4 illustrates the effects that linear trends superimposed on
normalized 1/f noise fluctuations have on \MSEmean{} and \MSEsd{}
values. We considered trends obtained by the concatenation of two
linear segments. In one case (red line) the values of the function
increase linearly from 4 to 8 over the first 30,000 data points and then
decrease linearly from 8 to 4 over the following 20,000 data
points. In the other case (green line), the rates of increase/decrease
were doubled. The superimposition of each of these trends on a 1/f
noise time series resulted in signals A and B shown on the first and
second panels, respectively.

\begin{figure}[b!]
  \begin{center}
  \includegraphics[width=0.9\textwidth]{Tutorial_Fig4}
  \end{center}
  \caption{Effects of long-term trend on \MSEmean{} and \MSEsd{}
    analyses of simulated 1/f noise time series, using variable and
    fixed $r$ values.}
\end{figure}

The SD of the original 1/f noise time series is one. The SDs of time
series A and B are 1.3 (30\% larger than the SD of the original
series) and 2.3 (130\% larger than the SD of the original series),
respectively.

In \MSEmean{} analyses using the 20\% of SD criterion, the absolute
values of $r$ are 0.2, 0.26 and 0.46 for the original time series and
for time series A and B, respectively. The increase in the absolute
value of $r$ due to the trends results in a higher number of matches
and consequently in lower entropy values. Specifically, for scale 1,
in the case of 1/f noise, the numbers of matches with $m$ = 2 and
$m$ = 3 were 28,348,070 and 5,827,479, respectively. For time series A
(B), the numbers of matches with $m$ = 2 and $m$ = 3 were 38,606,695
(60,511,987) and 10,973,294 (27,887,816), respectively.  The effects
of trends on \MSEmean{} can be obviated by using a fixed $r$ value.
However, when using $r$ = 20\% of SD, a barely noticeable trend can
significantly change the values of entropy.
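
To make the effect on the entropy values explicit, the match counts
quoted above can be converted to scale 1 sample entropy values,
assuming the standard definition $\mathrm{SampEn} = -\ln(n_3/n_2)$,
where $n_2$ and $n_3$ denote the numbers of matches with $m$ = 2 and
$m$ = 3. This notation is introduced here only for the illustration,
and the values are rounded to two decimals.
\[
\begin{array}{ll}
\mbox{1/f noise:} & -\ln(5827479/28348070) \approx 1.58 \\
\mbox{time series A:} & -\ln(10973294/38606695) \approx 1.26 \\
\mbox{time series B:} & -\ln(27887816/60511987) \approx 0.77
\end{array}
\]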

In \MSEsd{} analysis, the $r$ value is a percentage of the SD of the
SD coarse-grained time series obtained with a window of 5 data
points. Since the slow trend has only a small effect on the SD
coarse-grained time series, the changes in the entropy values are
negligible.

Figure 5 shows the results of \MSEmean{} and \MSEsd{} analyses of the
RR interval time series from the groups of healthy young and older
subjects and patients with CHF, using $r$ = 20\% SD. The first 50,000
data points of each recording ($\approx$ 14 hours) were selected for
analysis independent of the starting time (not available in most of
these cases). As such, the time series may include both awake and
sleep periods.

\begin{figure}[tbh]
  \begin{center}
  \includegraphics[width=0.9\textwidth]{Tutorial_Fig5}
  \end{center}
  \caption{\MSEmean{} and \MSEsd{} analyses of RR interval time series
    from healthy young and older subjects and patients with congestive
    heart failure, using $r$ = 20\% SD. The sample entropy parameter $m$
    was 2. Time series length was 50,000 data points.}
\end{figure}

Healthy individuals are more likely to have more pronounced circadian
variations than patients with CHF. Thus, the RR interval time series
from the former group are more likely to exhibit trends of higher
amplitude than the latter. These trends can explain the apparently
contradictory \MSEmean{} results that indicate higher dynamical
complexity for the CHF than the healthy older group. Such a conjecture
is supported by the \MSEmean{} results obtained using a fixed $r$
value as well as those of \MSEsd{} analyses using both fixed and
variable $r$ values.  These findings highlight the need to analyze the
data using different approaches in order to gain a deeper
understanding of the properties of the dynamics and minimize the
impact of known and unsuspected confounders.

The presence of outliers can also significantly affect \MSEsd{},
\MSEvar{} and \MSEmad{} analyses. For this reason it is important to
filter the time series prior to performing these analyses. The results
presented here of the RR interval time series analyses were obtained
using the filter described below. Alternatively, the use of a metric
such as median absolute deviation for coarse-graining (not implemented
here) could be considered.

\begin{figure}[tbh]
  \begin{center}
  \includegraphics[width=0.9\textwidth]{Tutorial_Fig6}
  \end{center}
  \caption{Effects of outliers on \MSEmean{} and \MSEsd{} analyses of
    simulated 1/f noise time series.}
\end{figure}

\newpage %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Software for GMSE analysis}

Download \htmladdnormallink{\textsf{gmse.c}}{https://physionet.org/physiotools/gmse/gmse.c},
the C language source for a program that performs multiscale entropy
analysis. The program can be compiled using any ANSI/ISO C compiler,
and should be linked to the C math library (it uses only the
\textsf{sqrt} function from that library). For example, using the
freely available GNU C compiler, \textsf{gmse.c} can be compiled into an
executable \textsf{gmse} by the command:

\begin{verbatim}
    gcc -o gmse -O gmse.c -lm
\end{verbatim}

\subsection*{Preparing data for GMSE analysis}

The program to compute GMSE processes text-formatted input files with
either one or two columns. Files with one column can be a list of RR
intervals or any time series of interest. Files with two columns
should contain the time of occurrence of each R wave, $t(i)$, and the
corresponding RR interval, $t(i+1) - t(i)$.  This type of RR interval
file can be obtained from a beat annotation file using
\htmladdnormallink{\textsf{ann2rr}}{https://www.physionet.org/physiotools/wag/ann2rr-1.htm}:
\begin{verbatim}
    ann2rr -r nsrdb/16265 -a atr -c -p N -P N  -i s3  -V s3 > 16265.RR
\end{verbatim}

\noindent
Example of a two-column input file (16265.RR):
\begin{center}
\begin{tabular}{cc}
$t(i)$ & $RR(i) = t(i+1) - t(i)$ \\
\hline
0.406 & 0.602 \\
1.008 & 0.609 \\
1.617 & 0.602 \\
2.219 & 0.625 \\
2.844 & 0.609 \\
3.453 & 0.625 \\
4.078 & 0.594 \\
4.672 & 0.602 \\
$\vdots$ & $\vdots$ \\
300.914 & 0.547 \\
\colorbox{yellow}{301.461} & \colorbox{yellow}{0.539} \\
\colorbox{yellow}{302.539} & 0.547 \\
303.086 & 0.539 \\
$\vdots$ & $\vdots$ \\
\end{tabular}
\end{center}

\noindent
The two-column file, 16265.RR, can be filtered using \textsf{filt}
from the \htmladdnormallink{HRV Toolkit}{https://physionet.org/tutorials/hrv-toolkit/}:
\begin{verbatim}
    filt 0.3 20 -x 0.3 2.0
\end{verbatim}

\noindent
In this example, all points below 0.3 and above 2.0 are first
removed. Next, to decide whether the data point $x_i$ should be
filtered out or kept, the mean value of the 20 data points to the left
and the 20 data points to the right is calculated. The central point
$x_i$ is deleted if it lies more than 30\% (0.3) below or above the
computed average value. The process is repeated for all data points.
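
The filtering rule just described can be sketched in Python as
follows. This fragment only mirrors the description in the text and is
not the source of \textsf{filt}: the function name, the handling of the
first and last 20 points (fewer neighbors are averaged) and the use of
the range-filtered series to compute the local average are assumptions.

\begin{verbatim}
def filter_rr(rr, frac=0.3, win=20, lo=0.3, hi=2.0):
    """Range filter followed by a local-average outlier filter (sketch)."""
    # Step 1: discard values outside the accepted range [lo, hi].
    kept = [v for v in rr if lo <= v <= hi]
    out = []
    for i, v in enumerate(kept):
        # Step 2: average up to `win` neighbors on each side of point i.
        neighbors = kept[max(0, i - win):i] + kept[i + 1:i + 1 + win]
        if not neighbors:
            out.append(v)
            continue
        avg = sum(neighbors) / len(neighbors)
        # Step 3: keep the point only if it is within frac of that average.
        if abs(v - avg) <= frac * avg:
            out.append(v)
    return out

rr = [0.6] * 30 + [1.0] + [0.6] * 30 + [2.5, 0.1]
print(len(rr), len(filter_rr(rr)))
\end{verbatim}

In this toy series the values 2.5 and 0.1 fail the range test, and the
isolated value 1.0 deviates from its local average by more than 30\%.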

When given two-column input files (time and RR intervals), the GMSE
program excludes RR intervals that are not consecutive. In the example
above, there is an interruption in the time series between 301.461 and
302.539 s (values highlighted in yellow):
302.539 $-$ 301.461 $=$ 1.078 $\ne$ 0.539.
Thus, the RR interval 0.539 will be excluded from the
analysis. In addition, no vectors comprising the intervals immediately
preceding (0.547) and following (0.547) the interruption will be
considered.
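
The consistency check implied by this example, $t(i) + RR(i) = t(i+1)$,
can be sketched in Python as follows. The function name and the
tolerance of 2 ms are assumptions made for this illustration; the
criterion actually coded in the GMSE program may differ.

\begin{verbatim}
def find_gaps(times, rr, tol=0.002):
    """Return indices i where t(i) + RR(i) differs from t(i+1) by more than tol."""
    return [i for i in range(len(times) - 1)
            if abs(times[i] + rr[i] - times[i + 1]) > tol]

# Rows around the interruption in the example file
times = [300.914, 301.461, 302.539, 303.086]
rr = [0.547, 0.539, 0.547, 0.539]
print(find_gaps(times, rr))   # index of the interval preceding the interruption
\end{verbatim}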

\subsection*{Summary of options and default values for GMSE}

\begin{description}
\item[--n:]
  largest scale.  Default: 20.
\item[--a:]
  difference between consecutive scales.  Default: 1.
\item[--c:]
  coarse-graining method: 1) mean; 2) standard deviation (SD); 3)
  variance; 4) mean absolute deviation.  Default: SD.
\item[--x:]
  sample entropy noise tolerance value: fixed. Two data points, $u_i$, $u_j$
  match if $|u_i - u_j| {}\le{} x$
\item[--r:]
  sample entropy noise tolerance value: a number between 0 and 1 that
  represents a fraction of SD. Two data points, $u_i$, $u_j$ match if
  $|u_i - u_j| {}\le{} r * SD$. Default: $r$ = 0.15. For mean
  coarse-graining analysis (traditional MSE), SD is the standard
  deviation of the original (scale 1) time series. For all other
  cases, SD is the standard deviation of the scale 5 (default)
  coarse-grained time series.
\item[--m:]
  sample entropy vector length.  Default: $m$ = 2.
\item[--i:]
  starting data point.  Default: 0.
\item[--I:]
  ending data point.  Default: end of file.
\end{description}

\subsection*{Examples of command lines}

\begin{enumerate}
\item
\verb|gmse -i 0 -I 50000 -c 1 < 16265.RR-filt > output|

Coarse-graining method: mean (traditional MSE). Sample entropy
parameters values $m$ = 2, $r$ = 0.15 (15\% of SD of the original time
series). The first 50,000 data points are considered.

The output values are:

{\latex{\small}\begin{tabular}{llll}
Scale	& SampEn	& $m_3$/$m_2$		& $r * SD$ \\
\hline
1	& 1.1897	& 7253987/23837718	& 0.013203 \\
2	& 1.3032	& 1600494/5891466	& 0.013203 \\
3	& 1.4956	& 473897/2114637	& 0.013203 \\
4	& 1.6527	& 179051/934800		& 0.013203 \\
5	& 1.6506	& 117176/610500		& 0.013203 \\
6	& 1.6294	& 85533/436301		& 0.013203 \\
7	& 1.6505	& 57324/298632		& 0.013203 \\
8	& 1.6164	& 47163/237466		& 0.013203 \\
9	& 1.5821	& 40749/198244		& 0.013203 \\
10	& 1.5895	& 31764/155689		& 0.013203 \\
11	& 1.5332	& 28626/132628		& 0.013203 \\
12	& 1.5297	& 24666/113877		& 0.013203 \\
13	& 1.5189	& 21291/97245		& 0.013203 \\
14	& 1.4899	& 18804/83427		& 0.013203 \\
15	& 1.4710	& 16822/73235		& 0.013203 \\
16	& 1.4687	& 14703/63866		& 0.013203 \\
17	& 1.4734	& 12977/56633		& 0.013203 \\
18	& 1.4482	& 11873/50527		& 0.013203 \\
19	& 1.4621	& 9992/43114		& 0.013203 \\
20	& 1.4319	& 9591/40155		& 0.013203 \\
\end{tabular}}

\item
\verb|gmse -i 0 -I 50000 -r 0.2 -c 2 < 16265.RR-filt > output|

Coarse-graining method: SD. Sample entropy parameters $m$ = 2,
$r$ = 0.20 (20\% of the SD of the coarse-grained time series for scale 5). This is
the command line used to derive the results shown on the left panel of
Fig. 2 and on the right panel of Fig. 5.

The output values are:

{\latex{\small}\begin{tabular}{llll}
Scale	& SampEn	& $m_3$/$m_2$	& $r * SD$ \\
\hline
5	& 1.4648	& 307814/1331797	& 0.003835 \\
6	& 1.5233	& 168245/771816	& 0.003835 \\
7	& 1.5829	& 99565/484806	& 0.003835 \\
8	& 1.6365	& 62961/323456	& 0.003835 \\
9	& 1.7058	& 40966/225554	& 0.003835 \\
10	& 1.7258	& 30256/169946	& 0.003835 \\
11	& 1.7605	& 21967/127745	& 0.003835 \\
12	& 1.8010	& 16915/102437	& 0.003835 \\
13	& 1.8103	& 13668/83541	& 0.003835 \\
14	& 1.8485	& 10580/67189	& 0.003835 \\
15	& 1.8309	& 9132/56979	& 0.003835 \\
16	& 1.8727	& 7565/49216	& 0.003835 \\
17	& 1.8786	& 6460/42275	& 0.003835 \\
18	& 1.8836	& 5632/37041	& 0.003835 \\
19	& 1.8833	& 4867/32000	& 0.003835 \\
20	& 1.9080	& 4180/28171	& 0.003835 \\
\end{tabular}}

\item
\verb|gmse -i 0 -I 50000 -x 0.008 -c 4 < 16265.RR-filt > output|

Mean absolute deviation is the chosen metric for
coarse-graining. Sample entropy parameters $m$ = 2, $r$ = 0.008 (fixed,
not a \% of SD).  This is the command line used to derive the results
plotted on the right side panel of Fig. 3.

The output values are:

{\latex{\small}\begin{tabular}{llll}
Scale	& SampEn	& $m_3$/$m_2$	& $r * SD$ \\
\hline
5	& 0.7032	& 3594655/7261796	& 0.008000 \\
6	& 0.7591	& 2008629/4291038	& 0.008000 \\
7	& 0.8139	& 1186303/2677237	& 0.008000 \\
8	& 0.8624	& 767179/1817262	& 0.008000 \\
9	& 0.9165	& 506521/1266507	& 0.008000 \\
10	& 0.9311	& 380264/964870	& 0.008000 \\
11	& 0.9756	& 273555/725658	& 0.008000 \\
12	& 0.9891	& 217627/585165	& 0.008000 \\
13	& 0.9915	& 177947/479594	& 0.008000 \\
14	& 1.0232	& 140291/390320	& 0.008000 \\
15	& 1.0296	& 117161/328049	& 0.008000 \\
16	& 1.0236	& 104222/290078	& 0.008000 \\
17	& 1.0395	& 89258/252412	& 0.008000 \\
18	& 1.0468	& 77024/219414	& 0.008000 \\
19	& 1.0332	& 68169/191564	& 0.008000 \\
20	& 1.0375	& 60541/170863	& 0.008000 \\
\end{tabular}}

\end{enumerate}
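
The SampEn column in the tables above appears to equal
$-\ln(m_3/m_2)$, where $m_3$ and $m_2$ are the reported match counts.
The following Python sketch checks this for the first row of each
table and applies the rule of thumb of at least 50 matches to the last
row (scale 20); the helper name is an assumption made for this
illustration.

\begin{verbatim}
import math

def sampen_from_counts(field):
    """Compute -ln(m3/m2) from a field such as '7253987/23837718'."""
    m3, m2 = (int(v) for v in field.split("/"))
    return -math.log(m3 / m2)

# (reported SampEn, m3/m2) taken from the first row of each table
first_rows = [(1.1897, "7253987/23837718"),   # mean, scale 1
              (1.4648, "307814/1331797"),     # SD, scale 5
              (0.7032, "3594655/7261796")]    # MAD, scale 5
for reported, field in first_rows:
    print(reported, round(sampen_from_counts(field), 4))

# Rule of thumb: at least 50 matches at the largest scale analyzed.
last_rows = ["9591/40155", "4180/28171", "60541/170863"]
enough = all(min(int(v) for v in f.split("/")) >= 50 for f in last_rows)
print(enough)
\end{verbatim}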

\begin{thebibliography}{14}
\bibitem{Costa2002}
Costa M, Goldberger AL, Peng C-K.
\htmladdnormallink{Multiscale entropy analysis of complex physiologic time series}{https://www.physionet.org/physiotools/mse/papers/prl-2002.pdf}.
\emph{Phys Rev Lett} 2002;89:068102.

\bibitem{Costa2005}
Costa M, Goldberger AL, Peng C-K.
\htmladdnormallink{Multiscale entropy analysis of biological signals}{https://www.physionet.org/physiotools/mse/papers/pre-2005.pdf}.
\emph{Phys Rev E} 2005;71:021906.

\bibitem{Richman2000}
Richman JS, Moorman JR.
Physiological time-series analysis using approximate entropy and sample entropy.
\emph{Am J Physiol Heart Circ Physiol} 2000;278:H2039-49.

\bibitem{Costa2015}
Costa MD, Goldberger AL.
\htmladdnormallink{Generalized multiscale entropy analysis: Application to quantifying the complex volatility of human heartbeat time series}{https://www.mdpi.com/1099-4300/17/3/1197}.
\emph{Entropy} 2015;17:1197-203.

\bibitem{Cancio2013}
Cancio LC, Batchinsky AI, Baker WL, Necsoiu C, Salinas J, Goldberger AL, Costa MD.
\htmladdnormallink{Combat casualties undergoing lifesaving interventions have decreased heart rate complexity at multiple time scales}{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4018756/}.
\emph{J Crit Care} 2013;28:1093-8.
\end{thebibliography}

\end{document}