
# Example

Here we provide a concise example of how to calculate an information-based similarity index between two time series. The following figure shows a sample heartbeat interval time series from a healthy subject (left panel), which exhibits complex variability. In contrast, a time series from a subject with congestive heart failure (CHF; right panel) shows less variability. Both sample time series contain 1000 inter-beat intervals. (See RR Intervals, Heart Rate, and HRV Howto for information on how to obtain additional inter-beat interval time series in this format.)

We first map each signal to a binary sequence according to the sign of the increment between consecutive inter-beat intervals. If we set the word length $m$ to 8, there are $2^8 = 256$ different 8-bit words. We count the occurrences of each 8-bit word and then sort the words by descending frequency. The resulting rank-frequency distribution represents the statistical hierarchy of repetitive patterns in a given time series. For example, the top-ranked 8-bit words correspond to the most frequently occurring patterns in a given heartbeat time series. In contrast, the lowest-ranked words correspond to the rarest patterns.
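The mapping and ranking steps described above can be sketched in a few lines of Python. This is only an illustration, not the code of the ibs program: the function name `rank_words` is hypothetical, and the sketch assumes that an increase is coded as 1 and any other change as 0, that words overlap and are shifted one bit at a time, and that ties in frequency are broken alphabetically.

```python
from collections import Counter


def rank_words(intervals, m=8):
    """Return (rank, word, probability) tuples, most frequent word first."""
    # Assumption: code an increase between consecutive intervals as "1",
    # and a decrease or no change as "0".
    bits = ["1" if b > a else "0" for a, b in zip(intervals, intervals[1:])]
    # Assumption: overlapping m-bit words, shifting one bit at a time.
    words = ["".join(bits[i:i + m]) for i in range(len(bits) - m + 1)]
    counts = Counter(words)
    total = len(words)
    # Sort by descending count; ties are broken by the word itself so the
    # result is deterministic (an arbitrary choice made for this sketch).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count / total)
            for rank, (word, count) in enumerate(ordered, start=1)]


# Toy series of six intervals, using 2-bit words to keep the output short.
for rank, word, prob in rank_words([0.8, 0.9, 0.85, 0.9, 0.95, 0.9], m=2):
    print(rank, word, prob)
```

Under these assumptions, a series of 1000 intervals yields 999 increments and therefore 992 overlapping 8-bit words. Words that never occur do not appear in the returned list and would have to be assigned a rank and a probability of zero separately.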

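Table: Rank, probability, and Shannon entropy of the first ten 8-bit words in each of the two time series.

| 8-bit word | Rank (series 1) | Rank (series 2) | Probability (series 1) | Probability (series 2) | Entropy (series 1) | Entropy (series 2) |
|---|---|---|---|---|---|---|
| 00000000 | 21 | 113 | 0.011089 | 0.002016 | 0.049919 | 0.012513 |
| 00000001 | 9 | 111 | 0.018145 | 0.002016 | 0.072750 | 0.012513 |
| 00000010 | 118 | 74 | 0.003024 | 0.005040 | 0.017544 | 0.026665 |
| 00000011 | 3 | 195 | 0.029234 | 0.000000 | 0.103267 | 0.000000 |
| 00000100 | 45 | 35 | 0.006048 | 0.009073 | 0.030895 | 0.042664 |
| 00000101 | 117 | 86 | 0.003024 | 0.004032 | 0.017544 | 0.022232 |
| 00000110 | 47 | 161 | 0.006048 | 0.001008 | 0.030895 | 0.006955 |
| 00000111 | 1 | 194 | 0.037298 | 0.000000 | 0.122667 | 0.000000 |
| 00001000 | 80 | 24 | 0.004032 | 0.012097 | 0.022232 | 0.053405 |
| 00001001 | 83 | 73 | 0.004032 | 0.005040 | 0.022232 | 0.026665 |
| … | … | … | … | … | … | … |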
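As a quick consistency check on the table, the last two values in each row appear to be the per-word entropy terms $-p \ln p$ computed from the two probabilities that precede them. The short sketch below recomputes them for the first row (word 00000000). The helper name `entropy_term` is hypothetical, and the use of the natural logarithm is an assumption inferred from the tabulated numbers rather than something stated in the text.

```python
import math


def entropy_term(p):
    """Return -p * ln(p), taking the term to be 0 when p is 0."""
    # Assumption: natural logarithm, inferred from the tabulated values.
    return 0.0 if p == 0 else -p * math.log(p)


# Probabilities copied from the first table row (word 00000000).
for p in (0.011089, 0.002016):
    print(f"p = {p:.6f}  ->  -p ln p = {entropy_term(p):.6f}")
```

Because the tabulated probabilities are themselves rounded to six decimal places, the recomputed values can differ from the table in the last printed digit.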

The rank order difference between two time series can be visualized by plotting the rank number of each 8-bit word in the first time series against that of the second time series. The dashed diagonal line indicates the case where the rank order of words for both time series is identical.

As the rank order comparison map above demonstrates, the "distance" (or dissimilarity) between any two time series can be quantified by measuring the scatter of these points from the diagonal line in the rank order comparison plot. Applying Eq. 1 to the rank-frequency lists obtained from the two sample time series yields an information-based similarity index of 0.412725. Using the example data files provided with the ibs software, this result may be reproduced by running the command:

```
ibs 8 healthy.txt chf.txt
```
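For readers who want to see how the rank differences and entropy terms combine, here is a minimal Python sketch. It is not the ibs program. Eq. 1 is given in the Methods section and is not reproduced here, so the sketch assumes it has the form of a weighted rank distance, $D_m = \frac{1}{2^m - 1} \sum_k |R_1(w_k) - R_2(w_k)| \, F(w_k)$, where $F(w_k)$ is the sum of the two entropy terms for word $w_k$ normalized to sum to 1 over all words. The function name `ibs_distance` and its argument layout are hypothetical.

```python
import math


def ibs_distance(ranks1, ranks2, probs1, probs2, m=8):
    """Entropy-weighted rank distance between two ranked word lists.

    All four lists are indexed by the same words, in the same order.
    Assumes at least one word has a nonzero probability.
    """
    def h(p):
        # Entropy term -p ln p, taken as 0 when the word never occurs.
        return 0.0 if p == 0 else -p * math.log(p)

    # Unnormalized weight of each word: sum of its two entropy terms.
    weights = [h(p1) + h(p2) for p1, p2 in zip(probs1, probs2)]
    z = sum(weights)  # normalization factor so the weights sum to 1
    total = sum(abs(r1 - r2) * w
                for r1, r2, w in zip(ranks1, ranks2, weights))
    # Assumed normalization by the number of words minus one, 2**m - 1.
    return total / (z * (2 ** m - 1))


# Toy check with 1-bit words: identical rankings give 0, swapped rankings
# of two equally likely words give the maximum distance.
print(ibs_distance([1, 2], [1, 2], [0.5, 0.5], [0.5, 0.5], m=1))
print(ibs_distance([1, 2], [2, 1], [0.5, 0.5], [0.5, 0.5], m=1))
```

Reproducing the value reported above would require the full 256-row rank and probability lists for both series, together with the tie-breaking and zero-probability conventions of the ibs program, none of which this sketch attempts to replicate.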

Albert Yang (ccyang@physionet.org)
2004-10-27