Next: References Up: ibsi Previous: Applications

Information-Based Similarity Software

The software may be obtained from here, where you will find ibs.c (the source for the software), a Makefile, two data files (healthy.txt and chf.txt) and a file named ibs.expected. Download all of these files.

If you have a make utility, you can compile and test the software by typing "make check" (look in Makefile to see what this command does). Otherwise, compile ibs.c and link it with the C standard math library (needed for the abs and log functions only). For example, if you use the GNU C compiler (recommended), you can do this with:

     gcc -o ibs -O ibs.c -lm

Test the program by running the command:

     ibs 8 healthy.txt chf.txt

If the current directory is not in your PATH, you may need to type the location of ibs, as in

     ./ibs 8 healthy.txt chf.txt

The output should match the contents of ibs.expected. For brief instructions about how to run the program, type its name at a command prompt:

     ibs

which should produce a message similar to:

    usage: ibs M SERIES1 SERIES2
      where M is the word length (an integer greater than 1), and
      SERIES1 and SERIES2 are one-column text files containing the
      data of the two series that are to be compared.  The output
      is the information-based similarity index of the input series
      evaluated for M-tuples (words of length M).
      For additional information, see

This program reads two text files of numbers, which are interpreted as the values of two time series. Within each series, pairs of consecutive values are compared to derive a binary series, whose values are either 1 (if the second value of the pair is greater than the first) or 0 (otherwise). A user-specified parameter, $m$ (M in the usage message above), determines the length of the "words" ($m$-tuples) to be analyzed by this program.
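The comparison of consecutive pairs can be sketched in C as follows. This is only an illustration of the rule described above, not an excerpt from ibs.c; the function name binarize and its argument layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from ibs.c.
   Derives a binary series from n input values: bits[i] is 1 if
   x[i+1] is greater than x[i], and 0 otherwise (equal values give 0).
   The caller must provide room for n-1 entries in bits.
   Returns the number of entries written (n-1, or 0 if n < 2). */
size_t binarize(const double *x, size_t n, unsigned char *bits)
{
    assert(x != NULL && bits != NULL);
    if (n < 2)
        return 0;
    for (size_t i = 0; i + 1 < n; i++)
        bits[i] = (x[i + 1] > x[i]) ? 1 : 0;
    return n - 1;
}
```

For example, the five values 1.0, 2.0, 2.0, 1.5, 3.0 yield the four-element binary series 1, 0, 0, 1. A series of n values always produces a binary series of n-1 values.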

Within each binary series, all $m$-tuples of consecutive values are treated as "words"; the program counts the occurrences of each of the $2^m$ possible "words" and then derives the word rank order frequency (WROF) list for the series. Finally, it calculates the information-based similarity between the two WROF lists and outputs this number. Depending on the input series and on the choice of $m$, the value of the index can vary between 0 (completely dissimilar) and 1 (identical).
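The counting and ranking steps might look like the sketch below. It is not the code in ibs.c: the names count_words, rank_words and struct word_count are hypothetical, the encoding of a word as an integer (earliest bit most significant) is an assumption, and so is the rule that equally frequent words are ranked by their integer value. The similarity calculation itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: count each of the 2^m possible words in a binary
   series. A word is encoded as an integer whose most significant bit is
   the earliest value (an assumption). counts must hold 2^m entries.
   Returns the number of words counted (nbits - m + 1, or 0). */
size_t count_words(const unsigned char *bits, size_t nbits, int m,
                   unsigned long *counts)
{
    assert(bits != NULL && counts != NULL && m > 1 && m < 31);
    size_t nwords = (size_t)1 << m;
    unsigned long mask = (unsigned long)(nwords - 1);
    for (size_t w = 0; w < nwords; w++)
        counts[w] = 0;
    if (nbits < (size_t)m)
        return 0;
    unsigned long word = 0;
    for (size_t i = 0; i < nbits; i++) {
        /* Slide the m-bit window forward by one value. */
        word = ((word << 1) | (bits[i] & 1u)) & mask;
        if (i + 1 >= (size_t)m)
            counts[word]++;
    }
    return nbits - (size_t)m + 1;
}

struct word_count { unsigned long word; unsigned long count; };

/* Sort by descending count; ties broken by ascending word value
   (the tie rule is an assumption made for this sketch). */
static int by_count_desc(const void *a, const void *b)
{
    const struct word_count *p = a, *q = b;
    if (p->count != q->count)
        return (p->count < q->count) ? 1 : -1;
    return (p->word > q->word) - (p->word < q->word);
}

/* Hypothetical helper: rank[w] is 1 for the most frequent word,
   2 for the next, and so on up to nwords. */
void rank_words(const unsigned long *counts, size_t nwords, size_t *rank)
{
    struct word_count *wc = malloc(nwords * sizeof *wc);
    assert(wc != NULL);
    for (size_t w = 0; w < nwords; w++) {
        wc[w].word = (unsigned long)w;
        wc[w].count = counts[w];
    }
    qsort(wc, nwords, sizeof *wc, by_count_desc);
    for (size_t r = 0; r < nwords; r++)
        rank[wc[r].word] = r + 1;
    free(wc);
}
```

With $m = 2$ and the binary series 1, 0, 1, 1, 0, 1, the five overlapping words are 10, 01, 11, 10, 01, so the counts for words 00, 01, 10, 11 are 0, 2, 2, 1. Under the tie rule assumed here, their ranks are 4, 1, 2, 3.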

Albert Yang (