
Information-Based Similarity Software

The software may be obtained from here, where you will find ibs.c (the source for the software), a Makefile, two data files (healthy.txt and chf.txt) and a file named ibs.expected. Download all of these files.

If you have a make utility, you can compile and test the software by typing "make check" (look in Makefile to see what this command does). Otherwise, compile ibs.c and link it with the C standard math library (needed for the abs and log functions only). For example, if you use the GNU C compiler (recommended), you can do this with:

     gcc -o ibs -O ibs.c -lm

Test the program by running the command:

     ibs 8 healthy.txt chf.txt

If the current directory is not in your PATH, you may need to give the path to ibs explicitly, as in:

     ./ibs 8 healthy.txt chf.txt

The output should match the contents of ibs.expected. For brief instructions about how to run the program, type its name at a command prompt:

    ibs

which should produce a message similar to:

    usage: ibs M SERIES1 SERIES2
      where M is the word length (an integer greater than 1), and
      SERIES1 and SERIES2 are one-column text files containing the
      data of the two series that are to be compared.  The output
      is the information-based similarity index of the input series
      evaluated for M-tuples (words of length M).
 
      For additional information, see
             http://physionet.org/physiotools/ibs/.

This program reads two text files of numbers, which are interpreted as the values of two time series. Within each series, each pair of consecutive values is compared to derive a binary series, whose values are either 1 (if the second value of the pair is greater than the first) or 0 (otherwise). A user-specified parameter, $m$, determines the length of the "words" ($m$-tuples) to be analyzed by this program.
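
The binarization step described above can be sketched in C as follows. This is a minimal illustration only, not code taken from ibs.c: the function name binarize, its signature, and the choice of unsigned char for the binary values are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from ibs.c): convert a series x[0..n-1] into a
   binary series b[0..n-2], where b[i] is 1 if x[i+1] > x[i] and 0 otherwise.
   Returns the number of binary values written (n - 1, or 0 if n < 2). */
size_t binarize(const double *x, size_t n, unsigned char *b)
{
    assert(x != NULL && b != NULL);
    if (n < 2)
        return 0;               /* no consecutive pairs to compare */
    for (size_t i = 0; i + 1 < n; i++)
        b[i] = (x[i + 1] > x[i]) ? 1 : 0;
    return n - 1;
}
```

Note that a series of n values yields n - 1 binary values, and that equal consecutive values map to 0, since only a strict increase produces a 1.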

Within each binary series, all $m$-tuples of consecutive values are treated as "words"; the program counts the occurrences of each of the $2^m$ possible "words" and then derives the word rank order frequency (WROF) list for the series. Finally, it calculates the information-based similarity between the two WROF lists, and outputs this number. Depending on the input series and on the choice of $m$, the value of the index can vary between 0 (completely dissimilar) and 1 (identical).
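
The word-counting and ranking steps can be sketched in C as shown here. Again, this is an illustration under stated assumptions rather than the code in ibs.c: the names count_words and rank_words are hypothetical, each word is encoded as an integer with the earliest value in the most significant bit, and ties in frequency are broken by word index (the text above does not say how ties are handled). The final similarity calculation is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from ibs.c): count occurrences of each m-bit word
   in the binary series b[0..n-1]. counts must have room for 2^m entries.
   Returns the number of words examined (n - m + 1, or 0 if n < m). */
size_t count_words(const unsigned char *b, size_t n, unsigned m,
                   unsigned long *counts)
{
    assert(m >= 1 && m < 8 * sizeof(size_t));
    size_t nwords = (size_t)1 << m;     /* 2^m possible words */
    size_t mask = nwords - 1;
    size_t w = 0;

    for (size_t i = 0; i < nwords; i++)
        counts[i] = 0;
    if (n < m)
        return 0;
    for (size_t i = 0; i < n; i++) {
        /* slide the window: shift in the newest bit, drop the oldest */
        w = ((w << 1) | (b[i] & 1u)) & mask;
        if (i + 1 >= m)                 /* window holds m values */
            counts[w]++;
    }
    return n - m + 1;
}

/* Hypothetical helper: rank[w] is 1 for the most frequent word, 2 for the
   next, and so on. Assumption: ties are broken by lower word index. */
void rank_words(const unsigned long *counts, size_t nwords, size_t *rank)
{
    for (size_t i = 0; i < nwords; i++) {
        size_t r = 1;
        for (size_t j = 0; j < nwords; j++)
            if (counts[j] > counts[i] ||
                (counts[j] == counts[i] && j < i))
                r++;
        rank[i] = r;
    }
}
```

The ranking loop is quadratic in the number of words, which is simple to read but grows quickly with $m$; sorting the word indices by count would be the usual choice for larger $m$.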


Albert Yang (ccyang@physionet.org)
2004-10-27