We begin with the file base, which can be downloaded from Neurotraces. The directory contains the file in both compressed (gzip) and uncompressed forms; the uncompressed file is about six times as large as the compressed one. [These files may also be downloaded from PhysioNet; choose base (uncompressed, 584 Kb) or base.gz (compressed, 97 Kb).] To avoid conflicts with other files, we will create a directory called Code and keep our files there. We assume that WFDB has already been installed.
The file was created by a Nihon Kohden EEG-1100. Similar files can be created with many types of modern neurophysiological equipment. The file is part of a polysomnographic recording, and one of its signals is an electrocardiographic signal. While viewing the recording, we selected a segment and stored it as an ASCII file.
The first step is to take a look at the content of the file:
By inspecting the content of the file we can see that it includes 3400 samples
with 19 channels. Since the sampling interval is 5 ms (equivalent to a sampling
rate of 200 Hz) we have 17 seconds of recording (3400 samples / 200
samples/second = 17 seconds). We also know that the values are expressed in
µV. The first lines describe the recording as well as the signals included
in the recording. In our case, the electrocardiographic signal is included in
the channel whose label is X1-X2. In the listing, each line has been folded
to fit the window width: the first line begins with TimePoints..., the
second with C3-A2..., the third with 9.56..., and the fourth with
-7.35....
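The arithmetic above can be checked with a short Python sketch. It assumes that the first line of base is a single line of key=value pairs, as it appears in the listing once unfolded; the helper name parse_header_line is ours and is not part of WFDB.

```python
def parse_header_line(line):
    """Split a 'key=value key=value ...' line into a dict of floats."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        fields[key] = float(value)
    return fields


# First line of base, unfolded (copied from the listing).
header = ("TimePoints=3400 Channels=19 BeginSweep[ms]=0.00 "
          "SamplingInterval[ms]=5.000 Bins/uV=1.000")

info = parse_header_line(header)

# A sampling interval in milliseconds gives the rate in Hz as 1000 / interval.
sampling_rate_hz = 1000.0 / info["SamplingInterval[ms]"]

# Duration is the number of samples divided by the sampling rate.
duration_s = info["TimePoints"] / sampling_rate_hz

print(sampling_rate_hz, duration_s)
```

The same two values (sampling rate and number of samples) are what we will later expect to find in the WFDB header file.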
Since it is a key point in exchanging information, I would like to discuss
briefly the use of ASCII files to share neurophysiological recordings:
A recording is represented in ASCII files as a matrix of values. Usually, each
column is a different signal, and the file contains as many columns as signals
are stored; each row represents the samples of these signals at the same time.
Considering these inconveniences, let us say something about their benefits.
In summary, the conversion from and to ASCII files is an important feature of
any format.
Of course, representing a recording as a matrix of rows and columns does not
readily allow a different sampling rate for each signal (to do this, we might
define a code to indicate that a signal was not sampled at the time
corresponding to a specific row), but even so I can foresee that ASCII files
are going to be used for a long time (unless XML is quickly and universally
adopted).
Our first task is to create a WFDB signal file from an ASCII file. To do this,
we have a very easy command: wrsamp (something like "write samples"). Most
WFDB applications show a short summary of how they are used if we type only
the name of the program as a command; wrsamp shows us this description of
itself:
There are a lot of interesting options. We have to indicate the sampling
frequency (200 Hz); otherwise the program will assume that the signal was
sampled at 250 Hz. Electroencephalographic and electromyographic amplitudes
are usually expressed in µV, so the gain option deserves more comment. Here
is a more detailed description of this option from the WFDB Applications Guide:
Our ASCII file contains sample values in µV, so there are 1000 A/D units
per millivolt, and we should therefore specify a gain of 1000. If we omit
this option, wrsamp writes the default gain of 200 to the header, and the
signal will appear five times larger than it really is. Another interesting
option is -x, which multiplies the input values by a scale factor. It is an
important option when our file contains values smaller than 1 (this is not
the case in our signal).
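To see why the gain matters, here is a minimal Python sketch of how a reader of the record might turn a stored sample back into millivolts. It assumes the simple relation "millivolts = A/D units / gain" with a baseline of zero; the function name and the sample value -66 are illustrative only.

```python
def adu_to_millivolts(sample_adu, gain_adu_per_mv):
    """Convert a stored sample (A/D units) to millivolts using the header gain."""
    return sample_adu / gain_adu_per_mv


# Hypothetical stored value for an input of about -66 microvolts
# (one A/D unit per microvolt, as in our ASCII file).
sample_adu = -66

correct_mv = adu_to_millivolts(sample_adu, 1000)  # gain passed with -G 1000
default_mv = adu_to_millivolts(sample_adu, 200)   # gain if -G is omitted

print(correct_mv, default_mv, default_mv / correct_mv)
```

With the default gain the same stored number is interpreted as an amplitude five times larger, although the signal file itself is identical in both cases.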
We know that the ECG is contained in column 9 of our ASCII file, base
(wrsamp numbers the columns beginning at 0). But will wrsamp be able to
detect that the first two lines of the file are not data? Let's see.
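The column number can be confirmed from the label line rather than counted by hand. The following Python sketch assumes that the second line of base is the whitespace-separated list of labels shown in the listing (unfolded here).

```python
# Second line of base, unfolded (copied from the listing).
labels_line = ("C3-A2 C4-A1 O1-A2 O2-A1 T1-A1 T2-A1 PG1-PG2 T5-P3 P4-T6 "
               "X1-X2 X3-X4 X5-X6 X7-X8 E-X9 E-X10 E-X11 DC01 DC02 DC03")

labels = labels_line.split()

# list.index is zero-based, which matches wrsamp's column numbering.
ecg_column = labels.index("X1-X2")

print(len(labels), ecg_column)
```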
wrsamp detected that the first two lines were not properly formatted and emitted a message for each of them. We are impatient to see the result:
We created two files: a binary signal file, ecg.dat, that stores the
digitized samples of the ECG signal, and a short text header file,
ecg.hea, that contains information that will be needed by any WFDB
application that reads the signal file.
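As an illustration of what the header file holds, the Python sketch below splits the two lines of ecg.hea into named fields. The field names reflect our reading of the WFDB header format (record line: name, number of signals, sampling frequency, number of samples; signal line: file name, storage format, gain, ADC resolution, ADC zero, initial value, checksum, block size, description). It is a simplified sketch that handles only a header laid out like this one, not a general parser.

```python
# The two lines of ecg.hea, as shown by "cat ecg.hea".
record_line = "ecg 1 200 3402"
signal_line = "ecg.dat 16 1000 12 0 0 -25694 0 base, column 9"

name, n_signals, fs, n_samples = record_line.split()
record = {
    "name": name,
    "signals": int(n_signals),
    "fs_hz": float(fs),
    "samples": int(n_samples),
}

# Split off the first eight fields; the remainder is the free-text description.
parts = signal_line.split(maxsplit=8)
signal = {
    "file": parts[0],
    "format": int(parts[1]),
    "gain": float(parts[2]),
    "adc_resolution": int(parts[3]),
    "adc_zero": int(parts[4]),
    "initial_value": int(parts[5]),
    "checksum": int(parts[6]),
    "block_size": int(parts[7]),
    "description": parts[8],
}

print(record)
print(signal)
```

The sampling frequency (200) and the gain (1000) are the values we passed to wrsamp with -F and -G.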
A typical WFDB application reads a record, which is a collection of
files that are all related to the same recording. It is important to
understand that the name of the record we have just created is ecg,
and not the name of either of the files that belong to this record. When
we read these files later on, we will refer to them by the record name,
ecg, and not by the names of the individual files.
We were lucky that wrsamp rejected the first two lines of base.
If we had chosen a different column number, one or both of these lines might
have been accepted, and our signal file would have a spurious sample or two at
its beginning. Looking back at wrsamp's options, we can see that -f
allows us to tell wrsamp where to begin; so in the future, if we know
that there are two header lines in our input file, we will add "-f 2" to our
wrsamp command.
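The idea of starting at a given line can be sketched in Python as follows. This is only an illustration of skipping header lines and picking one zero-based column; it is not how wrsamp is implemented. The data rows are the first four rows of the listing, truncated to their first ten fields to keep the example short, and extract_column is a hypothetical helper.

```python
def extract_column(lines, column, first_line=0):
    """Return one zero-based column as floats, starting at line first_line."""
    values = []
    for line in lines[first_line:]:
        fields = line.split()
        values.append(float(fields[column]))
    return values


lines = [
    # Line 0: recording description (header).
    "TimePoints=3400 Channels=19 BeginSweep[ms]=0.00 "
    "SamplingInterval[ms]=5.000 Bins/uV=1.000",
    # Line 1: signal labels (header).
    "C3-A2 C4-A1 O1-A2 O2-A1 T1-A1 T2-A1 PG1-PG2 T5-P3 P4-T6 "
    "X1-X2 X3-X4 X5-X6 X7-X8 E-X9 E-X10 E-X11 DC01 DC02 DC03",
    # Lines 2 to 5: data rows, truncated to the first ten fields.
    "9.56 21.32 -1.47 11.76 -8.82 -8.09 -26.47 1.47 13.97 -66.18",
    "-7.35 -35.29 -7.35 1.47 -7.35 37.50 -22.06 1.47 19.12 0.00",
    "-4.41 -7.35 -12.50 4.41 -2.94 18.38 -2.94 1.47 -42.65 0.00",
    "2.94 -0.74 -2.21 8.09 -4.41 4.41 32.35 1.47 -63.97 -22.06",
]

# Skip the two header lines, as "-f 2" would, and take column 9 (X1-X2).
ecg_uv = extract_column(lines, column=9, first_line=2)
print(ecg_uv)
```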
We now have a WFDB record containing the ECG of our recording. We
are interested in measuring the heart rate from this signal. To do so, we
are going to use the command sqrs:
A new file (ecg.qrs) has been added to the ecg record. It is an
annotation file that contains the positions of the QRS complexes. We can read
the annotations with rdann:
Each line corresponds to one detected QRS complex.
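From the annotation times we can estimate the heart rate. The Python sketch below takes the sample numbers of the first nine annotations in the rdann listing and assumes the 200 Hz sampling frequency of our record; averaging the RR intervals is just one simple way to summarize them.

```python
FS_HZ = 200.0  # sampling frequency of the ecg record

# Sample numbers of the first nine QRS annotations (second column of rdann).
qrs_samples = [22, 157, 290, 423, 558, 690, 822, 955, 1089]

# RR interval: time between consecutive QRS complexes, in seconds.
rr_s = [(b - a) / FS_HZ for a, b in zip(qrs_samples, qrs_samples[1:])]

mean_rr_s = sum(rr_s) / len(rr_s)
heart_rate_bpm = 60.0 / mean_rr_s

print(rr_s)
print(round(heart_rate_bpm, 1))
```

For these first few seconds the mean RR interval is about 0.67 s, which corresponds to roughly 90 beats per minute.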
Let us recapitulate what we did in this section:
But how can we be confident of the result? WFDB has a very nice tool to view
and edit the result: wave. In the next section we are going to edit the
result by using it.
The WFDB Software Package includes two QRS detectors, named sqrs and
wqrs, and PhysioToolkit offers another one, named ecgpuwave.
All of them are used in a similar way, and all of them create an annotation
file containing the times of the QRS complexes that they detect. Each has
advantages for some types of studies; you can read more about them in the
WFDB Applications Guide.
[j@localhost Code]$ cat base | more
TimePoints=3400 Channels=19 BeginSweep[ms]=0.00 Sampli
ngInterval[ms]=5.000 Bins/uV=1.000
C3-A2 C4-A1 O1-A2 O2-A1 T1-A1 T2-A1 PG1-PG2 T5-P3 P4-T
6 X1-X2 X3-X4 X5-X6 X7-X8 E-X9 E-X10 E-X11 DC01 DC02 D
C03
9.56 21.32 -1.47 11.76 -8.82 -8.09
-26.47 1.47 13.97 -66.18 -6.62 -12.50
-44.12 33.09 -17.65 44.85 -923529.41 -2205.8
8 -376470.59
-7.35 -35.29 -7.35 1.47 -7.35 37.50
-22.06 1.47 19.12 0.00 -3.68 -10.29
-22.06 31.62 -18.38 43.38 -924264.71 -2941.1
8 -375735.29
-4.41 -7.35 -12.50 4.41 -2.94 18.38
-2.94 1.47 -42.65 0.00 -1.47 -2.94
-55.15 31.62 -19.12 43.38 -924264.71 -2941.1
8 -375000.00
2.94 -0.74 -2.21 8.09 -4.41 4.41
32.35 1.47 -63.97 -22.06 0.00 3.68
-121.32 31.62 -16.91 44.12 -924264.71 -3676.4
7 -376470.59
-1.47 -8.09 -4.41 8.82 1.47 9.56
--More--
ASCII files as glue between applications
Creating a WFDB file
[j@localhost Code]$ wrsamp
usage: wrsamp [OPTIONS ...] COLUMN [COLUMN ...]
where COLUMN selects a field to be copied (leftmost field is column 0),
and OPTIONS may include:
-c check that each input line contains the same number of fields
-f N start copying with line N (default: 0)
-F FREQ specify frequency to be written to header file (default: 250)
-G GAIN specify gain to be written to header file (default: 200)
-h print this usage summary
-i FILE read input from FILE (default: standard input)
-l LEN read up to LEN characters in each line (default: 1024)
-o RECORD save output in RECORD.dat, and generate a header file for
RECORD (default: write to standard output in format 16, do
not generate a header file)
-r RSEP interpret RSEP as the input line separator (default: \n)
-s FSEP interpret FSEP as the input field separator (default: space
or tab)
-t N stop copying at line N (default: end of input file)
-x SCALE multiply all inputs by SCALE (default: 1)
-G n:
Specify the gain (in A/D units per millivolt) for the output
signals (default: 200). This option is useful only in
conjunction with -o, since it affects the output header
file only. This option has no effect on the output signal
file. If you wish to rescale samples in the signal file, use -x.
[j@localhost Code]$ wrsamp -i base -F 200 -G 1000 -o ecg 9
wrsamp: line 0, column 9 missing
wrsamp: line 1, column 9 improperly formatted
[j@localhost Code]$ ls ecg*
ecg.dat ecg.hea
[j@localhost Code]$ cat ecg.hea
ecg 1 200 3402
ecg.dat 16 1000 12 0 0 -25694 0 base, column 9
Analyzing the files
[j@localhost Code]$ sqrs -r ecg
[j@localhost Code]$ ls ecg*
ecg.dat ecg.hea ecg.qrs
[j@localhost Code]$ rdann -r ecg -a qrs | more
0:00.110 22 N 0 0 0
0:00.785 157 N 0 0 0
0:01.450 290 N 0 0 0
0:02.115 423 N 0 0 0
0:02.790 558 N 0 0 0
0:03.450 690 N 0 0 0
0:04.110 822 N 0 0 0
0:04.775 955 N 0 0 0
0:05.445 1089...
In summary
j
2002-12-11