Several databases of ECG recordings are generally available
for evaluating ECG analyzers. They serve several important needs:
- They contain representative signals. Wide variations in
ECG characteristics among subjects severely limit the value of
synthesized waveforms for testing purposes. Realistic tests of ECG
analyzers require large sets of "real-world" signals.
- They contain rarely observed but clinically significant
signals. Although recordings of common ECG abnormalities are not
particularly difficult to obtain, the most clinically significant
abnormalities are often rarely recorded. Both developers and evaluators of
ECG analyzers need examples of such recordings.
- They contain standard signals. System comparisons
are meaningless unless performance is measured using the same test
data in each case, since performance is so strongly data-dependent.
- They contain annotated signals. Typically, each QRS
complex has been manually annotated by two or more cardiologists
working independently. The reference annotations produced
as a result serve as a "gold standard" against which a device's
analysis can be compared quantitatively.
- They contain digitized, computer-readable signals. It is
therefore possible to perform a fully automated, strictly reproducible
test in the digital domain if desired, allowing one to establish with
certainty the effects of algorithm modifications on performance.
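As an illustration of the quantitative comparison that reference annotations make possible, the following minimal Python sketch pairs a detector's beat times with reference beat times and derives two common summary statistics. The function names and the 150 ms matching tolerance are assumptions made for this example only; they are not taken from the text above, and a real evaluation should follow the protocol prescribed for the database in use.

```python
def match_beats(reference, detected, tolerance=0.150):
    """Pair reference and detected beat times (seconds, sorted ascending).

    The tolerance (in seconds) is an assumed value for illustration.
    Returns (true_positives, false_negatives, false_positives).
    """
    tp = fn = fp = 0
    i = j = 0
    while i < len(reference) and j < len(detected):
        diff = detected[j] - reference[i]
        if abs(diff) <= tolerance:
            tp += 1  # detection matches a reference beat
            i += 1
            j += 1
        elif diff < 0:
            fp += 1  # detection with no reference beat nearby
            j += 1
        else:
            fn += 1  # reference beat that the detector missed
            i += 1
    # Anything left over on either side is unmatched.
    fn += len(reference) - i
    fp += len(detected) - j
    return tp, fn, fp


def sensitivity(tp, fn):
    """Fraction of reference beats that were detected."""
    return tp / (tp + fn) if tp + fn else float("nan")


def positive_predictivity(tp, fp):
    """Fraction of detections that correspond to reference beats."""
    return tp / (tp + fp) if tp + fp else float("nan")


if __name__ == "__main__":
    ref = [1.0, 2.0, 3.0, 4.0]      # hypothetical reference annotations
    det = [1.02, 2.5, 3.05, 4.0]    # hypothetical detector output
    tp, fn, fp = match_beats(ref, det)
    print(tp, fn, fp, sensitivity(tp, fn), positive_predictivity(tp, fp))
```

Because both inputs are stored digitally, rerunning this comparison after an algorithm change yields exactly comparable counts, which is the reproducibility benefit described above.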
At present, the following ECG databases are available:
- AHA DB: The American Heart Association Database for
Evaluation of Ventricular Arrhythmia Detectors (80 records, 35 minutes
each)
- MIT DB: The Massachusetts Institute of Technology-Beth
Israel Hospital Arrhythmia Database (48 records, 30 minutes each)
- ESC DB: The European Society of Cardiology ST-T
Database (90 records, two hours each)
- NST DB: The Noise Stress Test Database (12 records, 30
minutes each; supplied with the MIT DB)
- CU DB: The Creighton University Sustained Ventricular
Arrhythmia Database (35 records, 8 minutes each; supplied on the
second edition of the MIT DB CD-ROM)
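To put the sizes listed above in perspective, the short Python sketch below converts each record count and per-record duration into total recording hours. The dictionary simply restates the figures from the list; the variable and function names are illustrative.

```python
# (number of records, minutes per record), as given in the list above
databases = {
    "AHA DB": (80, 35),
    "MIT DB": (48, 30),
    "ESC DB": (90, 120),
    "NST DB": (12, 30),
    "CU DB": (35, 8),
}


def total_hours(records, minutes_each):
    """Total recording time in hours for one database."""
    return records * minutes_each / 60


if __name__ == "__main__":
    for name, (records, minutes_each) in databases.items():
        print(f"{name}: {total_hours(records, minutes_each):.1f} hours")
```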
Each of these databases represents a very substantial effort by many
workers; in particular, the AHA, MIT, and ESC databases each required
more than five years of sustained effort by large teams of researchers
and clinicians from many institutions. Nevertheless, it should be
recognized that even these databases do not fully represent the
variety of "real-world" ECGs observed in clinical practice.
Although these databases permit standardized, quantitative, automated,
and fully reproducible evaluations of analyzer performance, it is
risky to extrapolate from the results of such evaluations to
expectations of real-world performance. Such extrapolations can be
particularly error-prone if the evaluation data were also used for
development of the analysis algorithm, since the algorithm may have
been (perhaps unintentionally) "tuned" to its training set. It
should also be noted that the first four of the databases listed above were
obtained from Holter ECG recordings. Although the frequency response
of the Holter recording technique is not usually a limiting factor in
the performance of an ECG analyzer, it may tend to favor devices that
are designed to analyze Holter recordings over devices that have been
designed to analyze higher-fidelity input signals.
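One way to limit the risk of tuning an algorithm to its evaluation data, as cautioned above, is to fix a partition of the records before development begins and to reserve one part strictly for final testing. The Python sketch below shows a reproducible split; the function name, the record names, the 50% test fraction, and the seed are all hypothetical choices for illustration, not a procedure described in this text.

```python
import random


def split_records(record_names, test_fraction=0.5, seed=0):
    """Deterministically split record names into development and test sets.

    A fixed seed makes the partition reproducible, so the held-out records
    can be kept out of algorithm development entirely.
    """
    names = sorted(record_names)          # order-independent starting point
    rng = random.Random(seed)
    rng.shuffle(names)
    n_test = round(len(names) * test_fraction)
    test = sorted(names[:n_test])
    development = sorted(names[n_test:])
    return development, test


if __name__ == "__main__":
    records = [f"rec{i:02d}" for i in range(10)]   # hypothetical record names
    development, test = split_records(records)
    print("development:", development)
    print("test:", test)
```

A split of this kind does not address the Holter frequency-response issue, and it cannot make the held-out records more representative of clinical practice than the database itself.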
George B. Moody (george@hstbme.mit.edu)
Sat May 24 04:20:05 EDT 1997