The Long-Term ST Database


When referencing this material, please cite:

Franc Jager, Alessandro Taddei, George B. Moody, Michele Emdin, Gorazd Antolic, Roman Dorn, Ales Smrdel, Carlo Marchesi, and Roger G. Mark. Long-term ST database: a reference for the development and evaluation of automated ischaemia detectors and for the study of the dynamics of myocardial ischaemia. Medical & Biological Engineering & Computing 41(2):172-183 (2003).

Please also include the standard citation for PhysioNet:

Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23):e215-e220 [Circulation Electronic Pages; http://circ.ahajournals.org/content/101/23/e215.full]; 2000 (June 13).

The Long-Term ST Database contains 86 lengthy ECG recordings of 80 human subjects, chosen to exhibit a variety of ST segment changes, including ischemic ST episodes, axis-related non-ischemic ST episodes, episodes of slow ST level drift, and episodes containing mixtures of these phenomena. The database was created to support the development and evaluation of algorithms that can accurately differentiate ischemic from non-ischemic ST events, as well as basic research into the mechanisms and dynamics of myocardial ischemia.

Half (43) of these 86 recordings, representing 42 of the 80 subjects, were contributed to PhysioNet by the creators of the database in February 2003, and the remaining half of the database was contributed in May 2007. (A corrected version of s30801.dat was also posted together with the second half of the database.) Detailed clinical notes and ST deviation trend plots are provided for all 86 records. The entire Long-Term ST Database is also available from its original home page at the Laboratory for Biomedical Computer Systems and Imaging at the University of Ljubljana, Slovenia.

The individual recordings of the Long-Term ST Database are between 21 and 24 hours in duration, and contain two or three ECG signals. Each ECG signal has been digitized at 250 samples per second with 12-bit resolution over a range of ±10 millivolts. Each record includes a set of meticulously verified ST episode and signal quality annotations, together with additional beat-by-beat QRS annotations and ST level measurements.
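As a rough worked example of what these digitization parameters imply, the sketch below computes the nominal voltage step of a 12-bit converter spanning ±10 millivolts and the number of samples in one signal. The function names are illustrative only, and the figures are nominal: the gain that actually applies to a given record should be taken from that record's own header, not from this calculation.

```python
# Nominal figures derived from the digitization parameters stated above.
SAMPLING_RATE_HZ = 250   # samples per second per signal
ADC_BITS = 12            # converter resolution
INPUT_RANGE_MV = 20.0    # +/-10 mV spans 20 mV in total


def nominal_resolution_uv(bits=ADC_BITS, range_mv=INPUT_RANGE_MV):
    """Voltage represented by one ADC step, in microvolts."""
    return range_mv * 1000.0 / (2 ** bits)


def samples_per_signal(hours, rate_hz=SAMPLING_RATE_HZ):
    """Number of samples in one signal of the given duration."""
    return int(hours * 3600 * rate_hz)


if __name__ == "__main__":
    print(f"{nominal_resolution_uv():.2f} uV per ADC step")
    print(samples_per_signal(21), "to", samples_per_signal(24),
          "samples per signal")
```

Under these assumptions one ADC step is a little under 5 µV, which is small compared with the 50 µV threshold used in the episode criteria.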

For each recording, the first digit in the record name (2 or 3) indicates the number of ECG signals. Records obtained from the same subject have names that differ in the last digit only.
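The naming convention can be read off mechanically. The following is a minimal sketch, assuming record names consist of a letter prefix followed by digits, as in s30801; the helper names and the other record names in the usage example are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def signal_count(record_name):
    """Return the number of ECG signals encoded in a record name.

    Assumes the first digit in the name is 2 or 3, as described above.
    """
    digits = [ch for ch in record_name if ch.isdigit()]
    if not digits or digits[0] not in "23":
        raise ValueError(f"unexpected record name: {record_name!r}")
    return int(digits[0])


def group_by_subject(record_names):
    """Group records whose names differ in the last digit only."""
    groups = defaultdict(list)
    for name in record_names:
        groups[name[:-1]].append(name)  # drop the last digit to get a subject key
    return dict(groups)


if __name__ == "__main__":
    # Illustrative names only; s30801 is the one mentioned in the text.
    names = ["s30801", "s30802", "s20011"]
    for name in names:
        print(name, "->", signal_count(name), "signals")
    print(group_by_subject(names))
```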

Each record is represented by 12 files, all with the same base name (the record name) and a suffix that identifies the file type:

The measurements in the .16a files were used to construct ST level and deviation functions for each signal, as recorded in the .stf files. (Further details about the .stf, tsr.zip, and .klt.zip files are available here.) ST episodes were identified independently for each signal, based on its ST deviation function and on these criteria:

  1. An episode begins when the magnitude of the ST deviation function first exceeds 50 µV;
  2. The deviation must reach a magnitude of Vmin or more throughout a continuous interval of at least Tmin;
  3. The episode ends when the deviation becomes smaller than 50 µV, provided that it does not exceed 50 µV in the following 30 seconds.

Since differing criteria may be appropriate depending on the application, three sets of ST episode annotations are provided. The annotation codes used in the .sta, .stb, .stc, and .16a files are described here.
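The three criteria can be sketched in code as follows. This is an illustrative reading of the rules, not the procedure used to produce the reference annotations: it assumes a uniformly sampled ST deviation function in microvolts, treats each sample as covering dt_s seconds, and takes Vmin and Tmin as caller-supplied parameters (v_min_uv and t_min_s), since their values for the three annotation sets are not given here. All names are hypothetical.

```python
def detect_st_episodes(deviation_uv, dt_s, v_min_uv, t_min_s,
                       v_onset_uv=50.0, quiet_s=30.0):
    """Return (start, end) sample index pairs, end exclusive.

    deviation_uv: ST deviation samples in microvolts, spaced dt_s seconds apart.
    """
    mag = [abs(v) for v in deviation_uv]
    n = len(mag)
    quiet_n = int(round(quiet_s / dt_s))         # samples in the 30 s look-ahead
    need_n = max(1, int(round(t_min_s / dt_s)))  # samples needed to span Tmin
    episodes = []
    i = 0
    while i < n:
        if mag[i] <= v_onset_uv:
            i += 1
            continue
        start = i  # criterion 1: magnitude first exceeds 50 uV
        # Criterion 3: end at the first sample below 50 uV that is not
        # followed by a value above 50 uV within the next quiet_s seconds.
        # (Simple quadratic scan; adequate for a sketch.)
        j = start + 1
        while j < n:
            following = mag[j + 1:j + 1 + quiet_n]
            if mag[j] < v_onset_uv and all(m <= v_onset_uv for m in following):
                break
            j += 1
        end = j  # equals n if the data end before the episode does
        # Criterion 2: longest continuous run at or above Vmin.
        run = best = 0
        for m in mag[start:end]:
            run = run + 1 if m >= v_min_uv else 0
            best = max(best, run)
        if best >= need_n:
            episodes.append((start, end))
        i = end
    return episodes


if __name__ == "__main__":
    # Synthetic deviation function sampled once per second (illustrative only).
    signal = [0.0] * 10 + [120.0] * 40 + [0.0] * 40
    print(detect_st_episodes(signal, dt_s=1.0, v_min_uv=100.0, t_min_s=30.0))
```

Because each signal has its own deviation function, such a routine would be run once per signal, and once per set of Vmin and Tmin values.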

For each record, the numbers of ST episodes as determined by each of the three sets of criteria are summarized in an additional text file (with suffix .cnt). The deviation functions and the locations of the episodes are presented graphically in a set of trend plots here. Each record is represented by a 24-hour plot (_00-24.png) and by five 6-hour plots which overlap by one hour (_00-06.png, _05-11.png, etc.).

Development of the Long-Term ST Database was an inter-institutional and international effort coordinated by Prof. Franc Jager of the Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia. Other investigators include: Roman Dorn, PhD, and Ales Smrdel, MSc, of the Faculty of Computer and Information Science, Ljubljana; Dr. Gorazd Antolic of the University Medical Center, Ljubljana, Slovenia; Drs. Alessandro Taddei and Michele Emdin of the CNR Institute for Clinical Physiology (the creators of the European ST-T Database), Pisa, and Prof. Carlo Marchesi of the University of Firenze, Firenze, Italy; and Dr. Roger Mark and George Moody of the Massachusetts Institute of Technology (the creators of the MIT-BIH Arrhythmia Database), Cambridge, MA, USA, and the Beth Israel Deaconess Medical Center, Boston, MA, USA. The project was supported by Medtronic, Inc. (Minneapolis, MN, USA) and Zymed, Inc. (Camarillo, CA, USA). Development of the Long-Term ST Database began in 1995 and was completed in 2002. We thank all who contributed to this project; further details are here.

Several sources contributed recordings to the Long-Term ST Database:

The annotation of the Long-Term ST Database was performed using SEMIA, a program written by the group in Ljubljana for this purpose. SEMIA provides an interactive graphical user interface to a semi-automated algorithm for measurement of ST levels. Sources for SEMIA, and a precompiled version for GNU/Linux, are available here (as individual files), and as a gzip-compressed tar archive.

Each recording was reviewed independently by expert annotators using SEMIA at each of the three sites (Ljubljana, Pisa, and Cambridge). Participants met several times annually to obtain the consensus reference annotations.

A series of SEMIA screenshots illustrates the annotation process.

The first task faced by the expert annotators was to mark the locations of the PQ junction (the isoelectric level) and the J point, based on 16-second averaged cardiac cycles chosen at frequent intervals throughout the recordings. These marks serve as guideposts for the automated ST level measurement algorithm that performs the next step. The experts then examine the time series of ST level measurements in order to locate and to mark a set of local reference points (marked as LR in the upper panel of the figure). These are used to construct a piecewise linear baseline ST level function, which may vary over time as a result of body position changes or other factors unrelated to ischemia, especially in subjects with prior myocardial infarctions. Axis shifts reflect body position changes, and are usually most apparent in the QRS complexes (note the changes in the QRS principal components, KL1 - KL5, in the lower panel of the figure). By contrast, when ischemic ST changes occur, they are most apparent in the principal components of the ST segment (see the lower panel in this screenshot). Local references are placed before and after each such episode, and the episodes are annotated next. During this process, the expert annotators have the option of viewing either the ST level time series or the ST deviation time series (formed by subtracting the baseline ST level function from the uncorrected ST level time series), as shown in the upper panels of the two screenshots. For further details, see reference 4 below.
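The relationship between the ST level time series, the local reference points, and the ST deviation time series can be sketched as below. This is a minimal illustration, not SEMIA's implementation: the function names are hypothetical, the reference times are assumed to be strictly increasing, and holding the baseline constant before the first and after the last reference point is an assumption made here for simplicity.

```python
from bisect import bisect_right


def baseline_at(t, ref_times, ref_levels):
    """Piecewise linear baseline ST level through the local reference points.

    ref_times must be strictly increasing. Outside the reference points the
    baseline is held constant (an assumption of this sketch).
    """
    if t <= ref_times[0]:
        return ref_levels[0]
    if t >= ref_times[-1]:
        return ref_levels[-1]
    k = bisect_right(ref_times, t)  # ref_times[k - 1] <= t < ref_times[k]
    t0, t1 = ref_times[k - 1], ref_times[k]
    y0, y1 = ref_levels[k - 1], ref_levels[k]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def st_deviation(times, st_levels, ref_times, ref_levels):
    """Subtract the baseline ST level function from the ST level series."""
    return [level - baseline_at(t, ref_times, ref_levels)
            for t, level in zip(times, st_levels)]


if __name__ == "__main__":
    # Illustrative values: times in seconds, levels in microvolts.
    ref_times, ref_levels = [0, 100], [0, 50]
    times, levels = [0, 50, 100], [10, 125, 50]
    print(st_deviation(times, levels, ref_times, ref_levels))
```

In this toy example the baseline drifts from 0 to 50 µV, so a raw ST level of 125 µV at the midpoint corresponds to a deviation of 100 µV once the drift is removed.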

Software for producing printed documentation of the Long-Term ST Database is available for Linux or Unix. The software produces compact trend plots of the ST level and ST deviation time series, with indicators of ischemic and non-ischemic ST episodes.

Updates

Franc Jager and Miha Amon have contributed additional sets of time series computed from the ST segments of each normal and non-noisy beat in the database. In each case, they provided time series computed separately for each ECG lead.

Derivation of the Legendre orthonormal-transform normalized and non-normalized coefficient time series, derivation of new single-lead KL basis functions for the ST segments, and derivation of normalized and non-normalized KL coefficient time series are described in reference 5 below.

The kl-single and kl-single-uncentralized projects use different techniques (time-domain and KL-based, respectively) to remove noisy heartbeats. The KL transform is therefore applied to two different covariance matrices, derived from two different sets of ST sections, which results in two slightly different sets of basis functions. More importantly, only the kl-single coefficients are subsequently centralized by their mean values.

Contacts

For further information, please contact:

Franc Jager
Laboratory of Biomedical Computer Systems and Imaging
University of Ljubljana
Faculty of Computer and Information Science
Trzaska 25
1000 Ljubljana, Slovenia
E-mail: franc.jager@fri.uni-lj.si

Alessandro Taddei
National Research Council (CNR)
Institute of Clinical Physiology
Via Moruzzi 1
56124 Pisa, Italy
E-mail: taddei@ifc.cnr.it

George B. Moody
Harvard-MIT Division of Health Sciences and Technology
Massachusetts Institute of Technology, Room E25-505A
77 Massachusetts Avenue
Cambridge, MA 02139 USA
E-mail: george@mit.edu

Additional References

  1. Franc Jager, George B. Moody, Alessandro Taddei, Gorazd Antolic, Mitja Zabukovec, Maja Skrjanc, Michele Emdin, and Roger G. Mark. Development of a Long-Term Database for Assessing the Performance of Transient Ischemia Detectors. Computers in Cardiology 1996, pp. 481-484, IEEE Press. ISSN 0276-6547.
  2. Franc Jager, George B. Moody, Alessandro Taddei, Gorazd Antolic, Ales Smrdel, Boris Glavic, Michele Emdin, Carlo Marchesi, and Roger G. Mark. Research Resources for Development and Evaluation of Transient Ischemia Detectors. Proc. Computer-Aided Data Analysis in Medicine (CADAM 98), Informatica Medica Slovenica, 5(1,2):45-54, 1998. ISSN 1318-2129.
  3. Franc Jager, George B. Moody, Alessandro Taddei, Gorazd Antolic, Michele Emdin, Ales Smrdel, Boris Glavic, Carlo Marchesi, and Roger G. Mark. A Long-Term ST Database for Development and Evaluation of Ischemia Detectors. Computers in Cardiology 1998, pp. 301-304, IEEE Press. ISSN 0276-6547.
  4. Franc Jager, Alessandro Taddei, Michele Emdin, Gorazd Antolic, Roman Dorn, George B. Moody, Boris Glavic, Ales Smrdel, M Varanini, Mitja Zabukovec, Simone Bordigiago, Carlo Marchesi, and Roger G. Mark. The Long-Term ST Database: A Research Resource for Algorithm Development and Physiologic Studies of Transient Myocardial Ischemia. Computers in Cardiology 2000, pp. 841-844.
  5. Miha Amon. Robustno ocenjevanje oblik elektrokardiograma z uporabo ortogonalnih transformacij. [Robust estimation of morphologic features and shape representation of electrocardiograms using orthogonal transforms; in Slovene, includes English abstract.] MSc Thesis, 2011, Faculty of Computer and Information Science, University of Ljubljana, Slovenia.
  6. Miha Amon, Franc Jager. Electrocardiogram ST-Segment Morphology Delineation Method Using Orthogonal Transformations. PLOS One, February 10, 2016. DOI: 10.1371/journal.pone.0148814.