This code is used in the sample submission of the 2016 PhysioNet/CinC Challenge.
1 July 2017: The journal Physiological Measurement has published a special issue on “Recent advances in heart sound analysis” which includes extened articles and updates on many of the top scoring entries in the competition, and an extensive analysis of the Logistic Regression-HSMM-based Heart Sound Segmentation software package, available on this page.
A detailed description of this segmentation algorithm and its application can be found in:
Please cite the above publications when referencing this material, and also include the standard citation for PhysioNet:
Background
Heart sound segmentation refers to the detection of the exact positions of the first (S1) and second (S2) heart sounds in a heart sound recording or phonocardiogram (PCG). These sounds originate at the beginning of mechanical systole and diastole respectively [1], with S1 occuring immediately after the R-peak (ventricular depolarisation) of the electrocardiogram (ECG), while S2 occurs at approximately at the end-T-wave of the ECG (the end of ventricular depolarisation) [2] (See Figure 1).
Heart sound segmentation is an essential step in the automatic analysis of heart sound recordings, as it allows the analysis of the periods between these sounds for the presence of clicks and murmurs; sounds often indicative of pathology [3]. Segmentation becomes a difficult task when PCG recordings are corrupted by in-band noise. Common noise sources include endogenous or ambient speech, motion artefacts and physiological sounds such as intestinal and breathing sounds. Other physiological sounds of interest, such as murmurs, clicks, splitting of the heart sounds and additional S3 and S4 sounds, can also complicate the identification of the S1 and S2 sounds. For example, Figure 1 shows a PCG (alongside an ECG) with a loud mid-systolic click indicative of mitral valve prolapse, which could be confused for S1 or S2.
Figure 1: Example of an ECG-labelled PCG, with the ECG, PCG and four states of the heart cycle (S1, systole, S2 and diastole) shown. A mid-systolic click can be seen in each of the heart cycles, labelled as "m". The R-peak and end-T-wave are labelled as references for defining the approximate positions of S1 and S2 respectively.
While threshold-based segmentation methods have shown modest success, probabilistic models, such as hidden Markov models (HMMs), have recently been shown to surpass the capabilities of previous methods. Segmentation performance is further improved when a priori information about the expected duration of the states is incorporated into the model, such as in a hidden semi-Markov model (HSMM) [4]. An excellent introduction to HMMs can be found in [5].
The code on this page extends the work of [4] by implementing such a HSMM for segmentation, but extended with the use of logistic regression for emission probability estimation which was found to significantly improve segmentation accuracy. In addition, we implement a modified Viterbi algorithm for decoding the most-likely sequence of states. This modified Viterbi algorithm overcomes the limitation of the standard duration-dependant Viterbi algorithm where the start and end of detected states have to coincide with the start and end of the PCG recording. More details can be found in the paper referenced above.
Explanation of the code
The code at the bottom of this page can be used to both train and execute the HSMM-based segmentation of PCGs. The step-by-step operations of the code are explained below.
Training the HSMM
The HSMM training is performed within the trainSpringerSegmentationAlgorithm.m
file. Within this file, the basic steps are:
- Extract the features from the training set of recordings (
getSpringerPCGFeatures.m
). These include the:- Homomorphic envelope
- Hilbert envelope
- Power spectral density envelope
- Discrete wavelet transform envelope
- Label the PCGs based on the supplied R-peak and end-T-wave locations (
labelPCGStates.m
). - Train the three parameter matrices for the HSMM (
trainBandPiMatricesSpringer.m
).
Running the trained HSMM
The three trained parameter matrices from above are used to segment a PCG in runSpringerSegmentationAlgorithm.m
. The basic operations of the segmentation are:
- Extract the features from a test set recording (
getSpringerPCGFeatures.m
). - Estimate the heart rate based on the autocorrelation (
getHeartRateSchmidt.m
). - Use the extended duration-dependant Viterbi algorithm to derive the most likely sequence of states from the PCG (
lviterbiDecodePCG_Springer.m
).- The duration distributions for the four states used in the Viterbi algorithm are computed in
get_duration_distributions.m
.
- The duration distributions for the four states used in the Viterbi algorithm are computed in
An example of the output of runSpringerSegmentationAlgorithm.m
is shown in blue at the top of Figure 2.
Important options for the code are stored in default_Springer_HSMM_options.m
. These include:
- The original sampling frequency of the PCG signals.
- The sampling frequency to which to downsample the segmentation features.
- Whether
.mex
code should be used for the Viterbi algorithm. - Whether wavelet features should be used (as this toolbox may be missing from your Matlab installation).
If you set springer_options.use_mex = true;
in the options file, please be sure to create the MEX executable from the source file by running: mex viterbi_Springer.c
before running the code.
Figure 2: Example of a segmented PCG using the code on this page. The four HSMM-detected states of the heart cycle (S1, systole, S2 and diastole) are shown at the top of the plot in blue - this is the output of the runSpringerSegmentationAlgorithm.m
file. The red dotted line shows the labelled states, derived from the R-peak and end-T-wave positions in the labelPCGStates.m
file. These labelled states are only used for training the algorithm. The ECG is shown in red at the bottom of the plot, with the positions of the R-peaks (*) and the end-T-waves (o) indicated. Only the PCG is used to derive the detected states in blue.
Quick start guide and example usage
An example of how the model is first trained on a small example set of data, and then used to segment a test recording is given in run_Example_Springer_Script.m
. This should run out-of-the-box (provided you have set the default_Springer_HSMM_options.m
correctly). If you set springer_options.use_mex = true;
in the options file, please be sure to create the MEX executable from the source file by running: mex viterbi_Springer.c
before running the code.
This code loads a few records from the example data. Alongside each recording are the R-peak and end-T-wave annotations which are used to label the locations of the heart sounds in the training set.
The HSMM is trained on five of the six example recordings, and applied to the sixth. Running this script should produce a plot similar to that shown in Figure 3.
Figure 3: The output when running run_Example_Springer_Script.m
. This shows the PCG signal in black, with the HSMM derived states shown in red.
The code in run_Example_Springer_Script.m
will give you a clear example of how to apply these method to new datasets. Tested April 4 2016.
Example Data
The example_data.mat
file contains the full dataset from the paper mentioned at the top of this page. Included in this .mat file is a struct with the audio files, annotations, binary diagnosis (1 = presence of pathology) and the "patient number" for each record.
The example_data.example_audio_data
is sampled at 1000 Hz, and contains recordings from various locations on the chest. While there are 135 patients, there are 792 recordings. This is due to multiple recordings per patient, as well as some recordings being split at positions of inconsistent annotations.
The example_data.example_annotations
are the positions of the R-peak (first column of the cell array) and the end-T-wave positions (second column of the cell array), sampled at 50 Hz. As explained in the associated publication, these annotations are computed based on the agreement between five different automatic R-peak and end-T-wave detectors. They are not human-derived annotations.
The example_data.binary_diagnosis
cell array shows which recordings were from subjects with no pathological heart damage as assessed by echocardiography (binary_diagnosis=0), and those with some sort of heart pathology (most commonly mitral valve prolapse - binary_diagnosis = 1).
The example_data.patient_number
cell array indicates which subject each recording was captured from. Again, there are multiple recordings per patient due to multiple auscultation positions and splitting of recordings due to inconsistent annotations.
References
- Douglas, G., Nicol, E.F., and Robertson, C., Macleod’s Clinical Examination. Elsevier Health Sciences, 2009.
- Tilkian, A.G. and Conover, M.B., Understanding heart sounds and murmurs: with an introduction to lung sounds. Elsevier Health Sciences, 2001.
- Leatham, A. Auscultation of the heart and phonocardiography. Churchill Livingstone: 1975.
- Schmidt, S.E.; Holst-Hansen, C.; Graff, C.; Toft, E.; Struijk, J.J. Segmentation of heart sound recordings by a duration-dependent hidden markov model. Physiol Meas 2010, 31, 513-529.
- Rabiner, L., "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
Name Last modified Size Description
Parent Directory - Figure1.png 07-Apr-2016 01:09 78K Figure2.png 07-Apr-2016 01:11 84K Figure3.png 07-Apr-2016 01:12 79K Hilbert_Envelope.m 04-Apr-2016 21:50 1.9K Homomorphic_Envelope_with_Hilbert.m 04-Apr-2016 21:50 3.4K Liu_HSMM_Heartsound_Algo_IOP_July_2017.pdf 06-Sep-2017 18:43 1.1M SpringerSegmentationCode.zip 07-Apr-2016 15:21 73M butterworth_high_pass_filter.m 04-Apr-2016 21:50 2.8K butterworth_low_pass_filter.m 04-Apr-2016 21:50 2.8K default_Springer_HSMM_options.m 04-Apr-2016 21:50 1.6K example_data.mat 04-Apr-2016 21:50 73M expand_qt.m 04-Apr-2016 21:50 2.2K fullsizedfigures/ 07-Apr-2016 01:07 - getDWT.m 04-Apr-2016 21:50 2.5K getHeartRateSchmidt.m 04-Apr-2016 21:50 3.7K getSpringerPCGFeatures.m 04-Apr-2016 21:50 4.1K get_PSD_feature_Springer_HMM.m 04-Apr-2016 21:50 2.5K get_duration_distributions.m 04-Apr-2016 21:50 3.7K labelPCGStates.m 04-Apr-2016 21:50 6.9K normalise_signal.m 04-Apr-2016 21:50 1.4K runSpringerSegmentationAlgorithm.m 04-Apr-2016 21:50 2.7K run_Example_Springer_Script.m 04-Apr-2016 21:50 2.2K schmidt_spike_removal.m 04-Apr-2016 21:50 4.3K trainBandPiMatricesSpringer.m 04-Apr-2016 21:50 5.9K trainSpringerSegmentationAlgorithm.m 04-Apr-2016 21:50 3.7K viterbiDecodePCG_Springer.m 04-Apr-2016 21:50 15K viterbi_Springer.c 04-Apr-2016 21:50 13K local.css 07-Apr-2016 01:06 3.1K C# source file COPYING 07-Apr-2016 00:39 34K GNU General Public License
If you would like help understanding, using, or downloading content, please see our Frequently Asked Questions. If you have any comments, feedback, or particular questions regarding this page, please send them to the webmaster. Comments and issues can also be raised on PhysioNet's GitHub page. Updated Friday, 28-Oct-2016 22:58:42 CEST |
PhysioNet is supported by the National Institute of General Medical Sciences (NIGMS) and the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number 2R01GM104987-09.
|