Which Mathematical and Physiological Formulas are Describing Voice Pathology: An Overview | OMICS International
ISSN: 2329-9126
Journal of General Practice

Which Mathematical and Physiological Formulas are Describing Voice Pathology: An Overview

Pedersen M*, Jønsson A, Mahmood S and Alexius Agersted A

The Medical Center, Oestergade 18, 1. 1100 Copenhagen, Denmark

*Corresponding Author:
Pedersen M
MD. M. Eeg Cand. Stat
The Medical Center
Oestergade 18, 1. 1100 Copenhagen, Denmark
Tel: +4531126184
E-mail: [email protected]

Received date: April 19, 2016; Accepted date: May 20, 2016; Published date: May 27, 2016

Citation: Pedersen M, Jønsson A, Mahmood S, Agersted A A (2016) Which Mathematical and Physiological Formulas are Describing Voice Pathology: An Overview. J Gen Pract 4: 253. doi: 10.4172/2329-9126.1000253

Copyright: © 2016 Pedersen M, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



Abstract

This study examines changes in quantifiable parameters of voice production by comparing normal voices with the voices of patients who had complained of hoarseness for more than two weeks. Acoustical signals and high-speed films served as data sources for the mathematical and physiological formulas and for the statistics on the voices. The software “Glottis Analysis Tools” (Erlangen, Germany) provides acoustical measurements as well as Glottal Area Waveforms (GAW) and Phonovibrograms (PVG) derived from high-speed film data. The films were captured with a high-speed camera and software from Wolf Ltd, Germany. We present the parameters that differed with statistical significance between 12 healthy voices and 12 patients with complaints of hoarseness in a prospective case/control study. The commonly used acoustical and physiological parameters showed hardly any statistical difference between the two groups. This suggests that the evidence for physiological and acoustical measures of voice pathology is insufficient. Future work should focus on newer methods and on tissue function.


Keywords: Voice pathology; Phonovibrograms; Acoustical signals


Introduction

The inclusion criterion for this study was a complaint of hoarseness lasting more than 2 weeks. The goal was to provide quantifiable protocols for determining whether a voice is pathological. The following references show how far the evidence in this field of research has come:

A Cochrane review set out to assess the effectiveness of surgical versus non-surgical interventions for vocal cord nodules, which are also diagnosed with acoustical measures and physiological voice diagnostics. No suitable trials were identified: no studies fulfilled the inclusion criteria of hoarseness and vocal nodules. The review concluded that high-quality randomized controlled trials are needed to evaluate the effectiveness of surgical and non-surgical treatment of vocal cord nodules [1].

Another study determined the reliability of objective voice measures commonly used in clinical practice, applied to the normal speaking voices of 18 healthy volunteers (nine males and nine females). Laryngeal efficiency and perturbation measures of fundamental frequency (F0) were obtained for both genders. For females, cepstral peak prominence (CPP) had moderate reliability, whereas for males the smoothed CPP was reliable. The noise-to-harmonic ratio (NHR) had the lowest consistency of all measures over the course of the study. The authors concluded that additional research is needed to investigate which factors within the testing protocol and/or changes to the measurement instruments may lead to more consistent test results [2].

A further review focused on evidence-based clinical voice assessment. Its goal was to determine what research evidence exists to support the use of voice measures in the clinical assessment of patients with voice disorders. The literature provides some evidence for selected acoustic, laryngeal imaging-based, auditory-perceptual, functional, and aerodynamic measures as components of a clinical voice evaluation. The authors found a pressing need for high-quality research specifically designed to expand the evidence base for clinical voice assessment [3].

We therefore compared normal persons with patients complaining of hoarseness in a prospective case-control study, in order to evaluate the possible validity of acoustical and video-derived physiological measurements as a basis for more evidence-based approaches to voice pathology.


Materials and Methods

The prospective case-control study included 12 normal persons without voice complaints and 12 patients with hoarseness for more than 2 weeks. High-speed films were recorded with the Wolf Ltd equipment, and the “Glottis Analysis Tools” program was run on the recordings of all 24 participants (Table 1) using the combined hardware and software. Each of the 24 participants had a data set of 345 parameters, which formed our statistical material.

Source: Audio
APF(%), APQ-11(%), APQ-3(%), APQ-5(%), AVI, CHNR-v1(dB), CHNR-v2(dB), CPP(dB), Cycle-duration(ms), EPF(%), EPQ-11(%), EPQ-3(%), EPQ-5(%), Fundamental-Freq(Hz), GNE, Harmonics-Intensity(%), HNR(dB), Jitt(%), Jitt-Factor, Jitt-Ratio, max-Harmonic(Hz), max-WMC, mean-Jitt(ms), mean-Shim(dB), mean-WMC, min-Subharmonic(Hz), NNE(dB), PPF(%), PPQ-11(%), PPQ-3(%), PPQ-5(%), PVI, RAP-v1, RAP-v2, Shim(%), SNR-v1(dB), SNR-v2(dB), Spectral-Flatness(SFM)

Source: GAW
Amplitude-Length-Ratio, Amplitude-Periodicity, Amplitude-Quotient, Amplitude-Symmetry*, Amplitude-Symmetry-Index, APF(%), APQ-11(%), APQ-3(%), APQ-5(%), Asymmetrie-Quotient, AVI, CHNR-v1(dB), CHNR-v2(dB), Closing-Quotient(ClQ), CPP(dB), Cycle-duration(ms), DynamicRange-Symmetry*, DynamicRange-Symmetry-Index, EPF(%), EPQ-11(%), EPQ-3(%), EPQ-5(%), Fundamental-Freq(Hz), Glottal-Area-Index(AC/OQ), Glottis-Gap-Index(GGI), GNE, Harmonics-Intensity(%), HNR(dB), Jitt(%), Jitt-Factor, Jitt-Ratio, max-Harmonic(Hz), Maximum-Area-Declination-Rate, max-WMC, mean-Jitt(ms), mean-Shim(dB), mean-WMC, min-Subharmonic(Hz), NNE(dB), Open-Quotient(OQ), Peak-Acceleration, Peak-Closing-Velocity, Phase-Asymmetry*, Phase-Asymmetry-Index, Plateau-Quotient(PQ), PPF(%), PPQ-11(%), PPQ-3(%), PPQ-5(%), PVI, RAP-v1, RAP-v2, Rate-Quotient(RQ), Shim(%), SNR-v1(dB), SNR-v2(dB), Spatial-Symmetry*, Spatial-Symmetry-Index, Spectral-Flatness(SFM), Speed-Index(SI), Speed-Quotient(SQ), Stiffness, Time-Periodicity, Waveform-Symmetry-Index

Source: Phonovibrogram (PVG)
ContourAngles-Symmetry*, ContourAngles-Symmetry-Index, Contour-Angle(DEG)

Source: Trajectories 50%
Amplitude-Symmetry*, Amplitude-Symmetry-Index, DynamicRange-Symmetry*, DynamicRange-Symmetry-Index, Phase-Asymmetry*, Phase-Asymmetry-Index, Waveform-Symmetry-Index, Amplitude-Length-Ratio, Amplitude-Periodicity, Amplitude-Quotient, APF(%), APQ-11(%), APQ-3(%), APQ-5(%), Asymmetrie-Quotient, AVI, CHNR-v1(dB), CHNR-v2(dB), Closing-Quotient(ClQ), CPP(dB), Cycle-duration(ms), EPF(%), EPQ-11(%), EPQ-3(%), EPQ-5(%), Fundamental-Freq(Hz), Glottal-Area-Index(AC/OQ), Glottis-Gap-Index(GGI), GNE, Harmonics-Intensity(%), HNR(dB), Jitt(%), Jitt-Factor, Jitt-Ratio, max-Harmonic(Hz), Maximum-Area-Declination-Rate, max-WMC, mean-Jitt(ms), mean-Shim(dB), mean-WMC, min-Subharmonic(Hz), NNE(dB), Open-Quotient(OQ), Peak-Acceleration, Peak-Closing-Velocity, Plateau-Quotient(PQ), PPF(%), PPQ-11(%), PPQ-3(%), PPQ-5(%), PVI, RAP-v1, RAP-v2, Rate-Quotient(RQ), Shim(%), SNR-v1(dB), SNR-v2(dB), Spectral-Flatness(SFM), Speed-Index(SI), Speed-Quotient(SQ), Stiffness, Time-Periodicity

Table 1: Overview of some of the parameters measured in “Glottis Analysis Tools” for 12 normal persons compared with 12 patients with complaints of hoarseness for more than two weeks [4,5].


The study focused on mathematical and physiological formulas applied to high-quality high-speed films recorded at 4000 frames per second (Wolf Ltd., Germany) and analyzed with the advanced software “Glottis Analysis Tools” (Erlangen, Germany). The software includes acoustical measurements and the following physiological data sources: Glottal Area Waveform (Figure 1), Trajectory-50% (Figure 2) and Phonovibrograms (Figure 3). The microphone that acquires the acoustical signal (Wolf Ltd., Germany) is attached to the scope. Table 1 gives an overview of the quantitative parameters. Because the lack of evidence for acoustical formulas is important, we present and discuss some of the formulas and their data sources in this study. Many of the formulas are applied to several data sources (glottal area waveform, trajectories 50% or acoustical measures), because all of these signals are close to sinusoidal and can be analyzed in the same way. This includes jitter and shimmer.


Figure 1: Glottal Area Waveform presentation. “Space curves”: the area between the vocal folds is calculated and plotted as a curve. The curves switch between green and blue to indicate different cycles of vocal fold movement in the software system from Erlangen, Germany.
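As a rough illustration of how a glottal area waveform can be obtained, the sketch below counts the glottis pixels in each frame of an already segmented film. It is a minimal sketch under stated assumptions, not the Glottis Analysis Tools implementation: the binary masks, the function names `glottal_area_waveform` and `frame_times_ms`, and the toy frames are hypothetical. Only the frame rate of 4000 frames per second comes from the study.

```python
from typing import List, Sequence

FRAME_RATE_HZ = 4000  # recording rate stated in the study


def glottal_area_waveform(masks: Sequence[Sequence[Sequence[int]]]) -> List[int]:
    """Return the glottal area (pixel count) for each frame.

    Each mask is a 2-D grid in which 1 marks a pixel inside the glottis
    (the gap between the vocal folds) and 0 marks everything else.
    """
    return [sum(sum(row) for row in mask) for mask in masks]


def frame_times_ms(n_frames: int, frame_rate_hz: float = FRAME_RATE_HZ) -> List[float]:
    """Time stamp of each frame in milliseconds."""
    return [1000.0 * i / frame_rate_hz for i in range(n_frames)]


# Hypothetical 3x3 frames: closed, partly open and fully open glottis.
closed = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
partly = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
opened = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]

gaw = glottal_area_waveform([closed, partly, opened, partly, closed])
times = frame_times_ms(len(gaw))
print(gaw)    # [0, 2, 5, 2, 0]
print(times)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Plotting `gaw` against `times` gives one opening and closing cycle of the kind shown in the figure.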


Figure 2: Trajectories ("Quantitative kymography")


Figure 3: High speed films with phonovibrogram of single movements of the right and left vocal folds. Phonovibrogram of a contest winning female, showing the regularity of single movement of the right and left vocal folds.


The image is an electronic representation of the rima glottidis. The dark blue line defines the left vocal fold. The red line delimits the right vocal fold. The blue dotted line in the middle is the center line between the vocal folds. The vocal fold movements are calculated from this line.

The left chart illustrates a computed cycle. The dark blue curve is the left vocal fold fluctuation, and the red curve is the right vocal fold fluctuation.

50% indicates that the chart depicts the vocal folds at 50% of the distance from the posterior limit (and therefore also 50% of the distance from the anterior limit); this is the trajectory-50%.

The purple line in the computed image indicates where Traj-50% reads its values.

Examples and formulas from Table 1 are given:

Cepstral harmonics-to-noise ratio (CHNR)


Cepstral peak prominence

CPP (dB) is defined as the difference in amplitude between the cepstral peak and the corresponding value on the regression line computed between 1 ms and the maximum quefrency (i.e., the predicted cepstral magnitude for the quefrency at the cepstral peak) [4].
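The definition above can be sketched in a few lines once a cepstrum is available. The example assumes the cepstrum has already been computed and is given as two lists (quefrency in ms and magnitude in dB); the function name `cepstral_peak_prominence`, the choice of the global maximum as the cepstral peak, and the toy data are assumptions for illustration and are not taken from Glottis Analysis Tools.

```python
from typing import Sequence


def cepstral_peak_prominence(quefrency_ms: Sequence[float],
                             cepstrum_db: Sequence[float],
                             min_quefrency_ms: float = 1.0) -> float:
    """CPP in dB: cepstral peak minus the regression line at the peak."""
    # Keep only the cepstrum from 1 ms up to the maximum quefrency.
    pts = [(q, c) for q, c in zip(quefrency_ms, cepstrum_db)
           if q >= min_quefrency_ms]
    n = len(pts)
    mean_q = sum(q for q, _ in pts) / n
    mean_c = sum(c for _, c in pts) / n
    # Ordinary least-squares line c = intercept + slope * q.
    sxx = sum((q - mean_q) ** 2 for q, _ in pts)
    sxy = sum((q - mean_q) * (c - mean_c) for q, c in pts)
    slope = sxy / sxx
    intercept = mean_c - slope * mean_q
    # Cepstral peak (assumed here: the global maximum in the range)
    # and the value the line predicts at the same quefrency.
    peak_q, peak_c = max(pts, key=lambda p: p[1])
    return peak_c - (intercept + slope * peak_q)


# Hypothetical cepstrum: a falling straight line with one 5 dB spike at 5 ms.
quefrency = [float(q) for q in range(1, 21)]
cepstrum = [10.0 - 0.2 * q for q in quefrency]
cepstrum[4] += 5.0

cpp = cepstral_peak_prominence(quefrency, cepstrum)
print(round(cpp, 2))  # 4.52
```

The result is slightly below 5 dB because the spike itself pulls the regression line upwards, which is a property of fitting the line to the whole range.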

Contour angles of phonovibrograms (PVG)

Contour-Angles (deg) are calculated in both the anterior and posterior parts during opening as well as closing of the vocal folds, for the left and right side of the PVG respectively. Hence, $CA_i^{side,\,Item}$ denotes the Contour-Angle for the ith cycle, where side represents the corresponding side of the PVG: L for the left side and R for the right side. Item signifies the position of the related Contour-Angle: OA: Opening – Anterior, OP: Opening – Posterior, CA: Closing – Anterior and CP: Closing – Posterior.

Energy perturbation quotient 5% & 11%


Here k is the number of cycles considered when computing the quotient: k = 3 gives EPQ-3 (%), k = 5 gives EPQ-5 (%) and k = 11 gives EPQ-11 (%). Furthermore, E(i) is the signal energy within the ith cycle and N is the number of analyzed cycles (equivalent to the number of elements in E).
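To make the role of k concrete, the sketch below uses the widely used k-point perturbation quotient: each cycle value is compared with the mean of the k cycles centred on it, and the average absolute deviation is expressed as a percentage of the overall mean. Whether Glottis Analysis Tools uses exactly this form for EPQ is an assumption; the function name `perturbation_quotient` and the energy values are hypothetical.

```python
from typing import Sequence


def perturbation_quotient(values: Sequence[float], k: int) -> float:
    """k-point perturbation quotient in percent (k odd, e.g. 3, 5 or 11)."""
    n = len(values)
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd number")
    if n < k:
        raise ValueError("need at least k cycles")
    half = k // 2
    deviations = []
    # Slide a window of k cycles over the sequence and compare the
    # centre cycle with the mean of its window.
    for centre in range(half, n - half):
        window = values[centre - half:centre + half + 1]
        deviations.append(abs(values[centre] - sum(window) / k))
    mean_deviation = sum(deviations) / len(deviations)
    mean_value = sum(values) / n
    return 100.0 * mean_deviation / mean_value


# Hypothetical per-cycle energies E(i), N = 5 cycles.
energies = [10.0, 12.0, 10.0, 12.0, 10.0]
epq3 = perturbation_quotient(energies, k=3)
print(round(epq3, 2))  # 12.35
```

The same function applied to cycle durations or cycle amplitudes would give period and amplitude perturbation quotients of the same k.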

In Glottis Analysis Tools the following energy-related parameters are calculated:

Harmonics intensity


These measures can be calculated for the following signals: glottal area waveform (GAW), acoustics and glottal trajectories. Furthermore, F(k) is the kth coefficient of the Fourier transform of the signal (k = 0 is the DC component) and C(k) is the kth cepstrum coefficient.


ω0 is the index of the Fourier coefficient that represents the fundamental frequency (f0), Hmax is the maximum order of harmonics of f0, and ωmin is the index of the Fourier coefficient that represents the minimum occurring subharmonic of f0.
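The relation between a Fourier coefficient index and a frequency in Hz can be illustrated with a small discrete Fourier transform. This is a sketch under assumptions: the signal is synthetic, the sampling rate of 4000 Hz mirrors the film frame rate, and taking the strongest non-DC coefficient as ω0 is a simplification chosen for the example, not the rule used by the software.

```python
import cmath
import math
from typing import List, Sequence


def dft_magnitudes(signal: Sequence[float]) -> List[float]:
    """Magnitudes |F(k)| for k = 0 .. N/2 (naive DFT, fine for short signals)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def fundamental_index(magnitudes: Sequence[float]) -> int:
    """Index of the strongest coefficient, skipping k = 0 (the DC component)."""
    return max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])


fs = 4000.0  # samples per second (assumed equal to the film frame rate)
n = 200      # number of samples analyzed
# Synthetic signal: DC offset, a 200 Hz fundamental and a weaker 400 Hz harmonic.
signal = [1.0
          + math.sin(2 * math.pi * 200 * t / fs)
          + 0.3 * math.sin(2 * math.pi * 400 * t / fs)
          for t in range(n)]

mags = dft_magnitudes(signal)
k0 = fundamental_index(mags)
f0 = k0 * fs / n  # index-to-frequency conversion
print(k0, f0)     # 10 200.0
```

The second harmonic then sits at index 2·k0, and a subharmonic at half the fundamental would sit at index k0/2.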

Normalized noise energy


Signal-to-noise ratio


Spectral flatness


Minimum subharmonics

Min-Subharmonic (Hz) is the minimum occurring subharmonic frequency in Hz (the fundamental frequency is a multiple of this frequency).

Further formulas are presented for the following reasons: Jitter % and Shimmer % because they are commonly used, Stiffness because it might be of interest in singers, and Amplitude symmetry index because earlier analyses showed signs of significance.

Jitter %


Shimmer, mean & shimmer%


Shimmer is the variation in strength (amplitude), measured at the maximum amplitude of all measuring points. A(i) is the dynamic range (max – min) of the ith cycle and N is the number of analyzed cycles (equivalent to the number of elements in A(i)).
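Jitter % and shimmer % are commonly computed in the same way: the mean absolute difference between consecutive cycles, divided by the mean over all cycles. The sketch below follows that common textbook form, applied to cycle durations for jitter and to the dynamic ranges A(i) for shimmer. That Glottis Analysis Tools uses exactly this normalization is an assumption, and the function name and numbers are hypothetical.

```python
from typing import Sequence


def relative_perturbation_percent(values: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle difference as a percentage of the mean."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two cycles")
    mean_abs_diff = sum(abs(values[i + 1] - values[i])
                        for i in range(n - 1)) / (n - 1)
    mean_value = sum(values) / n
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical cycle durations in ms (jitter) and dynamic ranges A(i) (shimmer).
durations_ms = [5.0, 5.1, 4.9, 5.0]
amplitudes = [1.0, 0.9, 1.1, 1.0]

jitter_percent = relative_perturbation_percent(durations_ms)
shimmer_percent = relative_perturbation_percent(amplitudes)
print(round(jitter_percent, 2))   # 2.67
print(round(shimmer_percent, 2))  # 13.33
```

A perfectly regular voice signal would give 0 % for both measures, since every cycle would equal its neighbour.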

Stiffness (from data sources Glottal Area Waveform (GAW) and traj-50%)


Here Ti is the duration of the ith cycle in milliseconds (ms), Ai is the dynamic range (max – min) of the ith cycle, and s(t) is the magnitude of the first derivative of the considered signal within the ith cycle (t ∈ Ti).

Amplitude symmetry index (GAW and traj-50%)


GAi = glottal area waveform for the ith cycle, L = left side and R = right side.


Results

The given parameters (Table 1) were calculated with Glottis Analysis Tools for 12 healthy voices and 12 patients with complaints of hoarseness for more than two weeks in a prospective case/control study. Spearman correlations between the variables derived from the high-speed films and the acoustic measurements made at the same time were calculated for a total of 345 combinations. The variables derived from the high-speed films were analyzed in an analysis of variance with gender and hoarse/healthy as fixed effects. As a measure of diagnostic value, the mean difference between the hoarse and the healthy population was estimated; Table 2 shows it for the variables with the largest statistical difference. Similarly, Table 3a shows the mean difference between hoarse and healthy persons for the commonly used parameters Jitter and Shimmer, and Table 3b continues with further commonly used parameters. Figure 4 shows a scatterplot of the parameter with the largest statistical difference between hoarse and healthy persons.
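For readers unfamiliar with the Spearman correlation used here, the sketch below shows the standard construction: both variables are replaced by their ranks (ties receive the average rank) and the Pearson correlation of the ranks is taken. The study itself used SAS; this stand-alone version and its data are purely illustrative, and the variable names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values of one film-derived and one acoustic parameter.
film_measure = [1.0, 2.0, 3.0, 4.0]
audio_measure = [10.0, 20.0, 25.0, 100.0]
rho = spearman(film_measure, audio_measure)
print(round(rho, 2))  # 1.0
```

Because only ranks enter the calculation, the coefficient is 1 for any strictly increasing relation, even a strongly non-linear one as in this toy data.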

No. | Parameter | Source | Type | Mean difference healthy-hoarse | Standard Error | DF | T Value | P value
1 | Cepstral Harmonics-to-Noise Ratio-v2(dB) | [GAW] |  | 10.63 | 4.41 | 22 | 2.41 | 0.02
2 | Cepstral Harmonics-to-Noise Ratio-v2(dB) | [GAW] | [Left] | 11.89 | 4.81 | 20 | 2.47 | 0.02
3 | Cepstral Harmonics-to-Noise Ratio-v2(dB) | [GAW] | [Right] | 8.56 | 4.21 | 22 | 2.03 | 0.05
4 | Cepstral Harmonics-to-Noise Ratio-v2(dB) | [Traj-50%] | [Left] | 10 | 4.33 | 21 | 2.31 | 0.03
5 | Cepstral Peak Prominence(dB) | [GAW] | [Left] | 0.58 | 0.26 | 20 | 2.2 | 0.04
6 | Cepstral Peak Prominence(dB) | [Traj-50%] | [Right] | 0.33 | 0.17 | 22 | 2 | 0.06
7 | Contour-Angle(DEG) | [PVG] | [Left] | 10.23 | 4.3 | 20 | 2.38 | 0.03
8 | Energy Perturbation Quotient-5(%) | [Traj-50%] | [Left] | -9.06 | 3.53 | 21 | -2.56 | 0.02
9 | Harmonics-Intensity(%) | [GAW] |  | 4.1 | 1.45 | 22 | 2.83 | 0.01
10 | Harmonics-Intensity(%) | [GAW] | [Left] | 3.17 | 1.25 | 20 | 2.53 | 0.02
11 | Harmonics-Intensity(%) | [GAW] | [Right] | 3.41 | 1.41 | 22 | 2.42 | 0.02
12 | Harmonics-Intensity(%) | [Traj-50%] | [Left] | 2.8 | 1.3 | 21 | 2.16 | 0.04
13 | Normalized Noise Energy(dB) | [GAW] | [Left] | -3.38 | 1.39 | 20 | -2.42 | 0.03
14 | Period Perturbation Quotient-11(%) | [GAW] | [Left] | -1.89 | 0.84 | 19 | -2.25 | 0.04
15 | Period Perturbation Quotient-11(%) | [GAW] | [Right] | -2.17 | 0.93 | 21 | -2.33 | 0.03
16 | Signal-to-Noise Ratio-v1(dB) | [GAW] |  | 1.15 | 0.56 | 22 | 2.06 | 0.05
17 | Signal-to-Noise Ratio-v1(dB) | [GAW] | [Left] | 1.32 | 0.6 | 20 | 2.19 | 0.04
18 | Signal-to-Noise Ratio-v1(dB) | [GAW] | [Right] | 1.03 | 0.51 | 22 | 2.01 | 0.06
19 | Spectral-Flatness(SFM) | [GAW] |  | -2.74 | 1.2 | 22 | -2.28 | 0.03
20 | minimum-Subharmonic(Hz) | [GAW] |  | -81.06 | 40.25 | 22 | -2.01 | 0.06
21 | minimum-Subharmonic(Hz) | [GAW] | [Right] | -83.42 | 39.61 | 22 | -2.11 | 0.05
22 | minimum-Subharmonic(Hz) | [Traj-50%] | [Left] | -153.85 | 23.88 | 21 | -6.44 | <0.0001

Table 2: “Glottis Analysis Tools” measures analyzed in an analysis of variance estimating the mean difference between healthy and hoarse persons (adjusted for gender), in a prospective case-control study of 12 normal persons and 12 patients with complaints of hoarseness for more than two weeks.
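The columns of Table 2 are linked by a simple relation: the T value is the estimated mean difference divided by its standard error. The short check below recomputes it for three rows of the table; the helper name `t_value` is hypothetical, and the small tolerance allows for the rounding of the published figures.

```python
def t_value(estimate: float, standard_error: float) -> float:
    """Test statistic for an estimated mean difference."""
    return estimate / standard_error


# (parameter, mean difference, standard error, reported T value) from Table 2.
rows = [
    ("Cepstral Harmonics-to-Noise Ratio-v2(dB) [GAW]", 10.63, 4.41, 2.41),
    ("Harmonics-Intensity(%) [GAW]", 4.10, 1.45, 2.83),
    ("minimum-Subharmonic(Hz) [Traj-50%] [Left]", -153.85, 23.88, -6.44),
]

for name, estimate, se, reported in rows:
    computed = t_value(estimate, se)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")

checks = [abs(t_value(e, s) - r) < 0.01 for _, e, s, r in rows]
```

The large T value of the last row reflects a standard error that is small relative to the estimated difference, which is why its P value is far below those of the other rows.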

Parameter | Source | Type | Estimate | Standard Error | DF | t Value | P value
Jitt(%) | [Audio] |  | 0.31 | 3.56 | 22 | 0.09 | 0.93
Jitt(%) | [GAW] |  | -1.42 | 1.44 | 22 | -0.99 | 0.33
Jitt(%) | [GAW] | [Left] | -1.84 | 1.51 | 20 | -1.23 | 0.23
Jitt(%) | [GAW] | [Right] | -2.04 | 1.32 | 22 | -1.55 | 0.14
Jitt(%) | [Traj-50%] | [Left] | -0.74 | 1.87 | 21 | -0.39 | 0.7
Jitt(%) | [Traj-50%] | [Right] | -1.32 | 1.46 | 22 | -0.9 | 0.38
Jitt-Factor | [Audio] |  | 0.44 | 3.61 | 22 | 0.12 | 0.9
Jitt-Factor | [GAW] |  | -1.6 | 1.47 | 22 | -1.09 | 0.29
Jitt-Factor | [GAW] | [Left] | -2.03 | 1.54 | 20 | -1.32 | 0.2
Jitt-Factor | [GAW] | [Right] | -2.08 | 1.29 | 22 | -1.62 | 0.12
Jitt-Factor | [Traj-50%] | [Left] | -0.65 | 1.9 | 21 | -0.34 | 0.74
Jitt-Factor | [Traj-50%] | [Right] | -1.38 | 1.5 | 22 | -0.92 | 0.37
Jitt-Ratio | [Audio] |  | 3.1 | 35.61 | 22 | 0.09 | 0.93
Jitt-Ratio | [GAW] |  | -14.18 | 14.38 | 22 | -0.99 | 0.34
Jitt-Ratio | [GAW] | [Left] | -18.45 | 15.06 | 20 | -1.23 | 0.23
Jitt-Ratio | [GAW] | [Right] | -20.37 | 13.17 | 22 | -1.55 | 0.14
Jitt-Ratio | [Traj-50%] | [Left] | -7.37 | 18.67 | 21 | -0.39 | 0.7
Jitt-Ratio | [Traj-50%] | [Right] | -13.19 | 14.59 | 22 | -0.9 | 0.38
Shim(%) | [Audio] |  | 1.27 | 21.82 | 22 | 0.06 | 0.95
Shim(%) | [GAW] |  | -0.61 | 0.54 | 22 | -1.13 | 0.27
Shim(%) | [GAW] | [Left] | -1.21 | 0.65 | 20 | -1.86 | 0.08
Shim(%) | [GAW] | [Right] | -0.73 | 0.86 | 22 | -0.84 | 0.41
Shim(%) | [Traj-50%] | [Left] | -6.53 | 6.34 | 21 | -1.03 | 0.31
Shim(%) | [Traj-50%] | [Right] | 1.9 | 4.07 | 22 | 0.47 | 0.64

Table 3a: The commonly used parameters Jitter and Shimmer show no statistical difference in “Glottis Analysis Tools” between 12 normal persons and 12 persons with complaints of hoarseness in a prospective case-control study (SAS 9.4, F-test, adjusted for gender).

Parameter | Source | Type | Estimate | Standard Error | DF | t Value | P value
Stiffness | [GAW] |  | 0.01 | 0.02 | 20 | 0.57 | 0.57
Stiffness | [GAW] | [Left] | 0.02 | 0.03 | 18 | 0.58 | 0.57
Stiffness | [GAW] | [Right] | 0.01 | 0.03 | 20 | 0.37 | 0.72
Stiffness | [Traj-50%] | [Left] | -0.01 | 0.03 | 19 | -0.21 | 0.84
Stiffness | [Traj-50%] | [Right] | 0 | 0.03 | 20 | -0.15 | 0.88
Amplitude-Length-Ratio | [GAW] |  | -0.24 | 0.55 | 20 | -0.44 | 0.66
Amplitude-Length-Ratio | [GAW] | [Left] | -0.05 | 0.32 | 18 | -0.16 | 0.87
Amplitude-Length-Ratio | [GAW] | [Right] | -0.31 | 0.33 | 20 | -0.93 | 0.36
Amplitude-Length-Ratio | [Traj-50%] | [Left] | -0.01 | 0.01 | 19 | -0.92 | 0.37
Amplitude-Length-Ratio | [Traj-50%] | [Right] | -0.02 | 0.01 | 20 | -1.6 | 0.12
Amplitude-Periodicity | [GAW] |  | 0.03 | 0.03 | 20 | 1.16 | 0.26
Amplitude-Periodicity | [GAW] | [Left] | 0.05 | 0.03 | 18 | 1.82 | 0.09
Amplitude-Periodicity | [GAW] | [Right] | 0.03 | 0.03 | 20 | 0.98 | 0.34
Amplitude-Periodicity | [Traj-50%] | [Left] | 0.03 | 0.03 | 19 | 1.19 | 0.25
Amplitude-Periodicity | [Traj-50%] | [Right] | 0.02 | 0.03 | 20 | 0.48 | 0.63
Amplitude-Quotient | [GAW] |  | 0.11 | 0.31 | 20 | 0.35 | 0.73
Amplitude-Quotient | [GAW] | [Left] | 0.01 | 0.32 | 18 | 0.05 | 0.96
Amplitude-Quotient | [GAW] | [Right] | 0.04 | 0.35 | 20 | 0.1 | 0.92
Amplitude-Quotient | [Traj-50%] | [Left] | 0.01 | 0.26 | 19 | 0.05 | 0.96
Amplitude-Quotient | [Traj-50%] | [Right] | -0.2 | 0.29 | 20 | -0.7 | 0.49
Amplitude-Symmetry* | [GAW] |  | 0.1 | 0.13 | 20 | 0.76 | 0.46
Amplitude-Symmetry* | [Traj-50%] |  | -1316.17 | 1447.91 | 20 | -0.91 | 0.37
Amplitude-Symmetry-Index | [GAW] |  | 0.03 | 0.04 | 20 | 0.79 | 0.44
Amplitude-Symmetry-Index | [Traj-50%] |  | 0.07 | 0.07 | 20 | 1.07 | 0.3

Table 3b: Commonly used parameters, continued (SAS 9.4, F-test, adjusted for gender).


Figure 4: Minimum Subharmonics for left vocal fold presenting 12 normal persons and 12 persons complaining of hoarseness.

The first purpose was to characterize the distribution of the parameters, not to compare the two groups. To our knowledge, this study is the first of its kind to describe these parameters. It therefore contributes to generating hypotheses for future research, which should include a larger number of persons in order to show the differences in voice pathology. There was no significant difference between males and females.

Discussion and Conclusion

The “Glottis Analysis Tools” program is one of the most up-to-date voice analysis programs and an interesting supplement to acoustical and physiological voice analysis, because it works at the level of the vocal folds, using high-speed films, in contrast to purely acoustical analysis. The prognostic value of the results is important. Jitter, shimmer and many other acoustical measurements have been shown not to differentiate between healthy and hoarse persons. A few of the comparisons between hoarse and healthy persons reached some significance (Table 2) and may be useful for comparing different objective measures in the future [5]. So far, estimates of the level of hoarseness are not optimal. The acoustical measures showed very little statistical difference between the 12 normal persons and the 12 patients with complaints of hoarseness in our prospective case-control study. This seems to further establish that voice measures are not yet clinically evidence-based as such. Some glottal area waveform measures are of interest, but randomized studies are lacking. Future work should focus on the new methods: overtones/harmonics [6], tissue evaluation and Narrow Band Imaging [7], and Optical Coherence Tomography [8].

