ISSN: 2168-9679
Journal of Applied & Computational Mathematics

Human Voice Activity Detection using Wavelet

Md. Shahadat Hossain1* and Md. Rafiqul Islam2

1B.Sc (Honors), Mathematics Discipline, Khulna University, Khulna, 9208, Bangladesh

2Mathematics Discipline, Khulna University, Khulna, 9208, Bangladesh

*Corresponding Author:
Md. Shahadat Hossain
Mathematics Discipline
Khulna University, Khulna
9208, Bangladesh
E-mail: [email protected]

Received Date: April 06, 2014; Accepted Date: April 17, 2015; Published Date: May 07, 2015

Citation: Hossain S, Rafiqul Islam (2015) Human Voice Activity Detection using Wavelet. J Appl Computat Math 4: 216. doi: 10.4172/2168-9679.1000216

Copyright: © 2015 Hossain S, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Abstract

Wavelets have a wide range of uses in present-day science. Many different tasks are now performed with wavelets through MATLAB, for instance biometric recognition (fingerprint, voice, iris, face, pattern and signature recognition), signal processing and human voice activity detection, all of which use the wavelet and the wavelet transform. Among these, this paper discusses human voice activity detection. First, a human voice is recorded for a few seconds as the input sound through a good headphone and read into the MATLAB command window. The recorded sound gives a graphical representation that is saved for later use. Then, using the Wavelet Toolbox of MATLAB, the plot of the input sound is taken for analysis. The signal is analyzed with the discrete wavelet transform; during this analysis a 10-level wavelet tree is generated with the Haar wavelet at 10 decomposition levels, and at the same time the original signal is reconstructed. At first, six different voice recordings of the same person are analyzed, and the norms and the SNR (signal-to-noise ratio), in decibels (dB), are computed; the bit rates of three different voices are also computed. In this way a total of 18 experiments are carried out on five different persons, with three experiments for each person except the first. The numerical data of the experiments are shown as graphical representations as well as in histogram analyses. Through this process the whole set of experiments is carried out for the detection of human voice activity.

Keywords

Wavelet; SNR; Bit rate; Human voice; Histogram

Introduction

Recently, human-machine interface systems based on speech have attracted much interest, supported by the rapid improvement of CPU performance. A speech-based interface relies heavily on speech recognition, for which information about voice activity segments (VAS) is effective in improving the recognition rate. Various methods have been proposed for voice activity detection. They use features of the speech signal such as the transition of the power [1], the harmonic structure in the spectrum [2,3] and the existence of signal source directionality [4]. In these methods the acquired speech is usually assumed to be sufficiently clean, because of the preprocessing used in speech recognition and in compression for transmission. However, in indoor environments where such interfaces are ordinarily used, there are various localized interferences arriving from particular directions, such as the sound of a closing door. For these non-stationary interferences the conventional methods do not achieve sufficient performance, because they assume the noise to be stationary and white (Table 1). Kaneda [5,6] proposed an effective VAD method for such non-stationary interferences, using the high-performance speech-emphasizing system "AMNOR (Adaptive Microphone array for Noise Reduction)" (Figure 1). He uses a microphone array to discriminate signals by exploiting the difference in direction between speech and interference. However, the target speech and the interference are required to arrive from sufficiently separated directions because of the spatial resolution of AMNOR, and this limitation critically restricts the applicable conditions of the method. In this research we propose a new method that is robust to the direction of interference, using microphone-array signal processing in the wavelet domain to integrate the time, frequency and spatial information of the speech signal (Table 2).

            1st voice   2nd voice   3rd voice
L1 Norm     481.8       587.5       570.1
L2 Norm     3.84        5.05        4.78
SNR (dB)    6.81        6.00        5.81

Table 1: Data chart for Mr. “A”.


Figure 1: (a) A speech segment consisting of voiced, unvoiced and silence frames; (b) power variation of detail coefficients.

            1st voice   2nd voice   3rd voice
L1 Norm     491.2       842.8       1157
L2 Norm     3.51        7.00        10.02
SNR (dB)    5.83        5.87        5.72

Table 2: Data chart for Mr. “A” (second set of experiments).

Voice Activity Detection

Voice activity detection (VAD) refers to the problem of distinguishing speech from non-speech regions (segments) in an audio stream. The non-speech regions may contain silence, noise or a variety of other acoustic signals. VAD is challenging at low signal-to-noise ratio (SNR), especially in non-stationary noise, because both low SNR and a non-stationary noisy environment tend to cause significant detection errors (Figure 2). There is a wide range of applications for VAD, including mobile communication services [5], real-time speech transmission on the internet [6], noise reduction for digital hearing aid devices [7], automatic speech recognition [8] and variable rate speech coding [9]. VAD, also known as speech activity detection or speech detection, is a technique used in speech processing to detect low bit rate speech and high bit rate speech (Table 3); it can also distinguish between high-volume and low-volume human speech [10]. The main uses of VAD are in speech coding and speech recognition (Figure 3). It can facilitate speech processing and can also be used to deactivate some processes during the non-speech sections of an audio session [11]. VAD is an important enabling technology for a variety of speech-based applications, and various VAD algorithms have therefore been developed that offer different trade-offs between latency, sensitivity, accuracy and computational cost. Some VAD algorithms also provide further analysis (Table 4), for example whether the speech is voiced, unvoiced or sustained. Voice activity detection is usually language independent.
Many voice detection techniques already exist, such as CMU Sphinx, Julius, Kaldi, Bing, SILVIA, Vlingo, Microsoft Tellme, Ask Ziggy and wavelets. Among these techniques, the wavelet is used here: the signal is compressed with the wavelet technique, which gives better results for lossless compression (Figures 4 and 5). The practical implementation of voice signal compression schemes is very similar to that of sub-band coding schemes. As in sub-band coding, we compress the signal (analysis) using different wavelets (Figure 6). The output of the compression is down-sampled and a comparison is made among the compressed signals (Table 5).
Wavelet analysis can be used to divide the information of a signal into approximation and detail sub-signals; the detail sub-signals show the detailed changes in the signal. If these details are very small, they can be set to zero without significantly changing the signal; the value below which details are zeroed is known as the threshold. The greater the number of zeros, the greater the compression ratio. The amount of information retained by a signal after compression and decompression is known as the retained energy, and it is proportional to the sum of the squares of the signal values. If 100% of the energy is retained, the compression is known as lossless, since the signal can be reconstructed exactly; this occurs when the threshold value is set to zero, meaning that the detail has not been changed. Ideally, compression should produce as many zeros as possible while keeping the retained energy as high as possible (Figure 7) [12]. Wavelet packets are known to perform significantly better than wavelets for compressing signals with a large amount of texture, and the perceived signal quality is significantly improved when wavelet packets are used instead of wavelets, especially in the textured regions of the signals. This paper deals with speech compression based on the discrete wavelet transform (Table 6).
We used an English word (only "hello") for this experiment. We successfully compressed and reconstructed the word with perfect audibility using the wavelet technique (Figure 8). Speech compression is the technology of converting human speech into an efficiently encoded representation that can later be decoded to produce a close approximation of the original signal. The wavelet transform of a signal decomposes the original signal into wavelet coefficients at different scales and positions (Figure 9). These coefficients represent the signal in the wavelet domain, and all data operations can be performed on the corresponding wavelet coefficients (Table 7a).
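To illustrate this workflow, the following MATLAB sketch (an assumed illustration requiring the Wavelet Toolbox, not the authors' original script; the file name and threshold value are hypothetical) performs a 10-level Haar decomposition of a recorded signal, zeroes the coefficients below a threshold, reconstructs the signal, and reports the retained energy and the proportion of zeroed coefficients.

    % Minimal sketch (assumed): 10-level Haar decomposition, hard thresholding
    % and the retained-energy / zero-count measures described above.
    [x, Fs] = audioread('hello.wav');      % hypothetical recording of "hello"
    x = x(:, 1);                           % use a single channel
    [c, l] = wavedec(x, 10, 'haar');       % 10-level Haar DWT (Wavelet Toolbox)

    thr = 0.02;                            % illustrative threshold value
    cT = c;
    cT(abs(cT) < thr) = 0;                 % zero coefficients below the threshold

    xRec = waverec(cT, l, 'haar');         % reconstruct from thresholded coefficients

    retainedEnergy = 100 * sum(cT.^2) / sum(c.^2);    % percent of energy kept
    zeroRatio      = 100 * nnz(cT == 0) / numel(cT);  % percent of coefficients zeroed
    fprintf('Retained energy: %.2f%%, zeroed coefficients: %.2f%%\n', ...
            retainedEnergy, zeroRatio);

Setting the threshold to zero reproduces the lossless case described above, since no coefficient is changed.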


Figure 2: Original speech signal of Mr. “A” with 10 level decomposition and decomposition tree.
L1 Norm = 481.8, L2 Norm = 3.845
Signal to noise ratio = 6.81 db


Figure 3: Original speech signal of Mr. “A” with 10 level decomposition and decomposition tree.
L1 Norm = 570.1
L2 Norm = 4.78
Signal to noise ratio = 5.87 db


Figure 4: Three experimental voice data chart of Mr. “A”.


Figure 5: Original speech signal of Mr. “A” with 10 level decomposition and decomposition tree.
L1 Norm = 491.2, L2 Norm = 3.51
Signal to noise ratio = 5.83 db


Figure 6: Original speech signal of Mr. “A” with 10 level decomposition and decomposition tree.
L1 Norm = 842.8, L2 Norm = 7.00
Signal to noise ratio = 5.87 db


Figure 7: Original speech signal of Mr. “A” with 10 level decomposition and decomposition tree
L1 Norm = 1157, L2 Norm = 10.02
Signal to noise ratio = 5.72 db


Figure 8: Three experimental voice data chart of Mr. “A”.


Figure 9: Original speech signal of Mr. “B” with 10 level decomposition and decomposition tree.
L1 Norm = 789.9
L2 Norm = 7.064
Signal to noise ratio = 7.40 db

            1st voice   2nd voice   3rd voice
L1 Norm     789.9       471.0       559.8
L2 Norm     7.06        3.81        4.90
SNR (dB)    7.40        7.68        7.06

Table 3: Data chart for Mr. “B”.

            1st voice   2nd voice   3rd voice
L1 Norm     463.7       561.7       665.4
L2 Norm     3.54        4.46        5.77
SNR (dB)    7.36        6.64        6.00

Table 4: Data chart for Mr. “C”.

            1st voice   2nd voice   3rd voice
L1 Norm     450.3       690.0       680.0
L2 Norm     3.26        5.70        5.23
SNR (dB)    6.90        6.38        5.73

Table 5: Data chart for Mr. “D”.

            1st voice   2nd voice   3rd voice
L1 Norm     735.6       603.3       601.5
L2 Norm     5.91        4.99        5.02
SNR (dB)    2.33        2.65        2.86

Table 6: Data chart for Mr. “E”.

Person →    Voice of Mr. “A”          Voice of Mr. “B”          Voice of Mr. “C”          Voice of Mr. “D”
Cal ↓       1st     2nd     3rd       1st     2nd     3rd       1st     2nd     3rd       1st     2nd     3rd
L1 Norm     481.8   587.5   570.1     789.9   471.0   559.8     463.7   561.7   665.4     450.3   690.0   680.0
L2 Norm     3.84    5.05    4.78      7.06    3.81    4.90      3.54    4.46    5.77      3.26    5.70    5.23
SNR (dB)    6.81    6.00    5.81      7.40    7.68    7.06      7.36    6.64    6.00      6.90    6.38    5.73

Table 7a: Data chart for all of speech analysis.

In our study we obtained code from wavelet coding and then simulated the code using MATLAB. From the results we noticed that wavelet coding can distinguish between low bit rate speech and high bit rate speech (Figure 10), and can also distinguish between high-volume and low-volume speech. Five men (A, B, C, D and E) of different ages and voices participated in this experiment. We recorded 18 (eighteen) experimental voices via headphone, and for the discussion of results we calculated only the L1 and L2 norms, the threshold value and the SNR. All of the experiments are given below (Figure 11).
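The comparison logic described above can be sketched in MATLAB as follows. This is a hypothetical illustration, not code from the study, using the measured values from Table 1 for Mr. “A”: larger L1 and L2 norms indicate the louder voice, and a lower SNR at comparable volume indicates the faster (higher bit rate) voice.

    % Hypothetical sketch of the comparison logic, with values from Table 1.
    L1  = [481.8 587.5];     % L1 norms of the 1st and 2nd voice of Mr. "A"
    L2  = [3.84 5.05];       % L2 norms
    snr = [6.81 6.00];       % SNR values in dB
    [~, louder] = max(L2);   % larger norms -> louder (higher volume) voice
    [~, faster] = min(snr);  % lower SNR -> faster (higher bit rate) voice
    fprintf('Voice %d is louder; voice %d is faster.\n', louder, faster);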


Figure 10: Original speech signal of Mr. “B” with 10 level decomposition and decomposition tree.
L1 Norm = 471.0
L2 Norm = 3.81
Signal to noise ratio = 7.68 db


Figure 11: Original speech signal of Mr. “B” with 10 level decomposition and decomposition tree.
L1 Norm = 559.8
L2 Norm = 4.904
Signal to noise ratio = 7.06 db

What is Voice Recognition and why is it Useful in a Virtual Environment?

Voice recognition is "the technology by which sounds, words or phrases spoken by humans are converted into electrical signals, and these signals are transformed into coding patterns to which meaning has been assigned" (Figure 12). While the concept could more generally be called "sound recognition", we focus here on the human voice because we most often and most naturally use our voices to communicate our ideas to others in our immediate surroundings (Figure 13). In the context of a virtual environment (Table 7b), the user would presumably gain the greatest feeling of immersion, or of being part of the simulation, if they could use their most common form of communication, the voice (Figure 14). The difficulty in using voice as an input to a computer simulation lies in the fundamental differences between human speech and the more traditional forms of computer input. While computer programs are commonly designed to produce a precise and well-defined response upon receiving the proper (and equally precise) input, the human voice and spoken words are anything but precise. Each human voice is different, and identical words can have different meanings if spoken with different inflections or in different contexts. Several approaches have been tried, with varying degrees of success, to overcome these difficulties (Figure 15).

Person →    Voice of Mr. “E”
Cal ↓       1st     2nd     3rd
L1 Norm     735.6   603.3   601.5
L2 Norm     5.91    4.99    5.024
SNR (dB)    2.33    2.65    2.86

Table 7b: Data chart for all of the speech analysis (continued: Mr. “E”).


Figure 12: Three experimental voice data chart of Mr. “B”.


Figure 13: Original speech signal of Mr. “C” with 10 level decomposition and decomposition tree.
L1 Norm = 463.7
L2 Norm = 3.546
Signal to noise ratio =7.367 db


Figure 14: Original speech signal of Mr. “C” with 10 level decomposition and decomposition tree.
L1 Norm = 561.7
L2 Norm = 4.467
Signal to noise ratio =6.64 db


Figure 15: Original speech signal of Mr. “C” with 10 level decomposition and decomposition tree.
L1 Norm = 665.4, L2 Norm = 5.774
Signal to noise ratio =6.00 db

Lp Norm

For finite p, the Lp norm of a function f in C[a,b] is defined as

\| f \|_p = \left( \int_a^b |f(x)|^p \, dx \right)^{1/p}.

For a discrete function it can be defined as

\| f \|_p = \left( \sum_i |x_i|^p \right)^{1/p},

where {x_i} are the components (samples) of f.

If we put p = 1 in the above equation, then \| f \|_1 is called the L1 norm.

If we put p = 2 in the above equation, then \| f \|_2 is called the L2 norm.
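As a minimal MATLAB sketch (an assumption; the file name speech.wav is hypothetical), the two norms used in the data charts can be computed directly with the built-in norm function:

    % Minimal sketch: L1 and L2 norms of a recorded speech vector.
    [x, Fs] = audioread('speech.wav');   % hypothetical recording
    x  = x(:, 1);                        % use a single channel
    L1 = norm(x, 1);                     % sum of absolute sample values
    L2 = norm(x, 2);                     % square root of the sum of squared samples
    fprintf('L1 norm = %.1f, L2 norm = %.2f\n', L1, L2);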

Signal to Noise Ratio (SNR)

This value measures the quality of the reconstructed signal; the higher the value, the better:

\mathrm{SNR} = 10 \log_{10} \left( \frac{\sigma_x^2}{\sigma_e^2} \right),

where \sigma_x^2 is the mean square of the speech signal and \sigma_e^2 is the mean square of the difference between the original and reconstructed signals (Figure 16).
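A corresponding MATLAB sketch (again an assumption, reusing the original signal x and a reconstruction xRec such as the one produced in the earlier thresholding sketch) computes the SNR in decibels from this formula:

    % Minimal sketch: SNR in dB between the original and reconstructed signals.
    x = x(:); xRec = xRec(:);                    % ensure matching column orientation
    e = x - xRec;                                % reconstruction error
    snrDb = 10 * log10(mean(x.^2) / mean(e.^2)); % sigma_x^2 / sigma_e^2 in dB
    fprintf('SNR = %.2f dB\n', snrDb);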


Figure 16: Three experimental voice data chart of Mr. “C”.

Experiment 1: Original speech signal of Mr. “A”:

Coding of speech signal:

Mr. “A” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times)
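Each "Coding of speech signal" step refers to a short MATLAB recording-and-plotting script. The following is a minimal sketch of what such a script could look like; the sample rate, bit depth and file name are assumptions, with the 5-second duration taken from the Conclusion.

    % Minimal sketch (assumed): recording about 5 seconds of speech through a
    % headset microphone and saving it for wavelet analysis.
    Fs  = 8000;                            % assumed sampling rate in Hz
    rec = audiorecorder(Fs, 16, 1);        % 16-bit, single-channel recorder
    disp('Say "Hello" repeatedly...');
    recordblocking(rec, 5);                % record for 5 seconds
    x = getaudiodata(rec);                 % recorded samples as a column vector
    plot((0:numel(x)-1)/Fs, x);            % plot the original speech signal
    xlabel('Time (s)'); ylabel('Amplitude');
    audiowrite('helloA1.wav', x, Fs);      % hypothetical file name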

Experiment 2: Original speech signal of Mr. “A”:

Coding of speech signal:

Mr. “A” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times)

Experiment 3: Original speech signal of Mr. “A”:

Coding of speech signal:

Mr. “A” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (8 times) (Figure 17).


Figure 17: Original speech signal of Mr. “D” with 10 level decomposition and decomposition tree.
L1 Norm = 450.3, L2 Norm = 3.26
Signal to noise ratio =6.906 db

Summary

From the above chart it is concluded that, within the same decibel (dB) level, the SNR value is lower for a high bit rate (fast) voice than for a low bit rate (slow) voice. If the vocal volume is increased in the different cases, the results change (Figure 18): with increasing vocal volume, the L1 and L2 norms increase. For voices in the same bit rate range but with different volumes, the SNR may increase or decrease, but the L1 and L2 norms must increase as the volume increases. In this way we can say that the 2nd speech has the highest volume, i.e. the highest dB value, among the three speeches, and that the 3rd speech is faster than the other two (Figure 19).


Figure 18: Original speech signal of Mr. “D” with 10 level decomposition and decomposition tree.
L1 Norm = 690.0, L2 Norm = 5.708
Signal to noise ratio =6.38 db


Figure 19: Original speech signal of Mr. “D” with 10 level decomposition and decomposition tree.
L1 Norm = 680.0, L2 Norm = 5.23
Signal to noise ratio =5.73 db

Experiment 4: Original speech signal of Mr. “A”

Coding of speech signal:

Mr. “A” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (8 times)

Experiment 5: Original speech signal of Mr. “A”

Coding of speech signal:

Mr. “A” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, Hello (8 times, loudly)

Experiment 6: Original speech signal of Mr. “A”

Coding of speech signal:

Mr. “A” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello (7 times, loudly) (Figure 20).


Figure 20: Three experimental voice data chart of Mr. “D”.

Summary

From the above chart it is concluded that, within the same decibel (dB) level, the SNR value is lower for a high bit rate (fast) voice than for a low bit rate (slow) voice. If the vocal volume is increased in the different cases, the results change. With increasing vocal volume, the L1 and L2 norms increase (Figure 21). For voices in the same bit rate range but with different volumes, the SNR may increase or decrease, but the L1 and L2 norms must increase as the volume increases. From this result we can observe that the 3rd speech has the highest volume, i.e. the highest dB value, among the three speeches, and that the 1st speech is faster than the other two (Figure 22).


Figure 21: Original speech signal of Mr. “E” with 10 level decomposition and decomposition tree.
L1 Norm = 735.6, L2 Norm = 5.91
Signal to noise ratio =2.33 db


Figure 22: Original speech signal of Mr. “E” with 10 level decomposition and decomposition tree.
L1 Norm = 603.3, L2 Norm = 4.991
Signal to noise ratio =2.65 db

Experiment 7: Original speech signal of Mr. “B”

Coding of speech signal:

Mr. “B” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, Hello (8 times)

Experiment 8: Original speech signal of Mr. “B”:

Coding of speech signal:

Mr. “B” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello (6 times)

Experiment 9: Original speech signal of Mr. “B”

Coding of speech signal:

Mr. “B” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (6 times, loudly) (Figure 23).


Figure 23: Original speech signal of Mr. “E” with 10 level decomposition and decomposition tree.
L1 Norm = 601.5, L2 Norm = 5.024
Signal to noise ratio =2.86 db

Summary

From the above chart it is concluded that, within the same decibel (dB) level, the SNR value is lower for a high bit rate (fast) voice than for a low bit rate (slow) voice. If the vocal volume is increased in the different cases, the results change. With increasing vocal volume, the L1 and L2 norms increase. For voices in the same bit rate range but with different volumes, the SNR may increase or decrease, but the L1 and L2 norms must increase as the volume increases. In this way we can say that the 1st speech has the highest volume, i.e. the highest dB value, among the three speeches, and that the 3rd speech is faster than the other two (Figures 24 and 25).


Figure 24: Three experimental voice data chart of Mr. “E”.


Figure 25: Original speech signal of Mr. “E” with 10 level decomposition and decomposition tree.
L1 Norm = 601.5, L2 Norm = 5.024
Signal to noise ratio =2.86 db

Experiment 10: Original speech signal of Mr. “C”

Coding of speech signal:

Mr. “C” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times)

Experiment 11: Original speech signal of Mr. “C”:

Coding of speech signal:

Mr. “C” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times)

Experiment 12: Original speech signal of Mr. “C”

Coding of speech signal:

Mr. “C” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, Hello (8 times, loudly)

Summary

From the above chart it is concluded that, within the same decibel (dB) level, the SNR value is lower for a high bit rate (fast) voice than for a low bit rate (slow) voice. If the vocal volume is increased in the different cases, the results change. With increasing vocal volume, the L1 and L2 norms increase. For voices in the same bit rate range but with different volumes, the SNR may increase or decrease, but the L1 and L2 norms must increase as the volume increases. In this way we can say that the 3rd speech has the highest volume, i.e. the highest dB value, among the three speeches, and that the 3rd speech is faster than the other two.

Experiment 13: Original speech signal of Mr. “D”

Coding of speech signal:

Mr. “D” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times)

Experiment 14: Original speech signal of Mr. “D”

Coding of speech signal:

Mr. “D” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times, loudly)

Experiment 15: Original speech signal of Mr. “D”

Coding of speech signal:

Mr. “D” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (8 times, loudly)

Summary

From the above chart it is concluded that, within the same decibel (dB) level, the SNR value is lower for a high bit rate (fast) voice than for a low bit rate (slow) voice. If the vocal volume is increased in the different cases, the results change. With increasing vocal volume, the L1 and L2 norms increase. For voices in the same bit rate range but with different volumes, the SNR may increase or decrease, but the L1 and L2 norms must increase as the volume increases. In this way we can say that the 3rd speech has the highest volume, i.e. the highest dB value, among the three speeches, and that the 3rd speech is faster than the other two.

Experiment 16: Original speech signal of Mr. “E”

Coding of speech signal:

Mr. “E” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (8 times, loudly)

Experiment 17: Original speech signal of Mr. “E”

Coding of speech signal:

Mr. “E” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (7 times)

Experiment 18: Original speech signal of Mr. “E”

Coding of speech signal:

Mr. “E” is talking with a headphone: Hello, Hello, Hello, Hello, Hello, Hello, Hello, and Hello (5.5 times)

Summary

From the above chart it is concluded that, within the same decibel (dB) level, the SNR value is lower for a high bit rate (fast) voice than for a low bit rate (slow) voice. If the vocal volume is increased in the different cases, the results change. With increasing vocal volume, the L1 and L2 norms increase. For voices in the same bit rate range but with different volumes, the SNR may increase or decrease, but the L1 and L2 norms must increase as the volume increases. In this way we can say that the 1st speech has the highest volume, i.e. the highest dB value, among the three speeches, and that the 1st speech is faster than the other two.

Results

As shown in Table 6, speech files spoken in English were recorded from male speakers only. The effects of varying the threshold value on the speech signals, in terms of SNR and compression score, were observed for the different cases. Many factors affect the performance of a wavelet-based speech coder, chiefly what compression ratio can be achieved at a suitable SNR value. To improve the compression ratio of a wavelet-based coder, we have to consider that it is highly speaker dependent and varies with the speaker's age and gender: a low speaking speed gives a high compression ratio with a high SNR value, whereas a high speaking speed gives a low compression ratio with a low SNR value. The detection of volume depends on the values of the L1 and L2 norms: high-volume speech produces high L1 and L2 norms, while low-volume speech produces low L1 and L2 norms. All of the signals were analyzed with the Haar wavelet.
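The effect of the threshold on the compression score and the SNR can be examined with a small sweep such as the following sketch (an assumption built on the decomposition [c, l] of a signal x from the earlier Haar-decomposition sketch; the threshold values are illustrative only):

    % Minimal sketch: sweep the threshold and report compression score and SNR.
    for thr = [0.005 0.01 0.02 0.05]
        cT = c;
        cT(abs(cT) < thr) = 0;                       % zero small coefficients
        xRec  = waverec(cT, l, 'haar');              % reconstruct the signal
        xRec  = xRec(:);                             % match column orientation of x
        score = 100 * nnz(cT == 0) / numel(cT);      % percent of zeroed coefficients
        snrDb = 10 * log10(mean(x.^2) / mean((x - xRec).^2));
        fprintf('thr = %.3f: compression score = %.1f%%, SNR = %.2f dB\n', ...
                thr, score, snrDb);
    end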

Conclusion

We tried to observe, or detect, voice activity by calculating and comparing the mathematical quantities L1 norm and L2 norm. In this study we also calculated the SNR for each signal, and we analyzed the different speech signals with the Haar wavelet at 10 decomposition levels. The performance of the wavelet coder was tested on male speech signals of 5 seconds duration. The results illustrate that, with the help of wavelets, we can analyze and detect voice activity. To obtain more accurate results, higher-grade equipment and further study are required.

References
