Histogram to Sound Conversion: A Review

Himadri Nath Moulick1, Moumita Ghosh2, Poulomi Das3, Chandan Koner4, and Alok Kumar Roy5
1. West Bengal University of Technology, India
2. Burdwan University, Burdwan, India
3. Heritage Institute of Technology, Kolkata, India
4. Dr. B.C. Roy Engineering College, Durgapur-713206, India
5. Bankura Unnayani Institute of Engineering, Bankura, India
Abstract—The main goal of a voice conversion system [1]-[6] is to modify the voice of a source speaker so that it is perceived as if it had been uttered by a specific target speaker. Many approaches in the literature convert only the features related to the speaker's vocal tract. We propose to convert not only the vocal tract characteristics but also to process the signal passing through the vocal cords, with the aim of obtaining better scores in voice conversion results. This paper also describes a method of compensating for the nonlinear distortions that noise causes in the speech representation. The method is based on histogram equalization, a technique often used in digital image processing. Histogram equalization is applied to each component of the feature vector in order to improve the robustness of speech recognition systems. The paper describes how the proposed method can be applied to robust speech recognition and compares it with other compensation techniques.
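
As an illustration of the per-component histogram equalization mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a standard normal reference distribution and a rank-based estimate of the empirical CDF; the function names `equalize_component` and `equalize_features` are hypothetical and chosen only for this example.

```python
from statistics import NormalDist


def equalize_component(values, reference=NormalDist(0.0, 1.0)):
    """Map one feature component onto the reference distribution.

    Assumption: the reference is a standard normal; the source text
    does not state which reference distribution is used.
    """
    n = len(values)
    # Sort frame indices by value; ties keep their original order.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-point CDF estimate stays strictly inside (0, 1),
        # so the inverse CDF is always defined.
        cdf = (rank + 0.5) / n
        out[i] = reference.inv_cdf(cdf)
    return out


def equalize_features(frames):
    """Equalize every component of a list of feature vectors independently."""
    if not frames:
        return []
    columns = list(zip(*frames))  # one tuple per feature component
    equalized = [equalize_component(list(col)) for col in columns]
    return [list(row) for row in zip(*equalized)]


# Three frames of a hypothetical two-dimensional feature vector.
frames = [[1.0, 10.0], [4.0, 40.0], [9.0, 20.0]]
equalized_frames = equalize_features(frames)

# A monotonically increasing distortion (here x -> x**3) leaves ranks intact.
clean = equalize_component([1.0, 2.0, 3.0])
distorted = equalize_component([1.0, 8.0, 27.0])
```

Because the mapping depends only on the rank of each value within its component, `clean` and `distorted` are identical in this sketch. This is the sense in which equalization can compensate for a monotonically increasing nonlinear distortion of a feature component.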

Index Terms—voice conversion, K-histograms, cepstral mean normalization, histogram equalization, mean and variance normalization.

Cite: Himadri Nath Moulick, Moumita Ghosh, Poulomi Das, Chandan Koner, and Alok Kumar Roy, "Histogram to Sound Conversion: A Review," Lecture Notes on Information Theory, Vol. 2, No. 3, pp. 226-231, September 2014. doi: 10.12720/lnit.2.3.226-231
