Journal of the Korean Society of Manufacturing Technology Engineers - Vol. 30, No. 5, pp.366-371
ISSN: 2508-5107 (Online)
Print publication date 15 Oct 2021
Received 24 Aug 2021 Revised 06 Oct 2021 Accepted 08 Oct 2021
DOI: https://doi.org/10.7735/ksmte.2021.30.5.366

STFT를 사용한 음성 신호 기반의 감정 분류 연구

신영하a ; 송 규a ; 윤찬녕a ; 조우진a ; 박형주a ; 장동영a, *
A Classification Study on the Emotional Recognition Based on the Speech Signals Using STFT
Young-ha Shina ; Kyu Songa ; Chan-nyeong Yuna ; Woo-jin Choa ; Hyung-joo Parka ; Dong-young Janga, *
aKorea Electronics-Machinery Convergence Technology Institute

Correspondence to: *Tel.: +82-2-970-7094 E-mail address: dyjang@kemcti.re.kr (Dong-young Jang).

Abstract

The human voice has various characteristics, such as loudness, pitch, and speaking rate. This research presents a method for classifying human emotions from voice signals transformed using the short-time Fourier transform (STFT). The STFT reveals the frequency components present at a desired time point, and the transformed signals can be examined using three criteria. Using the first criterion, the frequency of the maximum sound intensity (MSI), the emotions can be classified into two groups: normal/angry and happy. The second criterion, the dwell time of the MSI, cannot distinguish between the emotions. Using the third criterion, the onset of the MSI, the two groups normal and angry/happy are identified. Therefore, the first and third criteria together can classify the three emotions. These results can provide valuable insight for future research on the classification of human emotions.
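As a rough illustration of the three criteria described above, the sketch below computes a magnitude STFT with NumPy and extracts the MSI frequency, its onset, and its dwell time from a synthetic tone. This is a hypothetical reconstruction, not the authors' code: the window length, hop size, and the 50% amplitude threshold used to define "onset" and "dwell" are all assumptions.

```python
import numpy as np

def stft_mag(signal, fs, win_len=256, hop=128):
    """Magnitude STFT via a sliding Hann window (parameters are assumed)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))        # shape: (frames, freq bins)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)     # bin center frequencies (Hz)
    times = (np.arange(n_frames) * hop + win_len / 2) / fs  # frame centers (s)
    return mag, freqs, times

def msi_features(mag, freqs, times, thresh_ratio=0.5):
    """Extract the abstract's three criteria:
    1) frequency of the maximum sound intensity (MSI),
    2) dwell time of the MSI (how long its bin stays above a threshold),
    3) onset of the MSI (first frame its bin exceeds the threshold)."""
    t_idx, f_idx = np.unravel_index(np.argmax(mag), mag.shape)
    msi_freq = freqs[f_idx]
    track = mag[:, f_idx]                    # the MSI bin's magnitude over time
    above = track >= thresh_ratio * track.max()
    onset = times[np.argmax(above)]          # first True index
    hop_dt = times[1] - times[0] if len(times) > 1 else 0.0
    dwell = above.sum() * hop_dt
    return msi_freq, dwell, onset

# Synthetic check: a 300 Hz tone that starts 0.2 s into a 1 s recording.
fs = 8000
t = np.arange(fs) / fs
sig = np.where(t >= 0.2, np.sin(2 * np.pi * 300 * t), 0.0)
mag, freqs, times = stft_mag(sig, fs)
f0, dwell, onset = msi_features(mag, freqs, times)
```

With the assumed 256-sample window at 8 kHz, the frequency resolution is 31.25 Hz, so the recovered MSI frequency lands on the bin nearest 300 Hz, the onset is close to 0.2 s, and the dwell time is close to the 0.8 s the tone is active.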

Keywords:

Emotional speech, Speech recognition, Emotion recognition, Speech signals

Acknowledgments

This research was supported by the Korea Institute for Advancement of Technology (KIAT), funded by the Ministry of Trade, Industry and Energy (No. S2640869).

References

  • Park, C. H., Sim, J. Y., Lee, D. W., Sim, K. B., 2001, Analyzing the Element of Emotion Recognition from Speech, Proc. Korean Inst. Intell. Syst. Conf., 199-202.
  • Kim, J. M., Kwon, C. H., 2014, Qualitative Classification of Voice Quality of Normal Speech and Derivation of its Correlation with Speech Features, Phonetics and Speech Sciences, 6:1 71-76. [https://doi.org/10.13064/KSSS.2014.6.1.071]
  • Lee, J. I., Kang, H. G., 2013, On the Importance of Tonal Features for Speech Emotion Recognition, J. Broadcast Eng., 18:5 713-721. [https://doi.org/10.5909/JBE.2013.18.5.713]
  • Kim, Y. G., Bae, Y. C., 2000, Design of Emotion Recognition Model Using Fuzzy Logic, Proc. Korean Inst. Intell. Syst. Conf., 268-282.
  • Moriyama, T., Ozawa, S., 1999, Emotion Recognition and Synthesis System on Speech, Proc. IEEE Int. Conf. on Multimedia Computing and Systems, 840-844.
  • Kim, J. H., Lee, S. P., 2021, Multi-modal Emotion Recognition using Speech Features and Text Embedding, Trans. Korean. Inst. Elect. Eng., 70:1 108-113. [https://doi.org/10.5370/KIEE.2021.70.1.108]
  • Ekman, P., Friesen, W. V., 1982, Emotion in the Human Face, 2nd Ed., Cambridge University Press, New York, USA.
  • Plutchik, R., 2003, Emotions and Life: Perspectives from Psychology, Biology, and Evolution, APA Books, USA.
  • Timothy, J. L., 2019, viewed 10 September 2019, Big Feels and How to Talk About Them, <https://www.healthline.com/health/list-of-emotions>.
  • Kim, M. S., Moon, J. S., 2019, Speaker Verification Model Using Short-Time Fourier Transform and Recurrent Neural Network, Journal of The Korea Institute of Information Security & Cryptology, 29:6 1393-1401. [https://doi.org/10.13089/JKIISC.2019.29.6.1393]
  • Cha, Y. T., Lee, Y. H., Choi, S. J., 2020, Simulation of EI Shaping using STFT for Manipulator Vibration Reduction of Special-purpose Equipment Responding Disaster, Proc. Korean Soc. Manuf. Technol. Eng. Autumn Conf., 153.
  • Ingale, R., 2014, Harmonic Analysis Using FFT and STFT, International Journal of Signal Processing, Image Processing and Pattern Recognition, 7:4 345-362. [https://doi.org/10.14257/ijsip.2014.7.4.33]
  • Berglund, B., Lindvall, T., Schwela, D. H., 1995, viewed 13 October 2021, Guidelines for Community Noise, World Health Organization, <https://www.who.int/docstore/peh/noise/Comnoise-1.pdf>.
Young-ha Shin

Researcher / Korea Electronics Machinery Convergence Technology Institute.

E-mail: sin0312222@kemcti.re.kr

Kyu Song

Associate Research Engineer / Korea Electronics Machinery Convergence Technology Institute.

E-mail: songkue@kemcti.re.kr

Chan-nyeong Yun

Associate Research Engineer / Korea Electronics Machinery Convergence Technology Institute.

E-mail: ycindia@kemcti.re.kr

Woo-jin Cho

Associate Research Engineer / Korea Electronics Machinery Convergence Technology Institute.

E-mail: sin0312222@kemcti.re.kr

Hyung-joo Park

Researcher / Korea Electronics Machinery Convergence Technology Institute.

E-mail: gudwn0117@kemcti.re.kr

Dong-young Jang

Research Director / Korea Electronics Machinery Convergence Technology Institute.

E-mail: dyjang@kemcti.re.kr