Applying the STFT (Short Time Fourier Transform) to Mel Frequency Coefficients for Recognizing Vocal Voice Types
Keywords:
Alto, Bass, Tenor, Soprano, STFT, CNN, Raspberry Pi 4 Model B
Abstract
A singer's vocal voice type is often difficult for the singer to determine. Vocal voice types are generally divided into four classes, namely soprano, alto, tenor, and bass, and identifying a singer's class usually requires a piano. This study aims to build a vocal voice type classifier that can be used anywhere and at any time. The device is designed around MFCC (Mel Frequency Cepstral Coefficient) features, extracted by applying the STFT (Short Time Fourier Transform) algorithm, and is implemented on a Raspberry Pi 4 Model B. The study uses a modified version of the Esmuc-Choir dataset in which each recording is 3 seconds long. Once the features have been extracted, each class is classified using a CNN (Convolutional Neural Network). The device is operated through an LCD and a GUI (Graphical User Interface). The results show that the CNN model reached an accuracy of 97%, while the device reached an accuracy of 65% over 20 trials. Overall, the device and system work well, which opens up potential for further development.
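The feature extraction step described in the abstract (STFT, then mel filtering, then cepstral coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation. The 8 kHz sample rate, 256-sample Hann frames, 128-sample hop, 26 mel filters, 13 coefficients, and the 220 Hz test tone are all assumptions chosen for illustration, and the function names are hypothetical. Only the 3-second clip length comes from the abstract. The sketch uses the Python standard library only, so it favors readability over speed.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def stft_power(signal, n_fft=256, hop=128):
    """STFT power spectrogram: Hann-windowed frames, bins 0..n_fft/2."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n_fft) for i in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(n_fft)]
        spectrum = fft(frame)[: n_fft // 2 + 1]
        frames.append([abs(c) ** 2 for c in spectrum])
    return frames


def hz_to_mel(f):
    # One common mel-scale convention (assumed here).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sr / 2)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sr / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def mfcc(signal, sr, n_mfcc=13, n_mels=26, n_fft=256, hop=128):
    """Return one list of n_mfcc coefficients per STFT frame."""
    bank = mel_filterbank(n_mels, n_fft, sr)
    coeffs = []
    for power in stft_power(signal, n_fft, hop):
        # Log mel energies; the small constant avoids log(0).
        log_mel = [
            math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
            for row in bank
        ]
        # Unnormalized DCT-II of the log mel energies gives the cepstrum.
        coeffs.append([
            sum(e * math.cos(math.pi * k * (j + 0.5) / n_mels)
                for j, e in enumerate(log_mel))
            for k in range(n_mfcc)
        ])
    return coeffs


# Usage: a synthetic 3-second, 220 Hz tone stands in for a sung recording.
sr = 8000
tone = [math.sin(2 * math.pi * 220.0 * t / sr) for t in range(3 * sr)]
features = mfcc(tone, sr)
print(len(features), len(features[0]))  # frames, coefficients per frame
```

The resulting frames-by-coefficients matrix is the kind of two-dimensional input a CNN classifier could consume. In practice, an optimized numerical library would likely replace the pure-Python FFT on a device such as the Raspberry Pi.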
License
Copyright (c) 2024 Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
This article is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.