Applying the STFT (Short Time Fourier Transform) to Mel Frequency Coefficients to Recognize Vocal Voice Types
Keywords:
Alto, Bass, Tenor, Soprano, STFT, CNN, Raspberry Pi 4 Model B
Abstract
A singer's vocal type is often difficult for the singer to determine. Vocal types are generally divided into four classes, namely soprano, alto, tenor, and bass, and each class is usually identified with the help of a piano. This study aims to build a vocal type classification device that can be used anywhere and at any time. The device is designed around MFCC (Mel Frequency Cepstral Coefficient) features extracted with the STFT (Short Time Fourier Transform) algorithm, and it is implemented on a Raspberry Pi 4 Model B. The study uses the Esmuc-Choir dataset, modified so that each recording lasts 3 seconds. Once the features have been extracted, each class is classified with a CNN (Convolutional Neural Network). The device is operated through an LCD and a GUI (Graphical User Interface). The results show that the CNN model reached an accuracy of 97%, while the device reached an accuracy of 65% over 20 trials. Overall, the device and system work well, which opens up potential for further development.
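The feature extraction step described in the abstract (STFT, then mel filtering, then cepstral coefficients) can be illustrated with a minimal NumPy sketch. The abstract does not state the parameters used, so the sampling rate (16 kHz), frame size (512), hop size (256), number of mel filters (26), number of coefficients (13), and all function names below are assumptions chosen for illustration; the CNN classifier is omitted. The input is a synthetic 3-second tone, not data from the Esmuc-Choir dataset.

```python
import numpy as np

def stft_power(signal, n_fft=512, hop=256):
    """Power spectrogram from a Hann-windowed STFT; shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def mel_filterbank(sr, n_fft, n_mels=26):
    """Triangular filters spaced evenly on the mel scale; shape (n_mels, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fb[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fb[m - 1, k] = (right - k) / (right - center)
    return fb

def dct2(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sr, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """MFCC matrix of shape (frames, n_mfcc) computed from the STFT power spectrum."""
    power = stft_power(signal, n_fft, hop)
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    return dct2(np.log(mel_energy + 1e-10), n_mfcc)  # small offset avoids log(0)

# Synthetic 3-second input (assumed 16 kHz): a 220 Hz sine standing in for a recording.
sr = 16000
t = np.arange(3 * sr) / sr
tone = np.sin(2 * np.pi * 220.0 * t)
features = mfcc(tone, sr)
print(features.shape)
```

A matrix of this kind, with one row of coefficients per frame, is the sort of two-dimensional input a CNN classifier can consume. In practice a library routine such as `librosa.feature.mfcc` would likely replace the hand-written steps.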
Copyright (c) 2024 Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
This article is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.