Spam Classification on Short Message Service (SMS) Using Support Vector Machine

Authors

  • Mutiharis Dauber Panjaitan, Universitas Brawijaya
  • Putra Pandu Adikara
  • Budi Darma Setiawan

Keywords:

text classification, SMS, spam, support vector machine

Abstract

Short Message Service (SMS) is a short-messaging service widely used in everyday activities, including health monitoring, mobile banking, and mobile commerce. However, SMS is also open to abuse, and messages may carry harmful content. Spam messages arrive mixed in with non-spam messages and disrupt users. Messages therefore need to be grouped into categories to make them easier for users to handle. This study uses five categories: normal, fraud, promo, authentication, and bank. The dataset contains 1584 messages, split into training and test data at a ratio of 75%:25%. Messages are classified with a Support Vector Machine (SVM) using the one-against-all approach. The study covers preprocessing, term weighting, training, and testing. The evaluation gives an accuracy of 0.95, a precision of 0.96, a recall of 0.95, and an f1-score of 0.95. These results were obtained with the parameter combination C = 100, epsilon = 10^-5, gamma constant = 0.01, lambda = 0.1, and maximum iterations = 50.
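The pipeline named in the abstract (term weighting, one-against-all SVM training, testing) can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the authors' implementation: the abstract does not state the weighting scheme, the kernel, or the training algorithm, so the sketch uses TF-IDF weights, a linear kernel, and the sequential SVM update rule, whose parameters (C, epsilon, gamma constant, lambda, maximum iterations) match the names listed above. The rule gamma = gamma constant / max D_ii, the whitespace tokenizer, and the toy messages are likewise hypothetical.

```python
import math
from collections import Counter

def split_sizes(n, train_ratio=0.75):
    """Number of training and test items for a given split ratio."""
    n_train = round(n * train_ratio)
    return n_train, n - n_train

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()

def fit_idf(docs):
    """IDF weights log(N / df) learned from tokenized training documents."""
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(len(docs) / count) for term, count in df.items()}

def vectorize(doc, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in doc if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def dot(u, v):
    """Linear kernel on sparse vectors (assumed; the abstract names no kernel)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def train_binary(K, y, C=100.0, eps=1e-5, gamma_const=0.01, lam=0.1, max_iter=50):
    """Sequential SVM updates on a precomputed kernel matrix K; returns alphas."""
    n = len(y)
    D = [[y[i] * y[j] * (K[i][j] + lam ** 2) for j in range(n)] for i in range(n)]
    gamma = gamma_const / max(D[i][i] for i in range(n))  # assumed learning-rate rule
    alpha = [0.0] * n
    for _ in range(max_iter):
        max_delta = 0.0
        for i in range(n):  # alphas are updated in place, one pattern at a time
            E = sum(alpha[j] * D[i][j] for j in range(n))
            delta = min(max(gamma * (1.0 - E), -alpha[i]), C - alpha[i])
            alpha[i] += delta
            max_delta = max(max_delta, abs(delta))
        if max_delta < eps:  # converged before reaching max_iter
            break
    return alpha

def train_one_vs_all(texts, labels, **params):
    """Train one binary classifier per category (that category vs. the rest)."""
    docs = [tokenize(t) for t in texts]
    idf = fit_idf(docs)
    X = [vectorize(d, idf) for d in docs]
    K = [[dot(a, b) for b in X] for a in X]
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else -1 for lab in labels]
        models[cls] = (y, train_binary(K, y, **params))
    return idf, X, models

def predict(text, idf, X, models, lam=0.1):
    """Return the category whose classifier gives the highest decision score."""
    x = vectorize(tokenize(text), idf)
    k = [dot(x, xi) + lam ** 2 for xi in X]
    scores = {cls: sum(a * yi * ki for a, yi, ki in zip(alpha, y, k))
              for cls, (y, alpha) in models.items()}
    return max(scores, key=scores.get)

# Hypothetical toy messages; the study's data set is not reproduced here.
TOY = [
    ("see you at lunch tomorrow", "normal"),
    ("lunch at home tomorrow", "normal"),
    ("claim prize money now", "fraud"),
    ("won prize claim now", "fraud"),
    ("discount promo buy today", "promo"),
    ("promo discount ends today", "promo"),
]
texts, labels = zip(*TOY)
idf, X, models = train_one_vs_all(texts, labels)

print(split_sizes(1584))  # (1188, 396): the 75%:25% split of 1584 messages
print(predict("claim the prize", idf, X, models))
```

With five categories, as in the study, the same loop would train five binary classifiers, and each message would be assigned to the category with the highest score.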

Published

04 Jul 2024

How to Cite

Panjaitan, M. D., Adikara, P. P., & Setiawan, B. D. (2024). Klasifikasi Spam pada Short Message Service (SMS) menggunakan Support Vector Machine. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer, 8(5). Retrieved from https://j-ptiik.ub.ac.id/index.php/j-ptiik/article/view/13771

Section

Articles