Detection of Epidermal Growth Factor Receptor Mutations in Lung Cancer Using Extreme Gradient Boosting

Authors

A. G. Nurfansepta, L. Muflikhah, B. D. Setiawan

Keywords:

EGFR mutation, lung cancer, clinical data, machine learning, XGBoost

Abstract

Lung cancer is one of the most common types of cancer in Indonesia. Mutations in the epidermal growth factor receptor (EGFR) gene play an important role in determining treatment strategy, but detecting them is hampered by the need for technologies such as polymerase chain reaction (PCR) and by high costs. This study aims to develop a model based on Extreme Gradient Boosting (XGBoost) for more efficient and affordable EGFR mutation detection. The dataset comes from the medical records of lung cancer patients at Dr. Saiful Anwar Hospital (Rumah Sakit Dr. Saiful Anwar), Malang, from 2018-2019, and includes mutation test results and cancer morphology. The data were processed using KNN Imputer for missing values, the interquartile range (IQR) for outliers, feature selection with XGBoost feature importance, and resampling with SMOTE. The model was optimized using grid search, with the best hyperparameters being gamma 0, learning rate 0.3, max depth 3, n estimators 50, and reg lambda 1. The results show a mean accuracy of 0.844 and an AUC of 0.945 on validation, and a perfect accuracy and AUC of 1 on the test data. The model also highlights 10 important features, including bone metastasis and cancer stage, among others. The optimized XGBoost model is expected to support early detection and improve the accessibility of lung cancer treatment in Indonesia.
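
The abstract names IQR-based outlier handling but does not state the fences used or what was done with flagged values. As a minimal sketch, assuming the conventional 1.5 × IQR fences and simple flagging, the rule might look like the following. The function name `iqr_outliers`, the multiplier `k`, and the sample values are illustrative assumptions and are not taken from the study.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the IQR rule.

    k=1.5 is the conventional multiplier; the study does not state its value.
    """
    # Quartiles with linear interpolation between observed data points.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # A value is flagged when it falls strictly outside the fences.
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Hypothetical measurements for one numeric clinical feature.
sample = [50, 52, 55, 58, 60, 61, 63, 65, 120]
lower, upper, flagged = iqr_outliers(sample)
print(lower, upper, flagged)
```

Whether flagged values are then removed, capped at the fences, or re-imputed is a separate design choice that the abstract does not specify.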

Published

20 Jan 2025

How to Cite

Nurfansepta, A. G., Muflikhah, L., & Setiawan, B. D. (2025). Deteksi Mutasi Epidermal Growth Factor Receptor pada Kanker Paru Menggunakan Extreme Gradient Boosting [Detection of Epidermal Growth Factor Receptor Mutations in Lung Cancer Using Extreme Gradient Boosting]. Jurnal Pengembangan Teknologi Informasi Dan Ilmu Komputer, 9(4). Retrieved from https://j-ptiik.ub.ac.id/index.php/j-ptiik/article/view/14728

Section

Articles