Model Predictive Analytics Terhadap Pasien Diabetes Menggunakan Exploratory Data Analysis dan Algoritma Random Forest

  • Hamid Muhammad Jumasa, Muhammadiyah University of Purworejo
Keywords: Exploratory Data Analysis, EDA, Random Forest, Predictive Analytics

Abstract

Diabetes is a chronic (long-term) disease characterized by blood sugar (glucose) levels above the normal threshold, which is associated with disrupted function of the insulin hormone in the body. In 2021, the International Diabetes Federation (IDF) recorded 537 million adults aged 20–79 living with diabetes (Reza Pahlevi, 2021), and the disease caused 6.7 million deaths. Risk factors include being overweight, high cholesterol levels, lifestyle, lack of exercise, and age. No medicine has yet been found that cures the disease completely, so early detection is needed to control its dangers.
This research builds a predictive analytics model to predict whether a person will develop diabetes. Exploratory Data Analysis (EDA) is used as the data analysis technique and Random Forest as the machine learning model. The data come from the Kaggle website and cover a total of 769 people. The data consist of 9 columns, 7 of one data type and 2 of another.
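The workflow described above (EDA followed by a Random Forest classifier) might look roughly like the following sketch. It is not the author's code: the data are a synthetic stand-in generated with scikit-learn, and the column names, the 80/20 split, and the hyperparameters are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Kaggle data: 8 numeric features plus a binary label.
# The column names are placeholders, not the columns of the real dataset.
X, y = make_classification(
    n_samples=769, n_features=8, n_informative=5, random_state=0
)
df = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(8)])
df["outcome"] = y

# Exploratory Data Analysis: size, column types, missing values,
# summary statistics, and correlation of each feature with the label.
print(df.shape)
print(df.dtypes.value_counts())
print("missing values:", df.isnull().sum().sum())
print(df.describe().T[["mean", "std", "min", "max"]])
print(df.corr()["outcome"].drop("outcome").sort_values(ascending=False))

# Hold out part of the data for testing (the 80/20 ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="outcome"),
    df["outcome"],
    test_size=0.2,
    random_state=0,
    stratify=df["outcome"],
)

# Fit a Random Forest and compare accuracy on training and testing data.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

A training accuracy close to 1 alongside a noticeably lower testing accuracy is common for a Random Forest with fully grown trees, which is why both figures are worth reporting.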
After the sample data were analyzed, the model reached an accuracy of 0.998207 on the training data with a Mean Squared Error of 0.00179, and an accuracy of 0.74603 on the testing data with a Mean Squared Error of 0.25396. Of 20 sample records tested, the model predicted 18 correctly and 2 incorrectly.
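In the figures above, accuracy and Mean Squared Error sum to roughly 1 in both cases (0.998207 + 0.00179 and 0.74603 + 0.25396). That is what one would expect if the error is computed on predicted 0/1 class labels, because each wrong prediction then contributes a squared error of exactly 1. The source does not say how the error was computed, so this is an assumption; the sketch below illustrates the relationship with hypothetical labels for 20 samples, 18 predicted correctly and 2 incorrectly.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted labels."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels for 20 samples (not the data from the study).
y_true = [0, 1] * 10
y_pred = list(y_true)
# Flip two predictions so that 18 are correct and 2 are wrong.
y_pred[3] = 1 - y_pred[3]
y_pred[12] = 1 - y_pred[12]

acc = accuracy(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(f"accuracy = {acc}, MSE = {mse}")
```

With binary labels the two metrics carry the same information, so reporting precision, recall, or a confusion matrix alongside accuracy would say more about the model than the error value does.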

Published
2023-11-27
How to Cite
Jumasa, H. M. (2023). Model Predictive Analytics Terhadap Pasien Diabetes Menggunakan Exploratory Data Analysis dan Algoritma Random Forest. INTEK : Jurnal Informatika Dan Teknologi Informasi, 6(2), 44-50. https://doi.org/10.37729/intek.v6i2.3867