Comparison of agglomerative hierarchical clustering (AHC) algorithm and k-means algorithm in poverty data clustering in North Sumatra

Wilia Usna, Rima Aprilia

Abstract


In 2023, North Sumatra had the 17th lowest poverty rate out of 34 provinces, with 1,239.71 thousand people, or 8.15 percent, living in poverty. Although the poverty rate declined in 2023 compared with previous years, many districts and cities in North Sumatra still have significant poverty rates, so the problem cannot be disregarded. The government must address it by providing the community with various forms of aid and increasing the number of job openings. To do so, the districts and cities must first be identified and ordered from the lowest to the highest poverty rate. This can be done with data mining, namely by applying clustering. This study employed two clustering techniques, the Agglomerative Hierarchical Clustering (AHC) algorithm and the K-Means algorithm, and validated the clustering results with the Davies-Bouldin Index (DBI) to determine which technique yields the better clusters. The AHC method produced three clusters: cluster 1 with 31 districts/cities, cluster 2 with one district/city, and cluster 3 with one district/city. The k-means method also produced three clusters: cluster 1 with 22 districts/cities and the lowest poverty rate, cluster 2 with 10 districts/cities and a moderate poverty rate, and cluster 3 with 1 district/city and the highest poverty rate. Clustering validation showed that the k-means method, with a DBI value of 0.45, was the more effective approach for this study.
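
The validation step described in the abstract can be illustrated with a minimal sketch. It computes the Davies-Bouldin Index for two candidate labelings of the same data and prefers the one with the lower value, since a lower DBI indicates more compact and better separated clusters. Everything here is an assumption made for illustration: the seven one-dimensional "poverty rate" values are invented and are not the study's data, the two labelings are written by hand rather than produced by AHC or k-means, and the function name `davies_bouldin` is hypothetical. The sketch also assumes every cluster has a distinct centroid.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def davies_bouldin(points, labels):
    """Davies-Bouldin Index with Euclidean distances (lower is better)."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    keys = sorted(clusters)
    cents = {k: centroid(clusters[k]) for k in keys}
    # Scatter: mean distance from each member to its own cluster centroid.
    scatter = {
        k: sum(math.dist(p, cents[k]) for p in clusters[k]) / len(clusters[k])
        for k in keys
    }

    total = 0.0
    for i in keys:
        # Worst (largest) similarity ratio between cluster i and any other.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in keys
            if j != i
        )
    return total / len(keys)


# Invented poverty rates (percent) for seven hypothetical districts.
points = [[5.0], [6.0], [7.0], [12.0], [13.0], [14.0], [25.0]]

# Hand-written labelings standing in for the output of two algorithms.
labels_a = [0, 0, 0, 1, 1, 1, 2]  # three groups: low, middle, one outlier
labels_b = [0, 0, 0, 0, 0, 1, 2]  # one large group plus two singletons

dbi_a = davies_bouldin(points, labels_a)
dbi_b = davies_bouldin(points, labels_b)
best = "labels_a" if dbi_a < dbi_b else "labels_b"
print(f"DBI a = {dbi_a:.3f}, DBI b = {dbi_b:.3f}, preferred: {best}")
```

In practice the labelings would come from the clustering algorithms themselves, and the same comparison of DBI values would decide which result to keep.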


Keywords


Clustering; Agglomerative Hierarchical Clustering Algorithm; K-Means Algorithm; Davies-Bouldin Index Validation.

DOI: http://dx.doi.org/10.24042/djm.v7i3.24373

Copyright (c) 2024 Desimal: Jurnal Matematika

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Desimal: Jurnal Matematika is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.