CLASSIFYING VILLAGE FUND IN WEST JAVA, INDONESIA USING CATBOOST ALGORITHM
Abstract
Indonesia has more than 261 million inhabitants and, according to the Ministry of Villages, Disadvantaged Regions, and Transmigration, approximately 15,000 villages. Of these, 1,406 are in West Java: 504 are classified as advanced, 464 as developing, 390 as disadvantaged, and 48 as very disadvantaged. This study used the CatBoost machine learning algorithm to classify village funds in West Java from 2018 to 2021. The model achieved an accuracy of 75%, a precision of 79%, a recall of 79%, and an F1 score of 79%, which the study presents as excellent performance. Records with missing values had to be removed from the analysis, so future studies should adopt a more sophisticated method for handling missing data. Hyperparameter tuning could also be used to improve the model's performance, and a wider range of metrics could give a more reliable assessment of the results. Overall, CatBoost could help the Indonesian Government classify village funds by village status, channel funds more accurately and efficiently, and monitor each village's situation year over year.
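The four metrics reported in the abstract can be computed from true and predicted class labels. The sketch below is illustrative only: the label lists are hypothetical and are not the study's data, the helper name `classification_report` is invented for this example, and macro averaging across the four village status classes is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import Counter

# The four village status classes named in the abstract.
CLASSES = ["advanced", "developing", "disadvantaged", "very disadvantaged"]


def classification_report(y_true, y_pred, classes):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) gives every class equal weight,
    regardless of how many villages fall into it.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = {}
    for c in classes:
        # A class with no predictions (or no true members) scores 0.0.
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class[c] = (prec, rec, f1)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(v[0] for v in per_class.values()) / n,
        "macro_recall": sum(v[1] for v in per_class.values()) / n,
        "macro_f1": sum(v[2] for v in per_class.values()) / n,
        "per_class": per_class,
    }


# Hypothetical labels for six villages (not taken from the study).
y_true = ["advanced", "advanced", "developing", "developing",
          "disadvantaged", "very disadvantaged"]
y_pred = ["advanced", "developing", "developing", "developing",
          "disadvantaged", "disadvantaged"]

report = classification_report(y_true, y_pred, CLASSES)
for name in ("accuracy", "macro_precision", "macro_recall", "macro_f1"):
    print(f"{name}: {report[name]:.3f}")
```

In this toy example the smallest class, "very disadvantaged", is never predicted correctly, which lowers the macro-averaged scores more than it lowers accuracy. That kind of gap is one reason to report several metrics rather than accuracy alone when class sizes are uneven, as they are for the village statuses listed above.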
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC-BY 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.