Analysis of Diabetic Patients' Data for Clustering and Prescription Drug Based on Proposed Algorithm

Document Type : Research Paper

Authors

1 PhD student, Department of Information Technology Management, Islamic Azad University Science and Research Branch, Tehran, Iran

2 Department of Information Technology Management, Science and Research Branch, Islamic Azad University, Tehran, Iran

3 Department of Information Technology Management, Science and Research Branch, Islamic Azad University, Tehran, Iran

4 Associate Professor, Department of Industrial Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran

5 Department of Management, Tarbiat Modares University, Tehran, Iran

Abstract

Introduction: Diabetes is a metabolic disorder in which the body's ability to produce the hormone insulin is impaired. The main purpose of the present study is to discover hidden knowledge in the data of diabetic patients that can assist clinicians in clustering new patients and prescribing appropriate medication for each cluster.
Methods: In this paper, we use the MR-VDBSCAN algorithm, which is implemented on the MapReduce framework of Hadoop. The main idea of the research is to use local density to determine the density of each point. This strategy prevents clusters of different densities from being merged.
Results: The algorithm was tested and evaluated on the selected dataset, and the results show high accuracy and efficiency. The results were compared with those of k-means clustering: the MR-VDBSCAN algorithm runs faster than k-means and can detect clusters of different densities, which is its advantage over the compared algorithm. The results show that the MR-VDBSCAN algorithm can provide better performance than the other algorithms. In particular, the similarity of the proposed algorithm is 97% for the diabetes dataset.
Conclusion: The results show that the MR-VDBSCAN algorithm clusters better than the k-means algorithm and can place patients into subgroups that assist physicians in prescribing medication.
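
The local-density idea mentioned in the Methods can be illustrated with a minimal sketch. This is not the MR-VDBSCAN implementation and does not use Hadoop or MapReduce. It assumes, purely for illustration, that the local density of a point is summarized by the distance to its k-th nearest neighbor. The function name `kth_neighbor_distance` and the toy points are hypothetical.

```python
import math


def kth_neighbor_distance(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor.

    A small value means the point sits in a dense region; a large value
    means it sits in a sparse one. (Assumed density proxy, for illustration.)
    """
    radii = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


# Hypothetical toy data: one tight group and one loose group.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
points = dense + sparse

radii = kth_neighbor_distance(points, k=2)
dense_radius = max(radii[:len(dense)])    # largest radius in the tight group
sparse_radius = min(radii[len(dense):])   # smallest radius in the loose group

print("dense group radius:", dense_radius)
print("sparse group radius:", sparse_radius)
```

In this toy case the two groups have clearly different neighborhood radii. A single global radius small enough for the tight group would treat the loose group as noise, while one large enough for the loose group risks merging nearby dense clusters. A per-point radius is one way to avoid that trade-off.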

Keywords

