Data Mining Applications to address Cardiovascular Health Issues

Article contributed by Mr. Anurag Bhatt, Assistant Professor, Faculty of Technology and Computer Applications

The healthcare industry has grown tremendously in the past few years, making it possible to deliver better medical facilities at lower cost and in less time. The need of the hour is for the industry to redefine its standards and take its analytical approach to new heights. Researchers are developing new data mining approaches to predict cardiovascular health issues at multiple levels. Predictive analytics differs from descriptive and prescriptive analytics: descriptive analytics summarizes what has already happened, prescriptive analytics recommends what action to take, and predictive analytics estimates what is likely to happen. It takes patients’ medical records, reports and similar data as input and analyzes them using machine learning algorithms and statistical methods. Considerable recent research applies data mining and predictive analytics tools to analyze cardiovascular health issues and to predict future outcomes effectively and efficiently. Examining this work and its future possibilities is therefore essential for understanding and applying more effective hybrid algorithms.

Knowledge extraction mainly involves pattern recognition and data extraction. Organizations today hold large volumes of data recording every activity or event that has occurred. This enormous amount of data can be processed to extract its actual meaning, reducing its size while yielding valuable knowledge. Medical informatics and clinical research is a relatively unexplored field that can yield valuable information for improving medical services. According to the WHO (World Health Organization) annual report, cardiovascular diseases (CVDs) are the number one cause of death globally, taking an estimated 17.9 million lives every year. Four out of five CVD deaths are due to heart attacks and strokes.

Fig.1. Data Mining, Predictive Analytics and Predictive Modeling relationship

Previously, data mining techniques were deployed through unsupervised learning to discover hidden patterns, i.e. without assuming anything about the basic nature and structure of the data. Heart diseases, including SCD (Sudden Cardiac Death), are among the most fatal diseases in India. Various data mining techniques and algorithms are being applied to CVD data obtained from patients’ health reports, and intelligent systems are being developed to predict cardiovascular health issues early, thereby minimizing the risk of SCA (Sudden Cardiac Arrest), SCD and other severe cardiovascular issues. Predictive analytics plays a salient role in forecasting outcomes and identifying crucial cardiac health patterns.

In the past, cardiovascular disease was considered a problem of old age. In recent times, however, cardiovascular diseases have become a leading cause of death among both young and old people. In today’s society, improper nutrition, mental stress and a modern lifestyle that includes regular intake of alcohol and tobacco are considered among the major causes of these diseases, even in young adults. Predicting the chances of occurrence of heart disease requires an adequate number of test cases and sufficient knowledge. Intelligent data mining algorithms are deployed over these test cases and medical datasets.

Fig.2. Data Analytics steps

Knowledge extraction is a clear approach to identifying the hidden patterns in cardiovascular disease data. Nowadays, there are multiple ways to build a decision-based model. Scientists and researchers are deploying genetic algorithms, associative classification techniques, the Naïve Bayes technique and others to get the desired results. The performance of the algorithms also needs to be evaluated so that the most efficient algorithm can be matched to a particular dataset. The researcher Abhisek Taneja has proposed applying the J48 pruned algorithm and the Naïve Bayes algorithm to data sets taken from the UCI Machine Learning Repository. In the proposed methodology, data analysis is done in two ways: using selected attributes and using all attributes. First, an attribute selection method is applied in the data preprocessing phase; the data are then run through the J48 and Naïve Bayes algorithms, once with the selected attributes and once with all attributes.
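
The comparison between all attributes and selected attributes can be illustrated with a minimal sketch. It is not the implementation from the cited work: it uses a small Gaussian Naïve Bayes classifier written from scratch, it does not reproduce J48, and the records are synthetic. The attribute names (age, resting blood pressure, an unrelated noise column), the 70/30 split and the manual choice of "selected" columns are all assumptions made only for illustration.

```python
import math
import random
from statistics import mean, pvariance


def make_synthetic_records(n=200, seed=0):
    """Generate hypothetical records: [age, resting_bp, noise] with a 0/1 label.

    These values are synthetic and are NOT drawn from the UCI data sets.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.random() < 0.5
        age = rng.gauss(60 if label else 45, 8)
        resting_bp = rng.gauss(150 if label else 125, 12)
        noise = rng.gauss(0, 1)  # attribute unrelated to the label
        rows.append(([age, resting_bp, noise], int(label)))
    return rows


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per attribute and class."""

    def fit(self, X, y):
        self.priors = {}
        self.stats = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            self.priors[c] = len(rows) / len(X)
            # Per-attribute (mean, variance); small constant avoids zero variance.
            self.stats[c] = [(mean(col), pvariance(col) + 1e-9)
                             for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def log_posterior(c):
            total = math.log(self.priors[c])
            for value, (mu, var) in zip(x, self.stats[c]):
                total += (-0.5 * math.log(2 * math.pi * var)
                          - (value - mu) ** 2 / (2 * var))
            return total
        return max(self.priors, key=log_posterior)

    def predict(self, X):
        return [self.predict_one(x) for x in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate(rows, columns, train_fraction=0.7):
    """Train on the first part of the data and score on the remainder."""
    X = [[x[i] for i in columns] for x, _ in rows]
    y = [label for _, label in rows]
    cut = int(len(rows) * train_fraction)
    model = GaussianNaiveBayes().fit(X[:cut], y[:cut])
    return accuracy(y[cut:], model.predict(X[cut:]))


records = make_synthetic_records()
acc_all = evaluate(records, columns=[0, 1, 2])    # all attributes
acc_selected = evaluate(records, columns=[0, 1])  # hand-picked attributes
print(f"all attributes:      {acc_all:.2f}")
print(f"selected attributes: {acc_selected:.2f}")
```

In practice the selected columns would come from an attribute selection method applied during preprocessing rather than being chosen by hand, and accuracy would usually be reported alongside measures such as sensitivity and specificity.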


  • Springer Nature Singapore Pte Ltd. 2018, M. Pant et al. (eds.), Soft Computing: Theories and Applications, Advances in Intelligent Systems and Computing 584,
  • Taneja, A.: Heart Disease Prediction System Using Data Mining Techniques, Oriental Journal of Computer Science and technology, Vol. 6, No. 4, pp. 457-466, December 2013.
  • Jabbar, M.A., Chandra, P., Deekshatulu, B.L.: Heart Disease Prediction System using Associative Classification and Genetic Algorithm, International Conference on Emerging Trends in Electrical, Electronics and Communication Technologies (ICECIT), 2012.
  • Patil, R.R.: Heart Disease Prediction system using Naïve Bayes and Jelinek-mercer smoothing, International Journal of Advance Research in Computer and Communication Engineering, ISSN: 2278-1021, Volume 3, Issue 5, pp. 6787-6792, May 2014.
