Using Machine Learning to Predict Heart Disease

Researchers recently developed a machine learning-based heart disease prediction model (ML-HDPM) that combines several established classification techniques with multiple data sources, and they reported their findings in Scientific Reports.

Context

Healthcare practitioners must assess and treat heart disease, a global health issue, using modern imaging methods, diagnostic procedures, and medical examinations. Encouraging heart-healthy behaviors and early detection can reduce the incidence of cardiovascular disease and improve overall health.

Current approaches such as machine learning, deep learning, and sensor-based data collection yield encouraging results, but they have drawbacks, including overfitting and inconsistent diagnostic accuracy.

To improve the diagnosis and prognosis of cardiac disease, the proposed methods use modern technology and feature selection processes.

About the study

The ML-HDPM model was developed in the current study to accurately predict heart disease.

The researchers drew cardiovascular data from databases from Hungary, Switzerland, Long Beach, and Cleveland. The clinical data were pre-processed, followed by feature selection, feature extraction, cluster-based oversampling, and classification.

To obtain the desired feature set, they computed feature importance scores, removed the lowest-scoring features, and fitted the model on the remaining feature set using the training data.
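The elimination loop described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the article does not say which model or importance score the study used, so this sketch fits a small logistic regression by gradient descent and treats the absolute weight of each feature as its importance. The synthetic data and the names `fit_logistic` and `recursive_feature_elimination` are hypothetical.

```python
import math
import random


def fit_logistic(X, y, cols, epochs=300, lr=0.1):
    """Fit logistic regression on the chosen columns by batch gradient descent."""
    w = [0.0] * len(cols)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(cols)
        grad_b = 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * row[c] for wj, c in zip(w, cols))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted prob. minus label
            grad_b += err
            for j, c in enumerate(cols):
                grad_w[j] += err * row[c]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def recursive_feature_elimination(X, y, n_keep):
    """Refit, drop the feature with the smallest |weight|, repeat."""
    cols = list(range(len(X[0])))
    while len(cols) > n_keep:
        w = fit_logistic(X, y, cols)
        weakest = min(range(len(cols)), key=lambda j: abs(w[j]))
        del cols[weakest]
    return cols


# Synthetic data (assumption): columns 0 and 1 carry signal, 2 and 3 are noise.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(200)]
X = [[2.0 * label + rng.gauss(0, 1),
      -2.0 * label + rng.gauss(0, 1),
      rng.gauss(0, 1),
      rng.gauss(0, 1)] for label in y]

selected = recursive_feature_elimination(X, y, n_keep=2)
print("kept feature indices:", selected)
```

Comparing raw weights only makes sense when features are on similar scales, which is roughly true for this toy data; real clinical features would normally be scaled first.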

The genetic algorithm (GA) comprised population initialization, selection, crossover, and mutation, repeated until the termination criterion was met.
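As a rough sketch of how these GA stages fit together for feature selection, the following code evolves bit masks in which a 1 means "keep this feature". The details are assumptions rather than the study's settings: the termination criterion is a fixed number of generations, selection is a binary tournament, crossover is single-point, and the fitness function `toy_fitness` is a hypothetical stand-in for a real model score.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=30,
           crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Evolve bit masks; returns the best mask and the best fitness per generation."""
    rng = random.Random(seed)
    # Population initialization: random masks (requires n_bits >= 2).
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):  # termination: fixed generation budget
        best = max(pop, key=fitness)
        history.append(fitness(best))
        next_pop = [best[:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Selection: binary tournament for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: single cut point.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical fitness: agreement with a made-up "ideal" feature mask.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_fitness(mask):
    return sum(1 for m, t in zip(mask, TARGET) if m == t)


best_mask, history = run_ga(toy_fitness, n_bits=len(TARGET))
print(best_mask, history[-1])
```

In practice the fitness function would train and score a classifier on the masked features, which is far more expensive than this toy score.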

To balance the data, the researchers undersampled the majority-class samples, clustered the minority-class samples, and applied the synthetic minority over-sampling technique (SMOTE), then combined the results into the training set used to produce the model output.
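The sketch below shows the general idea of combining random undersampling with SMOTE-style interpolation. It is a simplified illustration under stated assumptions: the clustering step is omitted for brevity, the data are synthetic, and the functions `smote` and `balance` are hypothetical helpers rather than the study's code or any library's API.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points between minority samples and their neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of base, excluding base itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(majority, minority, target, seed=0):
    """Undersample the majority and oversample the minority to `target` each.

    Assumes len(minority) <= target <= len(majority) and len(minority) >= 2.
    """
    rng = random.Random(seed)
    kept_majority = rng.sample(majority, target)
    new_minority = smote(minority, target - len(minority), rng=rng)
    return kept_majority, minority + new_minority


# Synthetic two-feature data (assumption): 90 majority and 10 minority samples.
data_rng = random.Random(1)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
minority = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]

maj_bal, min_bal = balance(majority, minority, target=40)
print(len(maj_bal), len(min_bal))
```

Because each synthetic point lies on a segment between two real minority samples, oversampling adds variety without leaving the region the minority class already occupies.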

The model uses the recursive feature elimination method (RFEM) and the GA to select important features, which increases its robustness. The under-sampling clustering oversampling method (USCOM) corrects data imbalances.

Multiple-layer deep convolutional neural networks (MLDCNN) and the adaptive elephant herd optimization method (AEHOM) handle the classification task.

The model classifiers were naïve Bayes (NB), decision tree (DT), random forest (RF), linear discriminant analysis (LDA), principal component analysis (PCA), and support vector machines (SVM).

The model combines an improved weighted random forest method with guided infinite feature selection. Pre-processing with ML-HDPM ensures both data quality and model effectiveness. Predictive modeling properties are uncovered through extensive feature selection.

SMOTE corrects for class imbalance, while a scaling technique keeps the influence of each feature consistent. The genetic algorithm, drawing on the principles of natural selection, generates several candidate solutions in a single generation.
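One common way to give features a comparable influence is z-score standardization, sketched below. This is an assumption: the article does not say which scaling method the study used, and the sample values (age and cholesterol columns) are made up for illustration.

```python
import statistics


def standardize(rows):
    """Rescale each column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    # A constant column has zero spread, so map it to 0.0 instead of dividing.
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)]
            for row in rows]


# Hypothetical records: [age in years, cholesterol in mg/dL].
records = [[63, 233], [37, 250], [41, 204], [56, 236]]
scaled = standardize(records)
for row in scaled:
    print([round(v, 3) for v in row])
```

Without a step like this, a feature measured in large units, such as cholesterol, can dominate distance-based steps like nearest-neighbour search in SMOTE.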

The technique was evaluated in simulated tests and compared with existing models. The training, validation, and testing datasets made up 10%, 10%, and 80% of the data, respectively.

Results

The analysis shows that ML-HDPM performed well across a broad range of key evaluation criteria. On the training data, the model predicted cardiovascular disease with 96% accuracy and 95% precision.

The system's sensitivity (recall) was 96%, and its F-score of 92% indicated well-balanced performance. ML-HDPM's specificity was also notable, at 90%.
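For readers unfamiliar with these metrics, the sketch below shows how they are derived from the four cells of a binary confusion matrix. The counts are hypothetical and are not the study's data; the function name `classification_metrics` is likewise an illustrative choice.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predicted positives that are real
    recall = tp / (tp + fn)     # sensitivity, also the true-positive rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),  # true-negative rate
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),          # false-positive rate = 1 - specificity
    }


# Hypothetical counts for 100 patients (not taken from the study).
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The F-score is the harmonic mean of precision and recall, so it is high only when both are high, and the false-positive rate is always one minus the specificity.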

The results from ML-HDPM are reliable and precise. The model incorporates several complex techniques, including deep learning, data balancing, feature selection, and adaptive elephant herd optimization (AEHOM). By combining them, it can accurately predict the development of cardiac disease, supporting better clinical decisions and patient outcomes.

ML-HDPM outperformed the other algorithms in both training (95%) and testing (88%). Its success stems from the combination of machine learning, data imbalance correction, and sophisticated feature extraction.

Algorithms for feature selection make it possible to identify important characteristics linked to cardiovascular health, which in turn helps identify subtle trends that may indicate cardiovascular disease.

Effective data balancing ensures that the model is trained on representative datasets, while deep learning through the MLDCNN approach and AEHOM optimization boost model accuracy.

Compared with other methods, the deep learning model ML-HDPM offers improved machine learning components and feature selection, and it achieved lower false-positive rates (FPR) in testing (15%) and training (8.20%).

The model's high true-positive rates (TPR) in the testing (91%) and training (96%) datasets resulted from advances in deep learning, feature identification, and data balancing, which improve its ability to detect true positives.

Conclusion

To improve the prediction of cardiovascular disease, the study presents a novel ML-HDPM approach that combines feature selection, data balancing, and machine learning.

The model's balanced F-scores for precision and recall, high accuracy and precision, and low false-positive rates on the training and testing datasets demonstrate its promise for cardiovascular diagnostic applications.

The results show that the ML-HDPM model can improve the standard of treatment by enhancing the accuracy and speed of cardiovascular disease identification.

More research is needed to improve model optimization and data quality and to explore the model's use by healthcare practitioners in practical settings.
