Classification of Mental Disorders Using Modified Balanced Random Forest and Feature Selection
Abstract
This study applies the Modified Balanced Random Forest (MBRF) algorithm and the Correlation-based Feature Selector (CFS, CfsSubsetEval) to mental disorder classification, using the "Mental Disorder Classification" dataset from Kaggle. It has three aims: improving classification accuracy, evaluating the effect of feature selection, and assessing how well MBRF handles data imbalance. To this end, it compares Random Forest (RF) with MBRF, and MBRF with and without CFS. The results indicate that MBRF outperforms RF by 8.33% in accuracy, 8.61% in precision, 8.33% in recall, and 9.08% in F1-Score. Adding CFS to MBRF leaves accuracy and recall unchanged, while MBRF without CFS achieves 0.23% higher precision and 0.81% higher F1-Score. In conclusion, MBRF is superior to standard RF in addressing data imbalance for mental disorder classification, whereas feature selection with CFS does not improve performance: the model maintains a better balance between precision and recall without it.
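As an illustration of the kind of criterion a correlation-based feature selector optimizes, the following minimal sketch computes the standard CFS merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. The function name `cfs_merit` and its list-based inputs are assumptions made for illustration; this is not the study's implementation, and the subset search itself is not shown.

```python
from math import sqrt


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset under the CFS heuristic (illustrative sketch).

    feature_class_corr: absolute correlation of each selected feature with the class.
    feature_feature_corr: absolute correlation of each unordered pair of selected features.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    mean_cf = sum(feature_class_corr) / k
    # A single feature has no pairs, so its redundancy term is zero.
    mean_ff = (
        sum(feature_feature_corr) / len(feature_feature_corr)
        if feature_feature_corr
        else 0.0
    )
    return k * mean_cf / sqrt(k + k * (k - 1) * mean_ff)


single = cfs_merit([0.6], [])
redundant_pair = cfs_merit([0.6, 0.6], [1.0])      # second feature duplicates the first
independent_pair = cfs_merit([0.6, 0.6], [0.0])    # second feature adds new information
```

Under this heuristic a fully redundant second feature leaves the merit unchanged, while an uncorrelated one raises it, which is why the selector favors subsets that are predictive of the class but not of each other.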