Breast cancer identification using a hybrid machine learning system
Authors |
---|
Lecturer: Student: |
Publication Date |
31 August 2025 |
Category |
Reputable International Journal [Q3] |
Publisher |
International Journal of Electrical and Computer Engineering (IJECE) |
City / Country |
Daerah Istimewa Yogyakarta / Indonesia |
Volume |
15 |
Pages |
3928-3937 |
ISSN |
2088-8708 |
E-ISSN |
2722-2578 |
DOI |
https://doi.org/10.11591/ijece.v15i4.pp3928-3937 |
URL |
https://ijece.iaescore.com/index.php/IJECE/index |
Abstract |
Breast cancer remains one of the most prevalent malignancies among women and is frequently diagnosed at an advanced stage. Early detection is critical to improving patient prognosis and survival rates. Messenger ribonucleic acid (mRNA) gene expression data, which captures the molecular alterations in cancer cells, offers a promising avenue for enhancing diagnostic accuracy. The objective of this study is to develop a machine learning-based model for breast cancer detection using mRNA gene expression profiles. To achieve this, we implemented a hybrid machine learning system (HMLS) that integrates classification algorithms with feature selection and extraction techniques. This approach enables the effective handling of heterogeneous and high-dimensional genomic data, such as mRNA expression datasets, while reducing dimensionality without sacrificing critical information. The classification algorithms applied in this study are support vector machine (SVM), random forest (RF), naïve Bayes (NB), k-nearest neighbors (KNN), extra trees classifier (ETC), and logistic regression (LR). Feature selection was conducted using analysis of variance (ANOVA), mutual information (MI), ETC, and LR, whereas principal component analysis (PCA) was employed for feature extraction. The performance of the proposed model was evaluated using standard metrics, including recall, F1-score, and accuracy. Experimental results demonstrate that the combination of the SVM classifier with MI feature selection outperformed the other configurations and conventional machine learning approaches, achieving a classification accuracy of 99.4%. |
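
The abstract names mutual information (MI) as the best-performing feature selector and recall, F1-score, and accuracy as the evaluation metrics. The following is a minimal, standard-library-only sketch of those two ideas, not the authors' implementation. The toy data, the binarized features, and the function names `mutual_information`, `select_top_k`, and `binary_metrics` are all assumptions made for illustration; the study itself works on real mRNA expression profiles and trains an SVM on the selected features, which is omitted here.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_x = Counter(feature)
    count_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


def select_top_k(columns, labels, k):
    """Return the indices of the k highest-MI features, plus all MI scores."""
    scores = [mutual_information(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, and F1-score for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Hypothetical toy data: 3 binarized "expression" features for 8 samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly informative about the label
    [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
    [1, 1, 1, 0, 0, 0, 0, 1],  # partially informative
]

# Keep the two features that share the most information with the label.
selected, scores = select_top_k(columns, labels, k=2)

# Stand-in for a classifier: treat feature 2 as the predicted label.
metrics = binary_metrics(labels, columns[2])
print(selected, metrics)
```

In a real pipeline the selected columns would be passed to a classifier such as an SVM, and the metrics would be computed on held-out samples rather than on the data used for selection.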