Faculty of Information Technology

Breast cancer identification using a hybrid machine learning system

Authors
Lecturers:
  1. TONI ARIFIN
  2. IGN. WISETO PRASETYO AGUNG
  3. ERFIAN JUNIANTO
Students:
  1. Dari Dianata Agustin
  2. Ilham Rachmat Wibowo
Publication Date
31 August 2025
Category
Reputable International Journal [Q3]
Publisher
International Journal of Advances in Applied Sciences (IJAAS)
City / Country
Special Region of Yogyakarta / Indonesia
Volume
15
Page
3928
ISSN
2088-8708
E-ISSN
2722-2578
DOI
https://doi.org/10.11591/ijece.v15i4.pp3928-3937
URL
https://ijece.iaescore.com/index.php/IJECE/index
Abstract
Breast cancer remains one of the most prevalent malignancies among women and is frequently diagnosed at an advanced stage. Early detection is critical to improving patient prognosis and survival rates. Messenger ribonucleic acid (mRNA) gene expression data, which captures the molecular alterations in cancer cells, offers a promising avenue for enhancing diagnostic accuracy. The objective of this study is to develop a machine learning-based model for breast cancer detection using mRNA gene expression profiles. To achieve this, we implemented a hybrid machine learning system (HMLS) that integrates classification algorithms with feature selection and extraction techniques. This approach enables the effective handling of heterogeneous and high-dimensional genomic data, such as mRNA expression datasets, while simultaneously reducing dimensionality without sacrificing critical information. The classification algorithms applied in this study include support vector machine (SVM), random forest (RF), naïve Bayes (NB), k-nearest neighbors (KNN), extra trees classifier (ETC), and logistic regression (LR). Feature selection was conducted using analysis of variance (ANOVA), mutual information (MI), ETC, and LR, whereas principal component analysis (PCA) was employed for feature extraction. The performance of the proposed model was evaluated using standard metrics, including recall, F1-score, and accuracy. Experimental results demonstrate that the combination of the SVM classifier with MI feature selection outperformed other configurations and conventional machine learning approaches, achieving a classification accuracy of 99.4%.
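As an illustration of the best-performing configuration named in the abstract (MI feature selection followed by an SVM classifier), the following is a minimal sketch using scikit-learn. It is not the authors' implementation. The synthetic data, the number of selected features (k=50), the linear kernel, the standardization step, and the 75/25 split are all assumptions made for the example, and the scores it prints bear no relation to the 99.4% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an mRNA expression matrix (samples x genes).
# Assumption: the study's real dataset is not reproduced here.
X, y = make_classification(
    n_samples=200,
    n_features=500,
    n_informative=20,
    n_redundant=0,
    random_state=0,
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def mi_score(features, labels):
    """Mutual information between each feature and the class label."""
    return mutual_info_classif(features, labels, random_state=0)


# Selection lives inside the pipeline, so it is fitted on training data only
# and the held-out samples never influence which features are kept.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=mi_score, k=50)),  # k is an assumption
    ("classify", SVC(kernel="linear")),                  # kernel is an assumption
])
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

# The three metrics named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
n_selected = int(pipeline.named_steps["select"].get_support().sum())
print(f"selected features: {n_selected}")
print(metrics)
```

Swapping the "select" step for another scorer (for example `f_classif` for ANOVA) or the "classify" step for another estimator gives the other selector and classifier pairings the abstract compares.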