Feature selection and predicting chemotherapy-induced ulcerative mucositis using machine learning methods

Int J Med Inform. 2021 Oct:154:104563. doi: 10.1016/j.ijmedinf.2021.104563. Epub 2021 Aug 27.

Abstract

Objective: Ulcerative mucositis (UM) is a devastating complication of most cancer therapies, and its risk factors are poorly recognized. Because risk prediction is vital for managing adverse events, we used machine learning (ML) approaches to predict chemotherapy-induced UM.

Methods: We used the 2017 National Inpatient Sample database to identify discharges with antineoplastic chemotherapy-induced UM among patients who received chemotherapy as part of their cancer treatment. We used forward selection and backward elimination for feature selection; lasso and the Gradient Boosting Method were used to build our linear and non-linear models, respectively.
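As an illustration of how a lasso-type model discards uninformative predictors, the following is a minimal sketch and not the authors' code. It assumes an L1-penalized logistic regression (the abstract does not state the exact formulation), fitted by proximal gradient descent on a small synthetic dataset. The function names, the penalty values, and the -1/+1 feature coding are all hypothetical choices made for this example.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(rows, labels, penalty, step=0.1, iterations=2000):
    """Fit logistic regression with an L1 penalty on the weights (bias unpenalized).

    Each iteration takes a gradient step on the mean log-loss, then applies
    soft-thresholding to the weights (proximal gradient descent, also called ISTA).
    """
    n_rows = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            grad_b += error
            for j, v in enumerate(x):
                grad_w[j] += error * v
        bias -= step * grad_b / n_rows
        weights = [
            soft_threshold(w - step * g / n_rows, step * penalty)
            for w, g in zip(weights, grad_w)
        ]
    return bias, weights


# Synthetic data (an assumption for illustration, not study data).
# Feature 0 is informative: the outcome occurs in 3 of 4 rows when it is +1
# and in 1 of 4 rows when it is -1. Feature 1 is pure noise by construction.
toy_rows, toy_labels = [], []
for noise in (-1.0, 1.0):
    for signal in (-1.0, 1.0):
        outcomes = [1, 1, 1, 0] if signal > 0 else [1, 0, 0, 0]
        for outcome in outcomes:
            toy_rows.append([signal, noise])
            toy_labels.append(outcome)

_, mild_weights = fit_l1_logistic(toy_rows, toy_labels, penalty=0.05)
_, strong_weights = fit_l1_logistic(toy_rows, toy_labels, penalty=0.30)
print("mild penalty:", mild_weights)
print("strong penalty:", strong_weights)
```

With the mild penalty, the noise feature's coefficient stays at exactly zero while the informative feature keeps a shrunken, non-zero coefficient; with the strong penalty both coefficients are zero. In practice the penalty strength is usually chosen by cross-validation.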

Results: In 2017, there were 253 (unweighted) chemotherapy-induced UM patient discharges among 21,626 (unweighted) adult patients who received antineoplastic chemotherapy as part of their cancer treatment. Our linear model, lasso, achieved an AUC (C-statistic) of 0.75 in both the training and test datasets; the Gradient Boosting Method (GBM) model achieved an AUC of 0.76 in the training dataset and 0.79 in the test dataset. Feature selection by stepwise forward selection and backward elimination identified the following variables of importance: antineoplastic chemotherapy-induced pancytopenia, agranulocytosis due to cancer chemotherapy, fluid and electrolyte imbalance, age, anemia due to chemotherapy, median household income, and depression. The most important variables derived from GBM, in descending order of importance, were antineoplastic chemotherapy-induced pancytopenia > co-morbidity score > agranulocytosis due to cancer chemotherapy > age > fluid and electrolyte imbalance. Further, when the analysis was restricted to females only, the ML models performed better than in the unstratified analysis.
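The AUC (C-statistic) reported above can be read as the probability that a randomly chosen case with the outcome receives a higher predicted risk than a randomly chosen case without it. The following is a minimal sketch of that pairwise definition, with ties counted as one half. The function name and the toy labels and scores are hypothetical and are not study data.

```python
def c_statistic(labels, scores):
    """Return the AUC as the share of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive case has the higher score,
    0.5 when the two scores are tied, and 0 otherwise.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                concordant += 1.0
            elif pos_score == neg_score:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Toy example (illustrative values only): 1 = outcome present, 0 = absent.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.50]
toy_auc = c_statistic(toy_labels, toy_scores)
print(f"C-statistic: {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a small illustration; for a dataset the size of the one described here, a rank-based computation would be the more practical choice.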

Conclusion: Our study showed that ML methods performed well in predicting chemotherapy-induced UM. Predictors identified through the ML approach matched the clinically meaningful and previously discussed predictors of chemotherapy-induced UM.

Keywords: Chemotherapy; Gradient Boosting Method; Lasso; Machine Learning; Oral Mucositis; Prediction; Ulcerative Mucositis.

MeSH terms

  • Adult
  • Female
  • Humans
  • Machine Learning
  • Mucositis*
  • Risk Factors