TY - JOUR
T1 - Knowledge Discovery of Patients Reviews on Breast Cancer Drugs: Segmentation of Side Effects Using Machine Learning Techniques
AU - Nilashi, Mehrbakhsh
AU - Ahmadi, Hossein
AU - Ali Abumalloh, Rabab
AU - Alrizq, Mesfer
AU - Alghamdi, Abdullah
AU - Alyami, Sultan
PY - 2024/9/27
Y1 - 2024/9/27
N2 - Breast cancer is the most frequently diagnosed life-threatening cancer among women worldwide. Understanding patients' drug experiences is essential to improving treatment strategies and outcomes. In this research, we conduct knowledge discovery on breast cancer drugs using patients' reviews. A new machine learning approach is developed by employing clustering, text mining and regression techniques. We first use the Latent Dirichlet Allocation (LDA) technique to discover the main aspects of patients' experiences from their reviews of breast cancer drugs. We also use the Expectation-Maximization (EM) algorithm to segment the data based on patients' overall satisfaction. We then use the Forward Entry Regression technique to find the relationship between aspects of patients' experiences and the drug's effectiveness in each segment. The analysis of textual reviews of breast cancer drugs found 8 main side effects: Musculoskeletal Effects, Menopausal Effects, Dermatological Effects, Metabolic Effects, Gastrointestinal Effects, Neurological and Cognitive Effects, Respiratory Effects and Cardiovascular Effects. The results are provided and discussed. The findings of this study are expected to offer valuable insights and practical guidance for prospective patients, aiding them in making informed decisions regarding breast cancer drug consumption.
AB - Breast cancer is the most frequently diagnosed life-threatening cancer among women worldwide. Understanding patients' drug experiences is essential to improving treatment strategies and outcomes. In this research, we conduct knowledge discovery on breast cancer drugs using patients' reviews. A new machine learning approach is developed by employing clustering, text mining and regression techniques. We first use the Latent Dirichlet Allocation (LDA) technique to discover the main aspects of patients' experiences from their reviews of breast cancer drugs. We also use the Expectation-Maximization (EM) algorithm to segment the data based on patients' overall satisfaction. We then use the Forward Entry Regression technique to find the relationship between aspects of patients' experiences and the drug's effectiveness in each segment. The analysis of textual reviews of breast cancer drugs found 8 main side effects: Musculoskeletal Effects, Menopausal Effects, Dermatological Effects, Metabolic Effects, Gastrointestinal Effects, Neurological and Cognitive Effects, Respiratory Effects and Cardiovascular Effects. The results are provided and discussed. The findings of this study are expected to offer valuable insights and practical guidance for prospective patients, aiding them in making informed decisions regarding breast cancer drug consumption.
U2 - 10.1016/j.heliyon.2024.e38563
DO - 10.1016/j.heliyon.2024.e38563
M3 - Article
SN - 2405-8440
JO - Heliyon
JF - Heliyon
M1 - e38563
ER -