Alberto Guevara-Tirado, School of Human Medicine, Universidad Científica del Sur, Lima, Peru
Background: Supervised learning algorithms can help build efficient classification and prediction models for multiple sclerosis (MS) phenotypes. Objective: To identify and characterize the factors associated with the primary progressive and relapsing-remitting MS phenotypes using a decision tree-based machine learning model. Method: This was an analytical, cross-sectional study based on a secondary data source. The variables were phenotype, age, sex, glucocorticoid use, cigarette consumption, and disease-modifying therapy. The analysis used a decision tree grown with the chi-square automatic interaction detector (CHAID) method, together with binary logistic regression. Results: The tree (correct predictions: 87%) identified patients aged 51 to 70 years with a smoking history as the group associated with primary progressive MS (PPMS). For relapsing-remitting MS (RRMS), the group with the strongest association was women aged 18 to 50 years. When disease-modifying medications were included (correct predictions: 89.70%), the groups associated with PPMS were patients with a smoking history treated with teriflunomide, rituximab, glatiramer, and ocrelizumab at 51 to 70 years of age, and men aged 18 to 50 years treated with ocrelizumab and rituximab. For RRMS, the associated groups were women aged 18 to 50 years treated with ocrelizumab and rituximab, and patients aged 18 to 50 years treated with dimethyl fumarate, teriflunomide, interferon, glatiramer, fingolimod, natalizumab, cladribine, and alemtuzumab. Conclusions: Decision tree-based machine learning applied to easily accessible data can rapidly and efficiently classify the personal factors and pharmacological profiles associated with RRMS and PPMS. Likewise, smoking history is a predictor of PPMS. The decision tree could help neurologists and epidemiologists by providing additional information for clinical, therapeutic, and epidemiological decisions.
Keywords: Relapsing-remitting multiple sclerosis. Chronic progressive multiple sclerosis. Population characteristics. Supervised machine learning. Computer-assisted decision making.
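The split-selection idea behind the chi-square automatic interaction detector named in the Method can be illustrated with a minimal sketch. It is not the study's implementation: it handles only binary predictors against a binary phenotype (one degree of freedom), omits the category merging and Bonferroni adjustment of full CHAID, and runs on a few synthetic records invented purely for illustration. The names `chi_square_2x2`, `best_split`, and `TOY_ROWS` are hypothetical.

```python
import math
from collections import Counter


def chi_square_2x2(rows, predictor, target):
    """Pearson chi-square statistic and p-value (df = 1) for two binary columns."""
    counts = Counter((r[predictor], r[target]) for r in rows)
    p_levels = sorted({r[predictor] for r in rows})
    t_levels = sorted({r[target] for r in rows})
    if len(p_levels) != 2 or len(t_levels) != 2:
        raise ValueError("this sketch handles only 2x2 tables")
    n = len(rows)
    stat = 0.0
    for p in p_levels:
        row_total = sum(counts[(p, t)] for t in t_levels)
        for t in t_levels:
            col_total = sum(counts[(q, t)] for q in p_levels)
            expected = row_total * col_total / n
            stat += (counts[(p, t)] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def best_split(rows, predictors, target):
    """Return the predictor with the smallest p-value, plus all results."""
    results = {name: chi_square_2x2(rows, name, target) for name in predictors}
    best = min(results, key=lambda name: results[name][1])
    return best, results


# Synthetic records, invented for illustration only (NOT the study data).
TOY_ROWS = [
    {"smoker": "yes", "age_group": "51-70", "sex": "M", "phenotype": "PPMS"},
    {"smoker": "yes", "age_group": "51-70", "sex": "F", "phenotype": "PPMS"},
    {"smoker": "yes", "age_group": "18-50", "sex": "M", "phenotype": "PPMS"},
    {"smoker": "yes", "age_group": "51-70", "sex": "M", "phenotype": "PPMS"},
    {"smoker": "no", "age_group": "18-50", "sex": "F", "phenotype": "RRMS"},
    {"smoker": "no", "age_group": "18-50", "sex": "F", "phenotype": "RRMS"},
    {"smoker": "no", "age_group": "51-70", "sex": "M", "phenotype": "RRMS"},
    {"smoker": "no", "age_group": "18-50", "sex": "F", "phenotype": "RRMS"},
]

if __name__ == "__main__":
    best, results = best_split(TOY_ROWS, ["smoker", "age_group", "sex"], "phenotype")
    for name, (stat, p_value) in results.items():
        print(f"{name}: chi2={stat:.2f}, p={p_value:.4f}")
    print("first split on:", best)
```

A full tree would repeat this selection within each resulting subgroup until no predictor reaches the chosen significance threshold.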