Effects of randomness in the development of machine learning models in neuroimaging studies of schizophrenia

Numerous studies have used machine
learning with neuroimaging data for identifying individuals with a
schizophrenia diagnosis. However, inconsistent results have limited the
ability of the psychiatric community to objectively judge and accept the
value of this approach. One factor that has contributed to the
inconsistency, but has long been ignored, is randomness in the practice
of machine learning. This randomness becomes apparent when the same machine
learning pipeline is executed multiple times on the same dataset yet yields
different results. In the current study, a dataset of anatomical MRI
scans from 158 patients with first-episode medication-naïve
schizophrenia and 166 matched controls was used to investigate the
effect of randomness on classifier performance estimates across different
levels of algorithm complexity and data-splitting ratios. The maximum
discriminatory accuracy that could be reached was 62.6 % ± 4.7 %
(43.5 %–79.3 %) obtained when using extra-trees classifiers without
feature normalization. Regions contributing to discrimination were
located in the bilateral temporal lobes and right frontal lobe. The results
show that randomness has a significant impact on the precision of model
performance estimates, especially when the test set is small.
Current neuroimaging feature engineering combined with machine learning
still falls short of being able to make diagnoses in the clinical
context, but has value in revealing patterns of regional brain
alteration associated with the illness. The current results indicate
that the effects of randomness on model performance should be reported and
considered when interpreting model utility, and that it is necessary to evaluate
models on large test sets to obtain valid estimates of model
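
The link between test-set size and the precision of an accuracy estimate can be illustrated with a small simulation. The sketch below is not the study's pipeline: it assumes a hypothetical classifier whose true accuracy equals the reported mean of 62.6 %, treats each test prediction as an independent draw, and uses illustrative test-set sizes (32, 65 and 162 correspond roughly to 10 %, 20 % and 50 % of the 324 participants; 1000 is a hypothetical large test set). It models only the sampling noise of the test set, not randomness in model training.

```python
import random
import statistics


def simulate_accuracy_spread(true_accuracy, test_size, n_repeats, seed=0):
    """Simulate repeated evaluations of a classifier with a fixed true accuracy.

    Each repeat draws `test_size` independent predictions, each correct with
    probability `true_accuracy`, and records the observed accuracy.
    Returns (mean, standard deviation, minimum, maximum) over all repeats.
    """
    rng = random.Random(seed)  # fixed seed so the simulation itself is repeatable
    accuracies = []
    for _ in range(n_repeats):
        correct = sum(rng.random() < true_accuracy for _ in range(test_size))
        accuracies.append(correct / test_size)
    return (
        statistics.mean(accuracies),
        statistics.stdev(accuracies),
        min(accuracies),
        max(accuracies),
    )


# Assumption: the reported mean accuracy is used as the "true" accuracy.
TRUE_ACCURACY = 0.626
# Assumption: illustrative test-set sizes, not the splits used in the study.
TEST_SIZES = (32, 65, 162, 1000)

results = {
    n: simulate_accuracy_spread(TRUE_ACCURACY, n, n_repeats=1000)
    for n in TEST_SIZES
}

for n, (mean, sd, lowest, highest) in results.items():
    print(
        f"test size {n:4d}: accuracy {mean:.3f} +/- {sd:.3f} "
        f"(range {lowest:.3f} to {highest:.3f})"
    )
```

Under these assumptions the spread of observed accuracies narrows as the test set grows, which is the pattern the abstract describes for small test sets.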
