Determining crucial factors for the popularity of scientific articles

Published in ACTA PHYSICA POLONICA A, 2020

Recommended citation: R. Jankowski, J. Sienkiewicz, Acta Phys Pol A 138(1), 41-47 (2020), doi: 10.12693/APhysPolA.138.41 http://przyrbwn.icm.edu.pl/APP/PDF/138/app138z1p06.pdf

In this paper, we discuss the application of machine learning algorithms in bibliometrics. We explore multiple features of over 70 000 publications and find the popularity threshold that yields the best ML predictions in terms of the Matthews correlation coefficient (MCC). Using a variable importance plot, we also significantly reduce the number of input features for the ML models while keeping the predictions at the same level.
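As a rough illustration of the evaluation idea in the abstract, the sketch below computes the MCC for binary labels and shows how a popularity threshold could turn a count-based popularity measure into "popular" / "not popular" classes. It is a minimal sketch and not the paper's code: the names `label_by_threshold` and `best_threshold`, the use of citation-like counts as the popularity measure, and the `score_fn` callback (which would stand for training a model at a given threshold and returning its MCC) are all assumptions made for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def label_by_threshold(counts, threshold):
    """Hypothetical labeling: an article is 'popular' (1) if its count
    reaches the threshold, otherwise 'not popular' (0)."""
    return [1 if c >= threshold else 0 for c in counts]


def best_threshold(thresholds, score_fn):
    """Return the candidate threshold with the highest score.

    score_fn is an assumed callback: given a threshold, it would label the
    data, fit and evaluate a model, and return the resulting MCC.
    """
    return max(thresholds, key=score_fn)
```

With these helpers, a threshold scan amounts to calling `best_threshold(candidates, score_fn)`, where `score_fn` wraps whatever model and validation scheme is in use; the choice of model is deliberately left open here.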