Determining crucial factors for the popularity of scientific articles
Published in ACTA PHYSICA POLONICA A, 2020
Recommended citation: R. Jankowski, J. Sienkiewicz, Acta Phys Pol A 138(1), 41-47 (2020), doi: 10.12693/APhysPolA.138.41 http://przyrbwn.icm.edu.pl/APP/PDF/138/app138z1p06.pdf
In this paper we discuss the application of machine learning algorithms in bibliometrics. We explore multiple features of over 70,000 publications and find the popularity threshold that yields the best ML prediction, as measured by the Matthews correlation coefficient (MCC). Using a variable importance plot, we also significantly reduce the number of input features for the ML models while keeping the predictions at the same level.
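
As a rough illustration of the metric mentioned above, the following minimal sketch computes the MCC for binary labels and shows how a popularity threshold could turn a numeric popularity measure into "popular" / "not popular" classes. The function names `mcc` and `binarize`, the use of a citation-like count as the popularity measure, and the `>=` convention are assumptions made for this example; they are not taken from the paper.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def binarize(popularity, threshold):
    """Label an article 1 ("popular") if its popularity measure reaches
    the threshold, else 0. Hypothetical helper; the >= rule is assumed."""
    return [1 if value >= threshold else 0 for value in popularity]
```

Scanning a range of thresholds, retraining a classifier for each, and keeping the threshold with the highest MCC would be one way to search for the best threshold; the procedure actually used is described in the paper itself.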