Picture map of Europe with pins indicating European capital cities

Open Access research with a European policy impact...

The Strathprints institutional repository is a digital archive of University of Strathclyde's Open Access research outputs. Strathprints provides access to thousands of Open Access research papers by Strathclyde researchers, including by researchers from the European Policies Research Centre (EPRC).

EPRC is a leading institute in Europe for comparative research on public policy, with a particular focus on regional development policies. Spanning 30 European countries, EPRC research programmes have a strong emphasis on applied research and knowledge exchange, including the provision of policy advice to EU institutions and national and sub-national government authorities throughout Europe.

Explore research outputs by the European Policies Research Centre...

Chemoinformatics-based classification of prohibited substances employed for doping in sport

Cannon, E. O. and Bender, A. and Palmer, D. S. and Mitchell, J. B. (2006) Chemoinformatics-based classification of prohibited substances employed for doping in sport. Journal of Chemical Information and Modeling, 46 (6). pp. 2369-2380.

Full text not available in this repository. Request a copy from the Strathclyde author


Representative molecules from 10 classes of prohibited substances were taken from the World Anti-Doping Agency (WADA) list, augmented by molecules from corresponding activity classes found in the MDDR database. Together with some explicitly allowed compounds, these formed a set of 5245 molecules. Five types of fingerprints were calculated for these substances. The random forest classification method was used to predict membership of each prohibited class on the basis of each type of fingerprint, using 5-fold cross-validation. We also used a k-nearest neighbors (kNN) approach, which worked well for the smallest values of k. The most successful classifiers are based on Unity 2D fingerprints and give very similar Matthews correlation coefficients of 0.836 (kNN) and 0.829 (random forest). The kNN classifiers tend to give a higher recall of positives at the expense of lower precision. A naïve Bayesian classifier, however, lies much further toward the extreme of high recall and low precision. Our results suggest that it will be possible to produce a reliable and quantitative assignment of membership or otherwise of each class of prohibited substances. This should aid the fight against the use of bioactive novel compounds as doping agents, while also protecting athletes against unjust disqualification.