Hi all, I am starting to put together materials for the Python/RDKit training course I'm giving just before the RDKit UGM next month.
I would like to structure part of it around the SQLite release of the ChEMBL data set. More specifically, I plan to include examples of machine learning with scikit-learn, using RDKit descriptors and values from ChEMBL 24 (and making sure to use the new schema).

Two problems. First, I'm not a computational chemist, and I don't know what would constitute a good example to use. "Good" in this case means one whose outlines are well known to likely students. Second, I don't have much experience with the ChEMBL data.

My thought is to make a logP model. The easiest would be to base it on atom types. For this option, can anyone suggest where I can find logP data in ChEMBL?

Another possibility is to use a pre-existing model, like the notebook George Papadatos did for Ligand-based Target Prediction at http://nbviewer.jupyter.org/gist/madgpap/10457778 .

Perhaps someone here could point me to other existing resources along similar lines?

Best regards,

Andrew
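P.S. To make the idea concrete, here is a rough sketch of the kind of SQLite access I have in mind for the course, using only Python's standard sqlite3 module. The table and column names (compound_structures.canonical_smiles, compound_properties.alogp, joined on molregno) are my assumptions about the ChEMBL schema and should be checked against the schema documentation. As far as I can tell, alogp is a calculated value rather than a measured one, so it may not be the right training target. The helper names and the toy rows are purely illustrative; the numbers are placeholders, not real data.

```python
import sqlite3

# Assumed ChEMBL table and column names; verify against the schema docs.
QUERY = """
SELECT cs.canonical_smiles, cp.alogp
  FROM compound_structures AS cs
  JOIN compound_properties AS cp ON cp.molregno = cs.molregno
 WHERE cp.alogp IS NOT NULL
 LIMIT ?
"""


def fetch_smiles_and_logp(conn, limit=1000):
    """Return a list of (smiles, logp) tuples from an open SQLite connection."""
    cursor = conn.execute(QUERY, (limit,))
    return [(smiles, float(logp)) for smiles, logp in cursor]


def make_toy_database():
    """Build a tiny in-memory stand-in so the sketch runs without ChEMBL.

    The values below are made-up placeholders, not ChEMBL records.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE compound_structures (
            molregno INTEGER PRIMARY KEY,
            canonical_smiles TEXT
        );
        CREATE TABLE compound_properties (
            molregno INTEGER PRIMARY KEY,
            alogp NUMERIC
        );
        INSERT INTO compound_structures VALUES (1, 'CCO');
        INSERT INTO compound_structures VALUES (2, 'c1ccccc1');
        INSERT INTO compound_structures VALUES (3, 'CC(=O)O');
        INSERT INTO compound_properties VALUES (1, 0.5);
        INSERT INTO compound_properties VALUES (2, 1.5);
        INSERT INTO compound_properties VALUES (3, NULL);
    """)
    return conn


if __name__ == "__main__":
    conn = make_toy_database()
    for smiles, logp in fetch_smiles_and_logp(conn):
        print(smiles, logp)
```

With the real download, the only change should be to replace make_toy_database() with sqlite3.connect() on the path to the ChEMBL 24 SQLite file. The resulting SMILES strings could then be passed to RDKit to compute descriptors for scikit-learn.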

