I think the MCS (maximum common substructure) work I did last night is pretty 
neat. I got the list of structures in every node of the ChEBI structure 
ontology, found the MCS for each set, and generated a (very large) 
visualization of the entire result set.

It's at http://dalkescientific.com/fmcs_chebi.html.bz2 . The 7.5 MB file 
uncompresses to 166 MB, in part because I wasn't concerned about space when I 
wrote the code to integrate with the SMARTSViewer and Daylight image services.
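
Since the file grows from 7.5 MB to 166 MB, it can be decompressed as a 
stream rather than loaded into memory in one piece. The following is a 
minimal sketch using only the Python standard library. The function name 
decompress_bz2 and the local file names are illustrative assumptions, not 
part of the original work, and the command-line bunzip2 tool does the same job.

```python
import bz2
import shutil


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path into dst_path; return the bytes written.

    The data is copied chunk by chunk, so memory use stays near
    chunk_size no matter how large the uncompressed file is.
    """
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
        return dst.tell()
```

Assuming the download was saved locally as fmcs_chebi.html.bz2, calling 
decompress_bz2("fmcs_chebi.html.bz2", "fmcs_chebi.html") would write the 
HTML file next to it.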

Some background. Last evening I downloaded the most recent ChEBI chemical 
ontology. This organizes structures into one or more families, and organizes 
the families in turn into higher level families.

The result is a hierarchy. For example, CHEBI:33567 is catecholamine, which 
includes as children:

   CHEBI:37950 hexoprenaline
   CHEBI:50580 arbutamine
   CHEBI:6257 L-isoprenaline
   CHEBI:33568 adrenaline
   CHEBI:33569 noradrenaline
   CHEBI:4670 dobutamine
   CHEBI:18243 dopamine

Catecholamine is in turn a child of catechols, the last term in the following 
chain of parents:

 CHEBI:33822 organic hydroxy compound
   CHEBI:23824 diol
     CHEBI:22625 aromatic diol
       CHEBI:33570 benzenediols
         CHEBI:33566 catechols

You can see its full hierarchy at 
http://www.ebi.ac.uk/chebi/chebiOntology.do?treeView=true&chebiId=CHEBI:33567#graphView

My question is, how well does my MCS code work when given one of the 
intermediate nodes and the list of structures which make up the node?
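
To make "the list of structures which make up the node" concrete, here is a 
minimal sketch of how the members of an intermediate node could be gathered 
from the parent/child relations. It is not the code used for the results 
above. The IS_A table only contains the terms quoted in this message, the 
helper names build_children and leaf_descendants are hypothetical, and 
treating the leaf terms as "the structures" is a simplifying assumption (in 
ChEBI an intermediate term can carry a structure of its own).

```python
from collections import defaultdict

# (child, parent) pairs copied from the example in this message.
IS_A = [
    ("CHEBI:23824", "CHEBI:33822"),  # diol -> organic hydroxy compound
    ("CHEBI:22625", "CHEBI:23824"),  # aromatic diol -> diol
    ("CHEBI:33570", "CHEBI:22625"),  # benzenediols -> aromatic diol
    ("CHEBI:33566", "CHEBI:33570"),  # catechols -> benzenediols
    ("CHEBI:33567", "CHEBI:33566"),  # catecholamine -> catechols
    ("CHEBI:37950", "CHEBI:33567"),  # hexoprenaline -> catecholamine
    ("CHEBI:50580", "CHEBI:33567"),  # arbutamine -> catecholamine
    ("CHEBI:6257", "CHEBI:33567"),   # L-isoprenaline -> catecholamine
    ("CHEBI:33568", "CHEBI:33567"),  # adrenaline -> catecholamine
    ("CHEBI:33569", "CHEBI:33567"),  # noradrenaline -> catecholamine
    ("CHEBI:4670", "CHEBI:33567"),   # dobutamine -> catecholamine
    ("CHEBI:18243", "CHEBI:33567"),  # dopamine -> catecholamine
]


def build_children(pairs):
    """Map each parent term to the set of its direct children."""
    children = defaultdict(set)
    for child, parent in pairs:
        children[parent].add(child)
    return children


def leaf_descendants(node, children):
    """Return every term below `node` that has no children of its own."""
    leaves = set()
    seen = set()
    stack = [node]
    while stack:
        term = stack.pop()
        if term in seen:
            continue  # a term can be reachable through several parents
        seen.add(term)
        kids = children.get(term, ())
        if not kids and term != node:
            leaves.add(term)
        stack.extend(kids)
    return leaves


children = build_children(IS_A)
members = leaf_descendants("CHEBI:33570", children)  # benzenediols
print(len(members), "structures under benzenediols in this toy table")
```

The set returned for each intermediate node is what would then be handed to 
an MCS search. The `seen` set matters because a term can sit in more than 
one family, so the same structure may be reachable along several paths.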

Note: not all of these nodes have a simple substructure scaffold which my MCS 
code can detect. The paper at http://www.biomedcentral.com/1471-2105/13/3 
points out that certain patterns, like ester, require a SMARTS such as 
C(=[O,S])OC to match, and that some terms, like bicyclic molecules, cannot be 
captured via a simple pattern.

It took about 50 minutes to process the MCSes, once I fixed a bug in my code, 
and several hours to understand the EBI data format and to make a relatively 
interesting visualization.

I hope you enjoy exploring the result - let me know if there's anything 
interesting you find!


                                Andrew
                                [email protected]



_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
