Hi Andrew,

Since you don't have in-house access to the database, you may want to try
the web services for simple queries:
https://www.ebi.ac.uk/chembldb/index.php/ws

Regarding the names, you're right: the structures in the SD file have no
explicit name, and the ChEMBL ID is included only as a property. This is how
it has been historically, but I'll pass on your request.
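
In the meantime, if you would rather fix the title lines before the file
reaches a toolkit, something along these lines might do. It is only a rough
sketch in plain Python: the function name title_records is made up for
illustration, the chembl_id tag is taken from your GetProp call, and it
assumes each record ends with a "$$$$" line and that the value sits on the
line right after its "> <chembl_id>" header.

```python
def title_records(sd_text, tag="chembl_id"):
    """Copy an SD data item onto each record's blank title line.

    Hypothetical helper: assumes "$$$$"-terminated records and a
    single-line value directly below the "> <tag>" header.
    """
    header = "<%s>" % tag
    out, record = [], []
    for line in sd_text.splitlines():
        record.append(line)
        if line.strip() == "$$$$":
            # End of record: look for the data header and use its value.
            for i, text in enumerate(record[:-1]):
                if text.startswith(">") and header in text:
                    # Only fill in the title when it is blank.
                    if not record[0].strip():
                        record[0] = record[i + 1].strip()
                    break
            out.extend(record)
            record = []
    out.extend(record)  # keep any trailing partial record unchanged
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    sample = (
        "\n  sketch\n\n"
        "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
        "M  END\n"
        "> <chembl_id>\nCHEMBL1\n\n"
        "$$$$\n"
    )
    print(title_records(sample))
```

Records that already have a title are left alone, so the dozen or so named
entries would keep their current first line.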

Regards,

George
EMBL-EBI

On 2 May 2012 12:34, Andrew Dalke <[email protected]> wrote:

> Hi George,
>
> > This is probably not going to solve the problem at hand but it may be
> > useful to you or others in the future:
> > ChEMBLdb maintains a molecular hierarchy table where you can retrieve
> > the parent (=desalted - using Pipeline Pilot) structures for each molecular
> > entity.
> > You may try something like this:
> >
> > select distinct cs.molregno, cs.molfile, cs.canonical_smiles
> > from compound_structures cs, molecule_hierarchy mh
> > where cs.molregno = mh.parent_molregno
>
> I confess pure ignorance here. While I've worked with databases, it's far
> from the list of things I know well. Reading the ERD is not simple for me,
> I don't have MySQL or Oracle installed on my machines, and I don't know how
> to browse through the schema and tables the way I've seen people who are
> more database-proficient than I am do. So while I have an idea of what you
> are talking about, it's not something I can easily put into place.
>
> But as you say, it's not the problem, because RDKit's exception is raised
> even with the original, unprocessed (not desalted) record.
>
>
> Since you're here -- how come ChEMBL doesn't put an identifier on the
> first line of the SD record? Nearly all of them are blank; the exceptions
> are a dozen with mostly useless titles like:
>
> Acetic acid 6-(1-phenyl-ethyl)-6-aza-bicyclo[3.2.1]oct-3-yl ester
> 4-(4-Fluoro-phenyl)-2-methylsulfanyl-thiophene-3-carbonitrile
> 6-amino-9-(5-{[(1,2,3,3-tetrahydroxy-1,2,3-trioxidotriphosphanyl)oxy]methyl}tetr
> 2-Methyl-2,3-dihydro-benzofuran-7-carboxylic acid 8-methyl-8-aza-bicyclo[3.2.1]o
> (S)-N-((S)-1,6-diamino-1-oxohexan-2-yl)-1-((S)-5-guanidino-2-((2S,3S)-2-((S)-5-g
> Acetic acid 6-(1-phenyl-ethyl)-6-aza-bicyclo[3.2.1]oct-3-yl ester
>
>
> I end up doing a mol.SetProp("_Name", mol.GetProp("chembl_id")) so that my
> output SMILES have an identifier tied to them, and that seems like a
> needless extra step.
>
>
>                                Andrew
>                                [email protected]
>
>
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>