Chris,
Many thanks! This did the trick. =)
Wallace
On Sat, Jul 26, 2014 at 11:02 AM, Chris Morley <c.mor...@gaseq.co.uk> wrote:
> On 24/07/2014 17:06, Wallace Chan wrote:
> > Tim,
> >
> > Thanks for your reply. Yes, we have the canonical SMILES strings stored
> > as properties in our glass.sdf file. I tried to generate canonical SMILES
> > as the result, and they are different than ours. Thus, ours were
> > probably acquired using a different canonicalization.
>
> It is possible to recover your canonical SMILES from glass.sdf and add
> it to the title of the results file:
>
> obabel glass.sdf -ifs -O results.smi -sc1ccccc1 --append SMILESSTRING
>
> where SMILESSTRING is the name of the sdf property. You could also
> construct a canonical SMILES file:
>
> obabel glass.sdf -ifs -O results.smi -otxt -sc1ccccc1 --title "" --append "SMILESSTRING"
>
> For each matching molecule, the output format txt gives just the title,
> which --title "" removes; the SMILES is then added. Other properties or
> descriptors could be added, e.g. --append "SMILESSTRING inchi"
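
A minimal sketch of how the two commands above could be assembled from a script when the query or the property list varies. It only builds and prints the argument list and does not call Open Babel. The helper name build_append_command and its parameters are hypothetical; the flags are exactly the ones quoted above (-ifs, -O, -otxt, -s, --title, --append) and have not been checked against any particular Open Babel version.

```python
import shlex


def build_append_command(datafile, query, output, properties, drop_title=True):
    """Build the obabel argument list for a fast search that appends properties.

    datafile   -- the data file named in the command, e.g. "glass.sdf"
    query      -- SMILES substructure query, e.g. "c1ccccc1"
    output     -- results file, e.g. "results.smi"
    properties -- SDF property or descriptor names to append, e.g. ["SMILESSTRING"]
    drop_title -- if True, write titles only (-otxt) and blank them (--title "")
                  so each output line holds just the appended values
    """
    cmd = ["obabel", datafile, "-ifs", "-O", output]
    if drop_title:
        cmd.append("-otxt")
    cmd.append("-s" + query)
    if drop_title:
        cmd += ["--title", ""]
    # Several names go into one space-separated --append argument,
    # as in --append "SMILESSTRING inchi".
    cmd += ["--append", " ".join(properties)]
    return cmd


if __name__ == "__main__":
    example = build_append_command(
        "glass.sdf", "c1ccccc1", "results.smi", ["SMILESSTRING"]
    )
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(example))
```

Where obabel is installed, the returned list could be handed to subprocess.run(cmd, check=True). Passing a list rather than a single string keeps the empty --title value and the space-separated --append value intact without extra shell quoting.
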
>
> > This then leads to another question that has come to me. Does the
> > input for substructure or similarity searching have to be in SDF
> > format, or can it be another format, such as a list of InChI
> > identifiers? In other words, does the fast search index have to come
> > from an SDF file? Many thanks.
>
> The datafile (and the output query results) can be in any format,
> including inchi.
>
> Chris
> >
> > On Tue, Jul 22, 2014 at 7:44 PM, Tim Vandermeersch
> > <tim.vandermeer...@gmail.com> wrote:
> >
> > Hi,
> >
> > I assume you have canonical SMILES strings in glass.sdf stored as
> > titles or properties. Correct me if this is incorrect. If so, it
> > depends on what program was used to create these canonical SMILES
> > strings. If you used openbabel for this, you can convert the
> > molecules in result.smi to openbabel canonical SMILES (or write
> > canonical SMILES directly using the .can extension).
> >
> > In the case where another program was used to generate the canonical
> > SMILES, it would not be possible to use openbabel to generate the
> > same canonical SMILES starting from result.smi. If you have access
> > to the other program you could use this to convert result.smi to
> > these canonical SMILES and use these to search glass.sdf.
> >
> > The reason for this is that there is no universal SMILES
> > canonicalization algorithm. Different toolkits will result in
> > different canonical SMILES (which are canonical only when using the
> > same toolkit). InChI, on the other hand, has a single reference
> > implementation.
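
A small illustration of the point above, as a sketch rather than anything taken from Open Babel. The two SMILES strings are both valid spellings of benzene and stand in for what two different (hypothetical) toolkits might emit as "canonical"; the dictionary layout and function names are assumptions made for the example.

```python
# Two valid SMILES spellings of benzene, standing in for the "canonical"
# output of two different (hypothetical) toolkits.
TOOLKIT_A_SMILES = "c1ccccc1"
TOOLKIT_B_SMILES = "C1=CC=CC=C1"

# Standard InChI for benzene. InChI has a single reference implementation,
# so this key does not depend on which toolkit produced it.
BENZENE_INCHI = "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"

# Hypothetical database keyed by InChI; each record keeps the SMILES that
# was stored with the molecule (for example as an SDF property).
DATABASE = {BENZENE_INCHI: {"stored_smiles": TOOLKIT_A_SMILES}}


def same_by_string(smiles_a, smiles_b):
    """Naive comparison: only works if both strings come from one toolkit."""
    return smiles_a == smiles_b


def lookup_stored_smiles(database, inchi):
    """Return the stored SMILES for an InChI key, or None if it is absent."""
    record = database.get(inchi)
    return record["stored_smiles"] if record else None


if __name__ == "__main__":
    # Same molecule, different strings: plain string matching misses it.
    print(same_by_string(TOOLKIT_A_SMILES, TOOLKIT_B_SMILES))
    # Keying on InChI recovers the SMILES that was stored originally.
    print(lookup_stored_smiles(DATABASE, BENZENE_INCHI))
```

The design choice is simply to match records on an identifier that every toolkit computes the same way, and to carry the original SMILES along as stored data instead of regenerating it.
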
> >
> > Tim
> >
> >
> > On Wed, Jul 23, 2014 at 12:03 AM, Wallace Chan <walla...@umich.edu>
> > wrote:
> >
> > Dr. Hutchison,
> >
> > Yes, this helps. I do have another question about substructure
> > searching. We are building a database with roughly 270,000
> > molecules and want users to be able to do a substructure and
> > similarity search. I've read the following documentation,
> > http://openbabel.org/docs/dev/Fingerprints/fingerprints.html,
> > and it helps in understanding how this process works. However, I
> > want to ask whether or not the output file from the query can
> > contain the exact same SMILES strings that were generated from
> > the fast search index. Currently, the SMILES strings generated
> > from the query in the result.smi file are not the canonical
> > SMILES that I used to create the fast search index. For example,
> > if I were to look for a benzene substructure with the following
> > command,
> >
> > babel glass.fs -ifs -sc1ccccc1 result.smi
> >
> > would I be able to retrieve the SMILES string from glass.sdf,
> > which was used to create glass.fs? Many thanks for your patience.
> >
>
>
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
--
Wallace Chan
PhD Candidate
Zhang Lab
Department of Biological Chemistry
University of Michigan
walla...@umich.edu