[Open Babel] using FastSearch class with Python

2010-11-23 Thread Floriane Montanari
Hi,
My request is very similar to this one in the Mail Archive.
I am working on a project completely similar to the one of Mikko Kasanen,
the difference being that we are using the Python API for our coding.
(Reminder: the user provides a smiles string or a file containing a molecule
and a Tanimoto threshold, and we would like to output the list of molecules
of our big database that are similar enough to the input).
I have noticed that the Pybel library doesn't support this kind of similarity
search, but that openbabel.py should. However, I cannot use the FindSimilar()
method because of the type of its arguments. So basically my question is:
Is there a way to use the FindSimilar() method from Python? I think SWIG
doesn't support multimaps for Python.

Thank you for your support,
Regards

Floriane Montanari
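
For illustration, the thresholded search described above can be sketched in
plain Python without any Open Babel calls. This is only a sketch of the
Tanimoto criterion, not the FastSearch implementation: fingerprints are
assumed to be given as sets of "on" bit positions, and the names tanimoto
and find_similar are hypothetical helpers introduced for this example.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints share nothing; treat their similarity as 0.
    return len(a & b) / union if union else 0.0


def find_similar(query_bits, database, threshold):
    """Return (name, score) pairs with score >= threshold, best first.

    `database` is assumed to map a molecule name to its set of on-bits.
    """
    hits = []
    for name, bits in database.items():
        score = tanimoto(query_bits, bits)
        if score >= threshold:
            hits.append((name, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


if __name__ == "__main__":
    db = {"mol_a": {1, 2, 3, 4}, "mol_b": {1, 2}, "mol_c": {9}}
    print(find_similar({1, 2, 3}, db, threshold=0.7))
```

A linear scan like this is fine for small collections; for a big database an
index (which is what FastSearch provides) avoids comparing against every
molecule.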
--
Increase Visibility of Your 3D Game App & Earn a Chance To Win $500!
Tap into the largest installed PC base & get more eyes on your game by
optimizing for Intel(R) Graphics Technology. Get started today with the
Intel(R) Software Partner Program. Five $500 cash prizes are up for grabs.
http://p.sf.net/sfu/intelisp-dev2dev___
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss


[Open Babel] Openbabel - RDKit interoperability... Valency problem when generating SMILES string

2010-11-23 Thread JP
Hi there @OpenBabel and Greg

I am using OpenBabel 2.3.0 and RDKit 201009 release and trying to get them
if not work together at least live peacefully :)

So I have a mol file coming from the pdb Astex Diverse Set (1hww.mol).  I
convert this to a smiles string using Openbabel

babel -xn 1hww.mol 1hww.smiles


(since I do not want to generate the name).  I get the attached smiles file
- [...@h]1(O)[...@h]2[n@H](CCC1)C[C@@H](O)[...@h]2o

When I try to load this using RDKit (i.e. AllChem.MolFromSmiles(smiles))  I
get

[19:04:57] Explicit valence for atom # 3 N, 4, is greater than permitted


If I copy the isomeric smiles (openeye)
c1...@h]([C@@H]2[C@@H]([C@@H](c...@]2c1)O)O)O
in the file instead of the openbabel generated smiles RDKit works fine...
This is probably a simple chemistry problem, but I am a CS guy so please
bear with me.

Now my question is: Who is the culprit? (select one)

a) No-one - The mol file is incorrect (but then again, why does Openbabel
read it without an error message?  Still I can read this directly through
RDKit)
b) Openbabel - it is generating an incorrect smiles string (and ideally this
is now fixed and available in 2.3.1 :) )
c) RDkit - The smiles is perfectly fine, please redirect your query
elsewhere
d) None of the above, the OS/hardware is playing tricks on you

I appreciate any help and any light you can shed.

Cheers
JP

PS These two frameworks are really great!! well done everyone!!




Re: [Open Babel] Openbabel - RDKit interoperability... Valency problem when generating SMILES string

2010-11-23 Thread Noel O'Boyle
The mol file has a nitrogen with 4 hydrogens, but the nitrogen does
not have a positive charge. This is messed up. Either the nitrogen has
only 3 hydrogens (which is probably correct) or it has 4 and a
positive charge.

The SMILES string from OpenBabel represents this MOL file accurately,
but OpenBabel should probably have given a warning when reading the
MOL file in the first place.

- Noel



Re: [Open Babel] Openbabel - RDKit interoperability... Valency problem when generating SMILES string

2010-11-23 Thread Noel O'Boyle
Sorry, I should have said not "hydrogens" but "bonds".
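
With "bonds" in place of "hydrogens", the rule at issue (a neutral nitrogen
normally forms three bonds, and four bonds imply a +1 charge) can be sketched
as a small check. This is a deliberately simplified illustration and is not
how Open Babel or the RDKit implement valence checking: the table
NEUTRAL_VALENCE and the function valence_ok are hypothetical, cover only N
and O, and assume the allowed valence shifts by the formal charge.

```python
# Usual number of bonds for the neutral atom (tiny, illustrative table).
NEUTRAL_VALENCE = {"N": 3, "O": 2}


def valence_ok(symbol, bond_orders, charge=0):
    """Return True if the summed bond orders fit the atom's formal charge.

    Simplifying assumption: for N and O the allowed valence is the
    neutral value plus the formal charge (e.g. N+ may form four bonds).
    """
    allowed = NEUTRAL_VALENCE[symbol] + charge
    return sum(bond_orders) <= allowed


if __name__ == "__main__":
    # Four single bonds on a neutral nitrogen: the case in this thread.
    print(valence_ok("N", [1, 1, 1, 1], charge=0))
    # The same four bonds with a +1 charge are acceptable.
    print(valence_ok("N", [1, 1, 1, 1], charge=1))
```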



Re: [Open Babel] Openbabel - RDKit interoperability... Valency problem when generating SMILES string

2010-11-23 Thread Greg Landrum
JP,

Noel's summary below is, in my opinion, dead on: The molfile has problems:
1) there's a 4 coordinate neutral nitrogen
2) it's missing the "M  END" line (maybe it's also missing the M  CHG
line that would fix the nitrogen?)

OpenBabel takes the bad mol file and, faithfully, generates a SMILES
that corresponds to it. The RDKit sees that valence violation and
complains.

-greg
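
The two molfile problems listed above can be looked for with a few lines of
Python. This is a rough sketch, not a molfile parser: it assumes a V2000-style
properties block in which charges appear on "M  CHG" lines as a count followed
by (atom index, charge) pairs, and the function name inspect_molfile is
hypothetical.

```python
def inspect_molfile(text):
    """Return (has_end_line, charges) for the text of a molfile.

    `charges` maps 1-based atom indices to formal charges taken from
    any "M  CHG" lines. Whitespace splitting is used for simplicity
    rather than strict fixed-width column parsing.
    """
    lines = text.splitlines()
    has_end = any(line.rstrip() == "M  END" for line in lines)
    charges = {}
    for line in lines:
        if line.startswith("M  CHG"):
            fields = line[6:].split()
            count = int(fields[0])
            pairs = fields[1:1 + 2 * count]
            for i in range(0, len(pairs), 2):
                charges[int(pairs[i])] = int(pairs[i + 1])
    return has_end, charges


if __name__ == "__main__":
    sample = "M  CHG  1   4   1\nM  END\n"
    print(inspect_molfile(sample))
```

A file for which this reports no "M  END" line and no charge on the
four-bonded nitrogen would show both of the problems described.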





Re: [Open Babel] Openbabel - RDKit interoperability... Valency problem when generating SMILES string

2010-11-23 Thread JP
Thanks Noel, Greg...

It's always difficult to decide on what to do: should something wrong be
parsed leniently or should it fail with an error?  It's a philosophical debate
as well as a technical one.

This argument comes up often enough with things like malformed XML - should
the parser fail with an error message or else go on reading the file on a
best effort basis (and possibly miss out on data because of, say, a broken
connection)?
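
The two policies can be put side by side in a toy parser. This is only an
illustration of the trade-off, with nothing specific to Open Babel, the RDKit
or XML: the function parse_ints and its strict flag are hypothetical, and the
input is assumed to be one integer per line.

```python
def parse_ints(lines, strict=True):
    """Parse one integer per line.

    strict=True  : raise ValueError at the first malformed line.
    strict=False : skip malformed lines and report their line numbers.
    """
    values, skipped = [], []
    for number, line in enumerate(lines, start=1):
        try:
            values.append(int(line.strip()))
        except ValueError:
            if strict:
                raise ValueError(f"line {number}: not an integer: {line!r}")
            skipped.append(number)
    return values, skipped


if __name__ == "__main__":
    data = ["1", "oops", "3"]
    print(parse_ints(data, strict=False))
```

In lenient mode the caller still learns which lines were dropped, which is
one way to avoid silently missing data.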

Anyways - thanks for your input.  I did what I had to do.  I used the
isomeric smiles representation found on the pdb.



-- 

Jean-Paul Ebejer
Early Stage Researcher

InhibOx Ltd
Pembroke House
36-37 Pembroke Street
Oxford
OX1 1BP
UK

(+44 / 0) 1865 262 034


