Dear Gonzalo,

On Tue, Jul 24, 2012 at 2:00 PM, Gonzalo Colmenarejo-Sanchez
<[email protected]> wrote:
>
> I’ve been doing speed comparisons of SMARTS matching calculations between
> Daylight (dt_match) and the latest release of the RDKit (SubstructMatch). A
> matrix of 4015 SMILES matched against 1390 SMARTS took 187 s in DL, while it
> took 1615 s in the RDKit program. Maybe this is an area of improvement of
> the RDKit.

Thanks for that information; things like this are very useful.
Can you share how you're doing the comparison or, even better, the
SMARTS you are using? The reason I ask is that this seems a lot slower
than I would expect, so I wonder if you are constructing the molecules
from SMILES and the queries from SMARTS before each substructure
matching call (this would be extremely slow) or if you build the
molecules and queries once. Two of the regular benchmarks I run with
the RDKit (http://code.google.com/p/rdkit/wiki/Benchmarking) involve
substructure searching (t7 and t8 in that table), and there 428K
matches take about 6 seconds on my Linux box. Another example
calculation where I search for 500 substructures in 11000 molecules
(about the same scale as your test) takes about 26 seconds.
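
To make the "build once" pattern concrete, here is a minimal sketch in
Python. It is only an illustration of what I mean, not a reconstruction
of your script or of the benchmark code: the helper names (parse_all,
count_matches, demo) and the example SMILES/SMARTS are made up for this
message. The only RDKit calls it assumes are Chem.MolFromSmiles,
Chem.MolFromSmarts, and Mol.HasSubstructMatch.

```python
def parse_all(strings, parser):
    """Parse every string exactly once, keeping (text, object) pairs.

    `parser` is any callable that returns None on failure, which is how
    Chem.MolFromSmiles and Chem.MolFromSmarts report unparsable input.
    """
    parsed = []
    for text in strings:
        obj = parser(text)
        if obj is not None:  # skip entries that could not be parsed
            parsed.append((text, obj))
    return parsed


def count_matches(mols, queries):
    """Count (molecule, query) pairs that match, reusing the parsed objects."""
    hits = 0
    for _, mol in mols:
        for _, query in queries:
            # No SMILES/SMARTS parsing happens inside this double loop.
            if mol.HasSubstructMatch(query):
                hits += 1
    return hits


def demo():
    """Hypothetical usage; requires an environment with RDKit installed."""
    from rdkit import Chem

    smiles = ["CCO", "c1ccccc1O", "CC(=O)O"]       # illustrative molecules
    smarts = ["[OX2H]", "c1ccccc1", "C(=O)[OH]"]   # illustrative queries

    mols = parse_all(smiles, Chem.MolFromSmiles)    # built once
    queries = parse_all(smarts, Chem.MolFromSmarts)  # built once
    print(count_matches(mols, queries))
```

The slow variant I am worried about would call Chem.MolFromSmiles and
Chem.MolFromSmarts inside the double loop, so each molecule gets parsed
once per query and each query once per molecule.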

Best,
-greg

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss