Hi Paul,
I haven't used it on PubChem or anything of that size. However, based on
validation work with 500 queries against 100K DB compounds (= max 50
million comparisons), I guess you are looking at something like
~10-30 seconds on 50 million compounds, depending on the size of the
substructure you are searching with. This is using the cartridge,
obviously. These numbers are based on a 2.5 GHz Core2Duo inside a
MacBook Pro.
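To make the arithmetic behind that estimate explicit, here is a minimal sketch. It only reproduces the comparison count quoted above; the variable names are mine, and it does not model actual timings.

```python
# Workload behind the estimate: in the worst case every query is
# compared against every compound in the validation database.
n_queries = 500            # validation queries (from the message)
n_db_compounds = 100_000   # compounds in the validation DB (from the message)

max_comparisons = n_queries * n_db_compounds
print(f"{max_comparisons:,} comparisons at most")
```

That is the same number of comparisons as a single query run against 50 million compounds, which is where the extrapolation comes from.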
There is also some useful information on these things at
http://code.google.com/p/rdkit/wiki/DatabaseCreation2
If you look there, you will see search timings for the eMolecules DB
(size ~5 million compounds). As you can see, search times depend heavily
on the query. The method of speeding things up by fetching results in
subset pages is really nice when you are using web pages to display
results.
Please note the loading and fingerprint creation timings at the top of
that page ;-)
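
As a rough illustration of the subset-page idea, the sketch below builds one page of a cartridge substructure query with LIMIT/OFFSET. The table name `mols`, the column name `m` and the helper `paged_substructure_query` are assumptions of mine, not taken from the wiki page; the `@>` operator and the `mol` type are the cartridge's own. The `%s` placeholders are meant for a driver such as psycopg2, which is also an assumption.

```python
def paged_substructure_query(smiles, page, page_size=100,
                             table="mols", column="m"):
    """Return (sql, params) for one page of substructure hits.

    `table` and `column` are hypothetical names; adjust to your schema.
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    # Identifiers cannot be passed as query parameters, so check them here.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"suspicious identifier: {name!r}")
    sql = (f"SELECT * FROM {table} "
           f"WHERE {column} @> %s::mol "   # cartridge substructure operator
           "LIMIT %s OFFSET %s")
    # Without an ORDER BY, the row order between pages is not guaranteed.
    return sql, (smiles, page_size, page * page_size)


sql, params = paged_substructure_query("c1ccccc1", page=2, page_size=50)
print(sql)
print(params)
```

With a database connection you would pass both values to something like `cursor.execute(sql, params)` and fetch only the rows needed for the page being displayed.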
Hope this helps.
Cheers
Nik
[email protected]
01/19/2011 04:32 PM
To: RDKit Discuss <[email protected]>
cc:
Subject: [Rdkit-discuss] PubChem search
Dear RDKit users,
has anyone used RDKit for local searches of PubChem?
Can approximate performance numbers be given for how long a
substructure search takes for, let's say, 50 million compounds?
Best regards,
Paul
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss