Hi Jochen,

One possibility is to convert all SDF files into a single SMILES file, which
will be 1.9 GB. From that you can try to build an fs-index. You could also
split the SMILES file, create a separate fs-index for each part, and combine
the results later.
I would go with a database solution -> pgchem for the win!
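
To make the "combine the results later" step concrete, here is a minimal
sketch in Python. It assumes each per-index search has already been run and
its hits parsed into (molecule id, similarity score) pairs; that pair format
and the function name merge_hits are only illustrative, not part of
Open Babel or CDK.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit format: (molecule id, similarity score in [0, 1]).
Hit = Tuple[str, float]


def merge_hits(per_index_hits: Iterable[Iterable[Hit]], top_n: int) -> List[Hit]:
    """Combine the hit lists of several index searches into one ranked list.

    If a molecule shows up in more than one list, only its best score is kept.
    The result is sorted by descending score, ties broken by molecule id.
    """
    best: Dict[str, float] = {}
    for hits in per_index_hits:
        for mol_id, score in hits:
            if mol_id not in best or score > best[mol_id]:
                best[mol_id] = score
    ranked = sorted(best.items(), key=lambda hit: (-hit[1], hit[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    # Made-up example hits from two separate index searches.
    hits_a = [("CID1", 0.9), ("CID2", 0.5)]
    hits_b = [("CID3", 0.7), ("CID1", 0.8)]
    for mol_id, score in merge_hits([hits_a, hits_b], top_n=2):
        print(mol_id, score)
```

This way the search is called once per index, but you still end up with one
output list for the whole collection.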

Ciao,
Bjoern
 

> Hello Bjoern,
> 
> My use case is to do a similarity search on local SDF files with babel and
> cdk.
> 
> My problem is that babel expects only one file, and I have the complete
> PubChem Compound set, which is over 2000 SDF files.
> 
> If I concatenate all SDF files into one file, it is over 187 GB and the
> index will be over 2 GB.
> 
> What can I do? If I split the data, I need to run my search on each part,
> but I want one output for the search and not several.
> 
> With best
> 
> Jochen Schreiber
> 
> On 02.03.2012 at 18:52, Björn Grüning wrote:
> 
> > Hi Jochen,
> > 
> > we are running such a setup in our lab with PostgreSQL and pgchem. Maybe
> > you can explain your use case a little bit better. I don't think the
> > fs-index from Open Babel is the way to go for such huge datasets.
> > 
> > As Chris mentioned, the index should be smaller than 2 GB. You will
> > exceed that. You can split your SDF files into chunks so that the
> > resulting indices are <2GB ...
> > 
> > Ciao,
> > Bjoern
> > 
> >> Must it be smaller than 2 GB?
> >> 
> >> If I concatenate all of them, I get a file of about 187 GB.
> >> 
> >> Any idea?
> >> 
> >> With best
> >> 
> >> Jochen Schreiber
> >> 
> >> ------------------------------------------------------------------------------
> >> Virtualization & Cloud Management Using Capacity Planning
> >> Cloud computing makes use of virtualization - but cloud computing 
> >> also focuses on allowing computing to be delivered as a service.
> >> http://www.accelacomm.com/jaw/sfnl/114/51521223/
> >> _______________________________________________
> >> OpenBabel-discuss mailing list
> >> OpenBabel-discuss@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
> > 
> > -- 
> > Björn Grüning
> > Albert-Ludwigs-Universität Freiburg
> > Institute of Pharmaceutical Sciences
> > Pharmaceutical Bioinformatics
> > Hermann-Herder-Strasse 9
> > D-79104 Freiburg i. Br.
> > 
> > Tel.:  +49 761 203-4872
> > Fax.:  +49 761 203-97769
> > E-Mail: bjoern.gruen...@pharmazie.uni-freiburg.de
> > Web: http://www.pharmaceutical-bioinformatics.org/
> > 
> 


