Hi Jochen,

one possibility is to convert all SDF files into a single SMILES file, which will be 1.9 GB. From that you can try to build an fs-index. If that does not work, split the SMILES file, create a separate fs-index for each part, and combine the results later. Personally, I would go with a database solution: pgchem for the win!
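To make the split-and-combine idea a bit more concrete, here is a minimal Python sketch. It only covers the bookkeeping around the search: cutting one large SMILES file into chunks and merging the per-chunk hit lists into a single ranked output. Building and querying the fs-index for each chunk is not shown. The function names `split_smiles` and `merge_hits` are made up for illustration, and the hit format, a list of (identifier, similarity) pairs per chunk, is an assumption about how you would parse your search output.

```python
import heapq
from itertools import islice
from pathlib import Path


def split_smiles(src, out_dir, lines_per_chunk):
    """Split a SMILES file (one molecule per line) into numbered chunk files.

    Returns the list of chunk paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with open(src, encoding="utf-8") as fh:
        while True:
            # Read at most lines_per_chunk lines without loading the whole file.
            block = list(islice(fh, lines_per_chunk))
            if not block:
                break
            path = out_dir / f"chunk_{len(chunks):04d}.smi"
            path.write_text("".join(block), encoding="utf-8")
            chunks.append(path)
    return chunks


def merge_hits(per_chunk_hits, top_n):
    """Combine per-chunk hit lists into one list, best similarity first.

    per_chunk_hits: iterable of lists of (identifier, similarity) pairs,
    one list per chunk (assumed format, not produced by Open Babel itself).
    """
    all_hits = (hit for hits in per_chunk_hits for hit in hits)
    return heapq.nlargest(top_n, all_hits, key=lambda hit: hit[1])
```

You would run one search per chunk, collect each result as a list of pairs, and call `merge_hits` once at the end, so you still get a single output. Pick `lines_per_chunk` so that each resulting index stays below the 2 GB limit.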
Ciao,
Bjoern

> Hello Bjoern,
>
> My use case is to do a similarity search on local SDF files with babel and
> CDK.
>
> My problem is that babel expects only one file, and I have the complete
> PubChem Compound set, which is over 2000 SDF files.
>
> If I concatenate all SDF files into one file, it is over 187 GB and the
> index will be over 2 GB.
>
> What can I do? If I split the data, I have to run my search several times,
> but I want one output for the search and not several.
>
> With best
>
> Jochen Schreiber
>
> On 02.03.2012 at 18:52, Björn Grüning wrote:
>
> > Hi Jochen,
> >
> > we are running such a setup in our lab with PostgreSQL and pgchem. Maybe
> > you can explain your use case a little bit better. I don't think the
> > fs-index from Open Babel is the way to go for such huge datasets.
> >
> > As Chris mentioned, the index should be smaller than 2 GB. You will
> > exceed that. You can split your SDF files into chunks so that the
> > resulting indices are <2 GB ...
> >
> > Ciao,
> > Bjoern
> >
> >> Must it be smaller than 2 GB?
> >>
> >> If I concatenate all of them, I get a file which is about 187 GB.
> >>
> >> Any idea?
> >>
> >> With best
> >>
> >> Jochen Schreiber
> >>
> >> _______________________________________________
> >> OpenBabel-discuss mailing list
> >> OpenBabel-discuss@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
> >
> > --
> > Björn Grüning
> > Albert-Ludwigs-Universität Freiburg
> > Institute of Pharmaceutical Sciences
> > Pharmaceutical Bioinformatics
> > Hermann-Herder-Strasse 9
> > D-79104 Freiburg i. Br.
> >
> > Tel.: +49 761 203-4872
> > Fax.: +49 761 203-97769
> > E-Mail: bjoern.gruen...@pharmazie.uni-freiburg.de
> > Web: http://www.pharmaceutical-bioinformatics.org/