On Jun 18, 10:29 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> brad wrote:
> > Just wondering if anyone has ever solved this efficiently... not looking
> > for specific solutions tho... just ideas.
> >
> > I have one thousand words and one thousand files. I need to read the
> > files to see if some of the words are in the files. I can stop reading a
> > file once I find 10 of the words in it. It's easy for me to do this with
> > a few dozen words, but a thousand words is too large for an RE and too
> > inefficient to loop, etc. Any suggestions?
>
> Use an indexer, like Lucene (available as PyLucene) or a database that
> offers word indices.
>
> Diez
I've been toying around with Nucular (http://nucular.sourceforge.net/) a bit recently for some side projects. It's pure Python and seems to work fairly well for my needs. I haven't pumped all that much data into it, though.
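For the scale brad describes, an indexer is the robust answer, but a plain set-based scan also avoids the giant-regex problem: set membership is O(1) per token, and you can stop reading a file as soon as 10 distinct words have been seen. The sketch below is not from the thread; the helper name `files_matching` and the whitespace/word tokenization are my assumptions.

```python
import re

def files_matching(words, filenames, threshold=10):
    """Return names of files containing at least `threshold` distinct
    words from `words`. (Hypothetical helper, not from the thread.)"""
    wordset = set(w.lower() for w in words)  # O(1) lookups vs. a 1000-branch RE
    token = re.compile(r"\w+")
    matches = []
    for name in filenames:
        found = set()
        with open(name, encoding="utf-8", errors="ignore") as f:
            for line in f:
                for tok in token.findall(line.lower()):
                    if tok in wordset:
                        found.add(tok)
                        if len(found) >= threshold:
                            break
                else:
                    continue
                break  # early exit: stop reading once threshold is reached
        if len(found) >= threshold:
            matches.append(name)
    return matches
```

For repeated queries over the same files, building an index (Lucene/PyLucene, Nucular) still wins, since the scan above re-reads every file on each run.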