agc wrote:
> I guess an important feature of what I'm looking for is
> some kind of mapping from *exact* title to corresponding article,
> i.e. if my data set wasn't so large, I would just keep all my
> data in an in-memory Python dictionary, which would be very fast.
>
> But I have about 2 million article titles mapping to approx. 6-10 GB
> of article bodies, so I think this would be just too big for a
> simple Python dictionary.

Then use a database table that maps titles to articles, and make sure
you create an index over the title column.

Stefan
--
http://mail.python.org/mailman/listinfo/python-list
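
A minimal sketch of the table-plus-index suggestion, assuming SQLite through the standard library's sqlite3 module (the reply does not name a particular database, so that choice is only an assumption). The table name `articles`, the index name `idx_articles_title`, and the helper functions are hypothetical names picked for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the article store and make sure the index exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    # The index over the title column is what keeps exact-title lookups
    # from scanning the whole table.
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_articles_title"
        " ON articles (title)"
    )
    return conn


def put_article(conn, title, body):
    """Insert an article, replacing any existing one with the same title."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO articles (title, body) VALUES (?, ?)",
            (title, body),
        )


def get_article(conn, title):
    """Return the body stored under exactly this title, or None."""
    row = conn.execute(
        "SELECT body FROM articles WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row is not None else None
```

Used roughly like a dictionary, with a file path such as "articles.db" (also just an example name) so the data stays on disk rather than in memory:

    conn = open_store("articles.db")
    put_article(conn, "Some Title", "article text ...")
    body = get_article(conn, "Some Title")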