What would be the performance of PostgreSQL text search vs. MySQL vs. Lucene (flat file) for a 2-terabyte database?
Thanks for any comments.
My experience with tsearch2 has been that indexing even moderately large amounts of data (tens of megabytes) is too slow to be feasible.
My experience with MySQL's full-text search, and with the various MySQL-based text indexing programs (I forget the names, it's been a while), on some 10-20 GB of mail archives has been pretty disappointing too. My biggest gripe is indexing speed: it took days to index fewer than a million documents.
I ended up using Swish++. Microsoft's CHM compiler also has pretty amazing indexing speed (though it crashes quite often when it encounters malformed HTML).
-- dave