Another +1. We are also big S3 + Lucene users, and it is very interesting
to see what other people have come up with. We have an S3 Lucene directory
that allows immediate read-only use of Lucene indexes stored on S3 with
simultaneous local caching, and a prototype of segment-based index
replication based on
the
+1 to sharing code for doing 1) and 3), both of which are tricky!
Safely moving / copying bytes around is a notoriously difficult problem ...
but Lucene's "end to end checksums" and per-segment-file-GUID make this
safer.
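
To make the checksum point concrete, here is a minimal sketch of a "copy, then verify before publishing" step. It uses only the JDK and is not Lucene's own API. The class and method names (VerifiedCopy, crc32Of, copyVerified) and the ".tmp" suffix are made up for illustration. It computes a CRC32 over the whole file, whereas Lucene stores a CRC32 in each file's codec footer and can check it with CodecUtil.checksumEntireFile, so treat this as an analogy for the idea rather than a drop-in replacement.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class VerifiedCopy {

    /** Streams the whole file through CRC32 and returns the checksum value. */
    public static long crc32Of(Path file) throws IOException {
        try (CheckedInputStream in =
                 new CheckedInputStream(Files.newInputStream(file), new CRC32())) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // Reading is enough: the stream updates the checksum as it goes.
            }
            return in.getChecksum().getValue();
        }
    }

    /**
     * Copies source to a temporary sibling of target, re-reads the copy, and
     * moves it into place only if its CRC32 equals expectedCrc. On a mismatch
     * the temporary file is deleted and target is left untouched.
     */
    public static void copyVerified(Path source, Path target, long expectedCrc)
            throws IOException {
        // Hypothetical naming convention for the in-flight copy.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        long actualCrc = crc32Of(tmp);
        if (actualCrc != expectedCrc) {
            Files.deleteIfExists(tmp);
            throw new IOException("checksum mismatch for " + target
                + ": expected " + expectedCrc + ", got " + actualCrc);
        }
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("verified-copy");
        Path src = dir.resolve("source.bin");
        Files.write(src, "segment bytes".getBytes(StandardCharsets.UTF_8));

        // In a real transfer the expected checksum would travel with the file
        // (or be read from its footer) rather than be recomputed locally.
        Path dst = dir.resolve("copy.bin");
        copyVerified(src, dst, crc32Of(src));
        System.out.println("copied and verified: " + Files.exists(dst));
    }
}
```

The design choice worth noting is that the copy only becomes visible under its final name after verification, so a reader never opens a partially written or corrupted file.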
I think Lucene's replicator module is a good place for this?
Mike McCandless
Hi there,
I was talking with Varun at Berlin Buzzwords a couple of weeks ago about
storing and retrieving Lucene indexes in S3, and realized that "uploading a
Lucene directory to the cloud and downloading it on other machines" is a
pretty common problem and one that's surprisingly easy to do poorl