Re: Lucene Index Cloud Replication

2019-07-11 Thread Anton Zenkov
Another +1. We are also big S3 + Lucene users, and it is very interesting to see what other people have come up with. We have an S3 Lucene directory that allows immediate read-only use of Lucene indexes stored on S3 with simultaneous local caching, and a prototype of segment-based index replication based on the

Re: Lucene Index Cloud Replication

2019-07-09 Thread Michael McCandless
+1 to share code for doing 1) and 3) both of which are tricky! Safely moving / copying bytes around is a notoriously difficult problem ... but Lucene's "end to end checksums" and per-segment-file-GUID make this safer. I think Lucene's replicator module is a good place for this? Mike McCandless
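
To illustrate why checksums make copying index files safer, here is a minimal sketch of a copy-then-verify step. It is not Lucene's replicator code and it does not read Lucene's own file footers: it simply computes a CRC32 over the whole file on both sides and publishes the copy only if the two values match. The helper names `crc32_of_file` and `copy_verified` are hypothetical, chosen for this example.

```python
import os
import shutil
import tempfile
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file, read in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def copy_verified(src, dst):
    """Copy src to dst (hypothetical helper, not a Lucene API).

    The bytes go to a temporary file in the destination directory first.
    The temporary file is renamed to dst only if its checksum matches the
    source, so a reader never sees a partial or corrupted copy under dst.
    """
    expected = crc32_of_file(src)
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp = tempfile.mkstemp(dir=dst_dir)
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        actual = crc32_of_file(tmp)
        if actual != expected:
            raise IOError(
                "checksum mismatch copying %s: expected %08x, got %08x"
                % (src, expected, actual)
            )
        # Rename within the same directory, replacing any existing dst.
        os.replace(tmp, dst)
    finally:
        # Clean up the temporary file if the rename did not happen.
        if os.path.exists(tmp):
            os.remove(tmp)
    return expected
```

In a real cloud setup the "copy" would be an upload or download rather than a local file copy, and the expected checksum would more likely come from metadata recorded alongside the file than from re-reading the source; both of those are assumptions outside what this thread describes.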

Lucene Index Cloud Replication

2019-07-03 Thread Michael Froh
Hi there, I was talking with Varun at Berlin Buzzwords a couple of weeks ago about storing and retrieving Lucene indexes in S3, and realized that "uploading a Lucene directory to the cloud and downloading it on other machines" is a pretty common problem and one that's surprisingly easy to do poorl