RE: Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Ishan Chattopadhyaya
Can IndexSort help here? -Original Message- From: "Erick Erickson" Sent: 11/16/2016 9:29 To: "java-user" Subject: Re: Possible to cause documents to be contiguous after forceMerge? Well, codecs are pluggable so if you can show that you'd get an improvement (however you measure it)

Re: Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Erick Erickson
Well, codecs are pluggable, so if you can show that you'd get an improvement (however you measure it) and that whatever you have in mind wouldn't penalize the general case, you could submit it as a proposal/patch. Best, Erick On Tue, Nov 15, 2016 at 6:21 PM, Kevin Burton wrote: > On Tue, Nov 15,

Re: Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Kevin Burton
On Tue, Nov 15, 2016 at 6:16 PM, Erick Erickson wrote: > You can make no assumptions about locality in terms of where separate > documents land on disk. I suppose if you have the whole corpus at index > time you > could index these "similar" documents contiguously. T > Wow.. that's shockingly fr

Re: Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Erick Erickson
You can make no assumptions about locality in terms of where separate documents land on disk. I suppose if you have the whole corpus at index time you could index these "similar" documents contiguously. Then, assuming there were absolutely never any updates/deletes I _think_ the doc might tend to be

Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Kevin Burton
I have a large index (say 500GB) with a large percentage of near-duplicate documents. I have to keep the documents there (can't delete them) as the metadata is important. Is it possible to get the documents to be contiguous somehow? Once they are contiguous then they will compress very well
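
To make the compression argument concrete, here is a rough sketch in Python. It is not Lucene code and does not reproduce any Lucene codec: it only assumes that documents are compressed in independent fixed-size blocks, and uses zlib as a stand-in compressor. The corpus, the helper names (make_corpus, compressed_size) and the block size of 8 documents are all made up for illustration. It compares the total compressed size when near duplicates are scattered across blocks with the size when they sit next to each other.

```python
import random
import zlib


def make_corpus(groups=50, variants=8, base_len=1000, seed=42):
    """Build a synthetic corpus: each group shares one random base text,
    and each variant differs only by a small metadata suffix."""
    rng = random.Random(seed)
    corpus = []
    for g in range(groups):
        base = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(base_len))
        for v in range(variants):
            body = f"{base} meta={g}-{v}".encode("ascii")
            corpus.append((g, v, body))
    return corpus


def compressed_size(docs, docs_per_block=8):
    """Compress the documents block by block, each block independently
    (an assumption about the storage layout), and sum the compressed sizes."""
    total = 0
    for i in range(0, len(docs), docs_per_block):
        block = b"".join(docs[i:i + docs_per_block])
        total += len(zlib.compress(block))
    return total


corpus = make_corpus()

# Scattered: order by variant first, so every block mixes 8 different groups.
scattered = [body for _, _, body in sorted(corpus, key=lambda d: (d[1], d[0]))]

# Grouped: order by group first, so every block holds the 8 variants of one group.
grouped = [body for _, _, body in sorted(corpus, key=lambda d: (d[0], d[1]))]

scattered_size = compressed_size(scattered)
grouped_size = compressed_size(grouped)

print("scattered:", scattered_size, "bytes")
print("grouped:  ", grouped_size, "bytes")
```

With this synthetic data the grouped ordering should come out several times smaller, because each block then contains one base text plus near copies of it. How much a real index would gain depends on the codec's block size and on how similar the documents actually are.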