But why is it so costly?
In a regular query we walk postings and match document numbers; in a SpanQuery
we match position numbers (or position segments). What's the principal
difference?
I think it's just that #documents << #positions.
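To make the comparison concrete, here is a toy sketch in Python. It is not Lucene code: the in-memory index layout, the function names, and the simplified notion of slop (number of positions between the two terms) are all assumptions made for illustration. The document-level match only compares doc ids, while the proximity match also has to walk the position lists of every candidate document.

```python
from typing import Dict, List

# Hypothetical toy positional index: term -> {doc_id: sorted positions}.
Index = Dict[str, Dict[int, List[int]]]

index: Index = {
    "A": {1: [0, 7], 2: [3], 3: [5]},
    "B": {1: [9], 2: [4], 4: [1]},
}


def docs_with_both(idx: Index, a: str, b: str) -> List[int]:
    """Document-level conjunction: only doc ids are compared."""
    return sorted(set(idx.get(a, {})) & set(idx.get(b, {})))


def docs_within_slop(idx: Index, a: str, b: str, slop: int) -> List[int]:
    """Unordered proximity match: for each candidate doc, also walk
    both position lists looking for a pair that is close enough."""
    hits = []
    for doc in docs_with_both(idx, a, b):
        pa, pb = idx[a][doc], idx[b][doc]
        i = j = 0
        while i < len(pa) and j < len(pb):
            # Simplified slop: positions strictly between the two terms.
            if abs(pa[i] - pb[j]) - 1 <= slop:
                hits.append(doc)
                break
            # Advance whichever list is behind, as in a sorted merge.
            if pa[i] < pb[j]:
                i += 1
            else:
                j += 1
    return hits
```

Here `docs_with_both(index, "A", "B")` touches one entry per document, whereas `docs_within_slop(index, "A", "B", 1)` may touch every position of both terms in each candidate document, which is where the extra cost comes from in this simplified picture.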
For "A,sg" and "A,pl" I use unordered SpanNearQueries with
On Fri, Oct 18, 2013 at 1:19 PM, Igor Shalyminov wrote:
> OK, it turns out that DirectPostingsFormat is really an extreme thing: 8GB of
> index couldn't fit into a 20+ GB Java heap.
> I wonder if there is a postings format that works from disk the standard way
> but uses no compression?
Yes, it's v
Unfortunately, SpanNearQuery is a very costly query. What slop are you passing?
You might want to check out
https://issues.apache.org/jira/browse/LUCENE-5288 ... it adds
proximity boosting to queries, but it's still very early in the
iterating, and if you need a precise count of only those docume
Hello!
OK, it turns out that DirectPostingsFormat is really an extreme thing: 8GB of
index couldn't fit into a 20+ GB Java heap.
I wonder if there is a postings format that works from disk the standard way
but uses no compression?
--
Best Regards,
Igor
18.10.2013, 02:06, "Igor Shalyminov":
> Mik
On 10/18/2013 1:08 AM, Shai Erera wrote:
The codec intercepts merges in order to clean up files that are no longer
referenced
What happens if a document is deleted while there's a reader open on the
index, and the segments are merged? Maybe I misunderstand what you meant by
this statement, but