Great answer
Thanks Michael.
Yes, the difference was too large: more than 1 GB.
Best regards
> On Nov 13, 2020, at 1:49 PM, Michael Sokolov wrote:
>
> You can't directly compare disk usage across two indexes, even with
> the same data. Try re-indexing one of your datasets, and you will see
> that the disk s
You can't directly compare disk usage across two indexes, even with
the same data. Try re-indexing one of your datasets, and you will see
that the disk size is not the same. Mostly this is due to the way
segments are merged varying with some randomness from one run to
another, although the size of
With Zulia we chose to rewrite fieldName:* queries to hiddenField:fieldName
and add all field names that are present to a hidden field automatically as
Uwe described as an alternative. It seems to work well.
https://github.com/zuliaio/zuliasearch/blob/master/zulia-query-parser/src/main/java/io/zu
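For readers who want to see the shape of that rewrite, here is a minimal, library-free sketch. It is not Zulia's code: the hidden field name `_fieldNames` and both helper methods are hypothetical, and a real implementation would do this inside the query parser and at index time rather than on strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: FIELD_NAMES and both helpers are hypothetical names, not Zulia's API.
final class FieldExistsRewrite {
    // Assumed name of the hidden field that lists which fields a document has.
    static final String FIELD_NAMES = "_fieldNames";

    // Matches a whole clause of the form "fieldName:*" and captures fieldName.
    private static final Pattern EXISTS =
            Pattern.compile("([A-Za-z_][A-Za-z0-9_.]*):\\*");

    // Query time: turn "fieldName:*" into a plain term clause on the hidden field.
    // Any other clause (including "*:*") is returned unchanged.
    static String rewrite(String clause) {
        Matcher m = EXISTS.matcher(clause);
        return m.matches() ? FIELD_NAMES + ":" + m.group(1) : clause;
    }

    // Index time: collect the names of all fields that carry a non-empty value.
    // These names would be indexed as the values of the hidden field.
    static List<String> hiddenFieldValues(Map<String, String> doc) {
        TreeSet<String> names = new TreeSet<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            if (e.getValue() != null && !e.getValue().isEmpty()) {
                names.add(e.getKey());
            }
        }
        return new ArrayList<>(names);
    }
}
```

With this, `FieldExistsRewrite.rewrite("title:*")` yields `_fieldNames:title`, which is an ordinary term lookup instead of a wildcard or range scan over every term in the field.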
Nothing changed between the two index generations, except that the data
changed a bit, as I described.
When Lucene is done generating the index, the size I am reporting is that
of the directory where all index files are stored.
I don't know about deleted docs. How do you trace that? Yes, the queries
Hi,
Solr and Elasticsearch implement the exists query like this, which is fully in
line with your investigation: if a field has docvalues it uses
DocValuesFieldExistsQuery, if it is a tokenized field it uses the
NormsFieldExistsQuery. The negative one is a must-not clause, which is
perfectly f
That's great Rob! Thanks for bringing closure.
Mike McCandless
http://blog.mikemccandless.com
On Fri, Nov 13, 2020 at 9:13 AM Rob Audenaerde
wrote:
> To follow up, based on a quick JMH-test with 2M docs with some random data
> I see a speedup of 70% :)
> That is a nice friday-afternoon gift,
To follow up, based on a quick JMH-test with 2M docs with some random data
I see a speedup of 70% :)
That is a nice friday-afternoon gift, thanks!
For people who are interested:
I added a BinaryDocValues field like this:
doc.add(new BinaryDocValuesField("GROUPS_ALLOWED_EMPTY", new BytesRef(0x01)));
Maybe NormsFieldExistsQuery as a MUST_NOT clause? Though, you must enable
norms on your field to use that.
TermRangeQuery is indeed a horribly costly way to execute this, but if you
cache the result on each refresh, perhaps it is OK?
You could also index a dedicated doc values field indicating t
Hi all,
We have implemented some security on our index by adding a field
'groups_allowed' to documents and wrapping a boolean MUST query around the
original query, which checks whether one of the given user groups matches
at least one groups_allowed value.
We chose to leave the groups_allowed field empty when t
What does “final finished sizes” mean? After optimize or just after
finishing all indexing? The former is what counts here.
And you provided no information on the number of deleted docs in the two
cases. Is the number of deletedDocs the same (or close)? And does the
q=*:* query return the same