[
https://issues.apache.org/jira/browse/SOLR-12941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672265#comment-16672265
]
Andrzej Bialecki commented on SOLR-12941:
------------------------------------------
Another interesting failure that Shalin and I spotted while debugging test
failures is that older commit points occasionally linger for a while and
throw off the size calculations (because the actual directory size then
consists of all segments from the previous commit point plus all segments from
the latest commit point).
The trigger should probably consider only the latest commit point because older
ones will eventually be deleted. This will require adding a metric (gauge) to
SolrCore to report the details of only the latest commit point.
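
To illustrate the difference, here is a minimal sketch in plain Python. It is
only a model, not Solr or Lucene code: the file names, sizes, and the list-of-sets
representation of commit points are all hypothetical. It compares the size of
the whole directory with the size of the files referenced by the latest commit
point only.

```python
from typing import Dict, List, Set


def directory_size(file_sizes: Dict[str, int]) -> int:
    """Size of everything on disk, including files pinned by old commit points."""
    return sum(file_sizes.values())


def latest_commit_size(commits: List[Set[str]], file_sizes: Dict[str, int]) -> int:
    """Size of only the files referenced by the newest commit point.

    `commits` is ordered oldest to newest (a hypothetical representation).
    """
    if not commits:
        return 0
    return sum(file_sizes[name] for name in commits[-1])


# Hypothetical index directory: an older commit point still pins _0 and _1,
# while the latest commit point references only the merged segment _2.
file_sizes = {
    "_0.cfs": 400,
    "_1.cfs": 350,
    "_2.cfs": 600,
    "segments_1": 1,
    "segments_2": 1,
}
commits = [
    {"_0.cfs", "_1.cfs", "segments_1"},  # older commit point, not yet deleted
    {"_2.cfs", "segments_2"},            # latest commit point
]

total = directory_size(file_sizes)
latest = latest_commit_size(commits, file_sizes)
print(f"whole directory: {total} bytes, latest commit only: {latest} bytes")
```

In this model the directory-wide figure is more than twice the size of the
latest commit point, which is the kind of inflation that could push a trigger
over its threshold until the older commit point is deleted.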
> IndexSizeTrigger and splitMethod=link problems
> ----------------------------------------------
>
> Key: SOLR-12941
> URL: https://issues.apache.org/jira/browse/SOLR-12941
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
>
> {{IndexSizeTrigger}} can be configured to use {{splitMethod=link}}
> (SOLR-12730), which uses hard-linking for creating sub-shards.
> However, if the trigger uses the {{aboveBytes}} condition, the resulting
> sub-shards will not decrease in size until all of the deleted documents
> are expunged (either by gradual merges or by an explicit and costly
> expungeDeletes command). As a result, the new sub-shards will still
> exceed the {{aboveBytes}} threshold, which will cause the trigger to keep
> generating new split requests.
> I see two options for solving this:
> * disallow using {{aboveBytes}} with {{splitMethod=link}}. Unfortunately,
> this is a very desirable combination because it monitors the actual
> index size and uses the fast splitting method.
> * calculate an internal estimate of "eventual index size" for an index with
> deletions, and use this estimate when checking against {{aboveBytes}} instead
> of the real index size. This of course introduces a potentially significant
> estimation error, but it allows hard-linked sub-shards with deletions to be
> properly treated as (eventually) significantly smaller than the parent shard.
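
The second option could be sketched roughly as follows. This is a plain Python
model, not the trigger's actual code: the function names, the numbers, and the
estimation formula (scaling the on-disk size by the fraction of live documents)
are assumptions made for illustration, and the formula ignores that documents
differ in size.

```python
def estimated_eventual_size(index_bytes: int, num_docs: int, max_doc: int) -> int:
    """Estimate the index size after deletions are expunged.

    Assumption: bytes are spread evenly across documents, so the live
    fraction num_docs / max_doc scales the on-disk size.
    """
    if max_doc <= 0:
        return 0
    return index_bytes * num_docs // max_doc


def exceeds_threshold(size_bytes: int, above_bytes: int) -> bool:
    """Hypothetical stand-in for the aboveBytes check."""
    return size_bytes > above_bytes


ABOVE_BYTES = 1000  # hypothetical threshold

# A hard-linked sub-shard right after the split: it still occupies the
# parent's bytes, but half of its documents are marked as deleted.
sub_bytes = 1200
sub_max_doc = 10_000   # live + deleted documents
sub_num_docs = 5_000   # live documents only

raw_decision = exceeds_threshold(sub_bytes, ABOVE_BYTES)
estimate = estimated_eventual_size(sub_bytes, sub_num_docs, sub_max_doc)
estimated_decision = exceeds_threshold(estimate, ABOVE_BYTES)

print(f"raw size {sub_bytes}: split again? {raw_decision}")
print(f"estimated size {estimate}: split again? {estimated_decision}")
```

With the raw size the sub-shard would immediately be split again, while the
estimate keeps it below the threshold. The price is the estimation error noted
above, which grows when deleted documents are much larger or smaller than the
live ones.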