[ https://issues.apache.org/jira/browse/SOLR-16348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582127#comment-17582127 ]
David Smiley commented on SOLR-16348:
-------------------------------------

An alternative to a URP is a SolrEventListener implementing postCommit(). But in some bulk loading scenarios, a commit doesn't happen until the end, which would be way too late. There is no postFlush; flushing happens in Lucene and is invisible to Solr. Anyway, a URP is fine: if more and more data is being added, a URP is going to notice it. If indexing stops then no more splitting will happen, at least not with this mechanism. The split threshold is soft; it's not a guarantee.

Some implementation thoughts after some brainstorming with colleagues:
* Would be placed after DURP (post-distribution).
* If we aren't the leader, don't initiate a split. The factory for the URP could make this check and skip creating the URP entirely.
* Could place the logic in finish() so that it happens after the end of an indexing batch.
** Track additions via a simple boolean in add() to know if we should even compute anything at all by the time finish() is called (perhaps a commit or delete was sent and nothing else).
* Check to see if a split is in progress; we don't want to re-request. Look for sub-shards in the cluster state.
* Check the index size to see if it crosses a configurable threshold. Discount by percent of deleted docs. I think this logic is somewhere already.
* Cap the number of splits per node (where the split is literally occurring – the parent shard leader) so we don't induce too much stress.
** This is maybe hard? My initial guess is a static Semaphore used by the URP. It's easy to acquire() here but the release() must happen when the split finishes, and we don't want to block the URP lifecycle (which the client is waiting on) to wait for that. If the split call is done in another thread that waits as long as need be for the split to complete (possibly erroneously), it could release the permit in a finally.
The common thread per core would also be useful to prevent a parallel instance of this URP (from another batch) from trying to redo the same work.
** There could still be a race, leading to an error that a split is in progress. The HTTP error code will be 510. Log it as a warning and move on.
** Such a limit is soft, not a hard guarantee.
* Could be effectively disabled by setting the size threshold to -1, for example via a system property substituted in solrconfig.xml.
** The factory would simply not produce the URP; some other factories use this technique.

> New SplitShard UpdateRequestProcessor
> -------------------------------------
>
>                 Key: SOLR-16348
>                 URL: https://issues.apache.org/jira/browse/SOLR-16348
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: UpdateRequestProcessors
>            Reporter: David Smiley
>            Priority: Major
>
> The
> [SplitShard|https://solr.apache.org/guide/solr/latest/deployment-guide/shard-management.html#splitshard]
> command is used to split a shard into smaller shards to get better query
> scalability, especially across multiple machines. The most practical way to
> use it is to split shards larger than a configured size. Of course shards
> don't just grow by themselves; they grow when data is added. Here I propose
> a new UpdateRequestProcessor that splits based on the shard size.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
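
The permit-capping idea from the comment (a Semaphore acquired by the URP and released in a finally by a background thread once the split completes) might look roughly like the sketch below. It is only an illustration under assumptions: SplitLimiter and trySubmit are hypothetical names, not Solr APIs; the Runnable stands in for the blocking SPLITSHARD call; and tryAcquire() is used instead of acquire() so the indexing request is never blocked, which is one possible reading of the "soft limit" described above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/** Hypothetical helper (not part of Solr): caps concurrent splits started from this JVM. */
final class SplitLimiter {
  private final Semaphore permits;
  private final ExecutorService executor;

  SplitLimiter(int maxConcurrentSplits, ExecutorService executor) {
    this.permits = new Semaphore(maxConcurrentSplits);
    this.executor = executor;
  }

  /**
   * Tries to start a split without blocking the caller.
   * Returns false if the cap is already reached.
   */
  boolean trySubmit(Runnable splitCall) {
    if (!permits.tryAcquire()) {
      return false; // cap reached; skip this time, a later batch may retry
    }
    try {
      executor.execute(() -> {
        try {
          splitCall.run(); // stands in for the blocking split request
        } finally {
          permits.release(); // released even if the split fails
        }
      });
    } catch (RuntimeException e) {
      permits.release(); // e.g. the executor rejected the task
      throw e;
    }
    return true;
  }

  int availablePermits() {
    return permits.availablePermits();
  }
}
```

In this sketch a single shared SplitLimiter instance (for example held in a static field, as the comment suggests for the Semaphore) would be consulted from finish(); a false return would simply be logged and ignored, consistent with the limit being soft.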