[ 
https://issues.apache.org/jira/browse/SOLR-16348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582127#comment-17582127
 ] 

David Smiley commented on SOLR-16348:
-------------------------------------

An alternative to a URP is a SolrEventListener implementing postCommit().  But 
in some bulk loading scenarios, a commit doesn't happen until the end, which 
would be way too late.  There is no postFlush; flushing happens in Lucene and 
is invisible to Solr.  Anyway, a URP is fine: if more and more data is being 
added, a URP is going to notice it.  If indexing stops then no more splitting 
will happen, at least not with this mechanism.  The split threshold is soft; 
it's not a guarantee.

Some implementation thoughts after some brainstorming with colleagues:
 * The URP would be placed after DURP (DistributedUpdateProcessor), i.e. 
post-distribution.
 * If we aren't the leader, don't initiate a split.  The factory for the URP 
could make this check and skip creating the URP entirely.
 * Could place the logic in finish() so as to happen after the end of an 
indexing batch.
 ** Track additions via a simple boolean in add() to know if we should even 
compute anything at all by the time finish() is called (perhaps only a commit 
or delete was sent and nothing else).
 * Check to see if a split is in progress; don't want to re-request.  Look for 
sub-shards in the cluster state.
 * Check the index size to see if it crosses a configurable threshold.  
Discount by percent of deleted docs.  I think this logic is somewhere already.
 * Cap the number of splits per node (where the split is literally occurring – 
the parent shard leader) so we don't induce too much stress.
 ** This is maybe hard?  My initial guess is a static Semaphore used by the 
URP.  It's easy to acquire() here but the release() must happen when the split 
finishes, and we don't want to block the URP lifecycle (which the client is 
waiting on) to wait for that.  If the split call is done in another thread that 
waits as long as need be for the split to complete (possibly with an error), it 
could release the permit in a finally.  A single shared thread per core would 
also be useful to prevent a parallel instance of this URP (from another batch) 
trying to re-do the same work.
 ** There could still be a race, leading to an error that a split is in 
progress.  The HTTP error code will be 510.  Log it as a warning and move on.
 ** Such a limit is soft, not a hard guarantee.
 * Could be effectively disabled by setting the size threshold to -1, for 
example via a system property substituted in solrconfig.xml.
 ** The factory would simply not produce the URP; some other factories use this 
technique.
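
To make the checks and the permit handling above more concrete, here is a 
minimal sketch in plain Java with no Solr classes.  Everything named here 
(SplitGate, maybeSplit, effectiveSize, and all parameters) is hypothetical and 
not an existing Solr API; the leader check, the sub-shard check and the actual 
split request are assumed to be supplied by the caller.  It also uses 
tryAcquire() rather than acquire() so the client-facing thread never blocks, 
which is one possible reading of the soft cap, not a settled design.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

/** Hypothetical helper; none of these names exist in Solr. */
final class SplitGate {
  private final Semaphore nodePermits; // shared per node, caps concurrent splits
  private final long thresholdBytes;   // a negative value disables splitting
  // One daemon worker per gate (assumed: one gate per core), so two batches
  // cannot run the same split in parallel.
  private final ExecutorService worker =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "split-gate");
        t.setDaemon(true);
        return t;
      });

  SplitGate(Semaphore nodePermits, long thresholdBytes) {
    this.nodePermits = nodePermits;
    this.thresholdBytes = thresholdBytes;
  }

  /** Index size discounted by the fraction of deleted docs. */
  static long effectiveSize(long indexBytes, int numDocs, int maxDoc) {
    if (maxDoc <= 0) return 0L;
    return (long) (indexBytes * ((double) numDocs / maxDoc));
  }

  /**
   * Intended to be called from finish().  Returns the Future of the submitted
   * split, or null if no split was requested.
   */
  Future<?> maybeSplit(boolean sawAdd, boolean isLeader, boolean splitInProgress,
                       long effectiveBytes, Runnable splitCall) {
    if (thresholdBytes < 0 || !sawAdd || !isLeader || splitInProgress) return null;
    if (effectiveBytes < thresholdBytes) return null;
    if (!nodePermits.tryAcquire()) return null; // node is busy; skip, don't block
    try {
      return worker.submit(() -> {
        try {
          splitCall.run(); // waits as long as the split takes
        } catch (RuntimeException e) {
          // e.g. a race where a split is already in progress: warn and move on
          System.err.println("WARN split request failed: " + e.getMessage());
        } finally {
          nodePermits.release();
        }
      });
    } catch (RuntimeException e) {
      nodePermits.release(); // task was rejected, so it would never release
      throw e;
    }
  }
}
```

In a URP, sawAdd would be the boolean set in the add callback, and splitCall 
would wrap whatever issues the SplitShard request and waits for it.  A skipped 
split is not lost: the next indexing batch simply runs the same checks again, 
which fits the threshold being soft.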

> New SplitShard UpdateRequestProcessor
> -------------------------------------
>
>                 Key: SOLR-16348
>                 URL: https://issues.apache.org/jira/browse/SOLR-16348
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: UpdateRequestProcessors
>            Reporter: David Smiley
>            Priority: Major
>
> The 
> [SplitShard|https://solr.apache.org/guide/solr/latest/deployment-guide/shard-management.html#splitshard]
>  command is used to split a shard into smaller shards to get better query 
> scalability, especially across multiple machines.  The most practical way to 
> use it is to split shards larger than a configured size.  Of course shards 
> don't just grow by themselves; they grow when data is added.  Here I propose 
> a new UpdateRequestProcessor that splits based on the shard size.



