[ https://issues.apache.org/jira/browse/SOLR-16348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582343#comment-17582343 ]
Gus Heck commented on SOLR-16348:
---------------------------------

This sounds like a really cool feature. One of the key pain points for several customers I've had is the difficulty of predicting the size of a tenant's index and managing clients that grow beyond expectations. I think this feature will shine best in a system receiving a steady flow of data.

I wonder about the bulk data-loading scenario, though. My first reaction is that bulk loading of actual text documents (vs. small IoT or low-text-analysis documents) often ends up waiting on Solr. I know of systems that peg Solr for most of a week when they want to re-index. Having it split a shard in the background on a separate thread could easily leave that thread starved, and maybe even leave multiple split routines backlogged. Alternatively, if it's synchronized with the accepting of documents, we have to defend ourselves from OOM by not accepting documents, and the sending system needs to be able to handle a pause in accepting documents...

I feel like this should be turned off in the bulk/re-index use case. In the full-index bulk case you typically have an idea of how much data will be loaded and can prepare the index, and when re-indexing you have the prior index as an example, so the shards should just be set correctly to begin with. "Daily bulk" cases that peg Solr for 1-2h at night will be harder. Maybe that case should suspend/resume splitting based on a (configurable) period of silence or load reduction (not that I think that's easy).

Another alternative to disabling is setting up a /reindex or /bulk handler that lacks your URP. I'd rather not have it be a system property, because one may want to turn it on/off simultaneously across the cluster, and maybe only for one collection at a time. A collection property in ZK sounds better.

Another even harder case to think about is periods of unplanned load, possibly ramping up gradually and then tapering off (i.e.
a pattern like something in social media going viral). That almost requires the decision to split or not to be load-based, which is a sticky problem.

One could have a request parameter, but that relies on nobody else sending an update you don't know about, so I don't like that option. Many mature organizations have multiple paths for data to enter the index, and coordination like that is infeasible.

The non-cloud case is harder because ZK can't coordinate things, so maybe the ability to turn this on/off dynamically is a cloud feature? Though if it were reacting to load, that might be a shard-level decision and not require coordination.

This also interacts with systems that route by tenant ID or another business ID and rely on co-location for graph/join operations.

Finally, there's the question of what to do with the old shard... The present split command is documented to leave the old shard in place, so a split leaves you with three shards, two of which are in use. This is also another case in which users need to be careful to have enough free disk. This operation filling up the disk could then cause issues writing new docs...

Kind of a thought dump there, but hope it helps.

> New SplitShard UpdateRequestProcessor
> -------------------------------------
>
>                 Key: SOLR-16348
>                 URL: https://issues.apache.org/jira/browse/SOLR-16348
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: UpdateRequestProcessors
>            Reporter: David Smiley
>            Priority: Major
>
> The
> [SplitShard|https://solr.apache.org/guide/solr/latest/deployment-guide/shard-management.html#splitshard]
> command is used to split a shard into smaller shards to get better query
> scalability, especially across multiple machines. The most practical way to
> use it is to split shards larger than a configured size. Of course shards
> don't just grow by themselves; they grow when data is added.
> Here I propose a new UpdateRequestProcessor that splits based on the shard size.