[ https://issues.apache.org/jira/browse/SOLR-11166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105693#comment-16105693 ]
Jan Høydahl commented on SOLR-11166:
------------------------------------
Are you assuming docs will not be updated or deleted, and hence no automatic
shard merging? If shards are to be merged, would that need a new merge policy?
Question: Is shard-level the best abstraction here or could time-based use
cases just as well be solved on the collection level? Create a write-alias
pointing to the newest collection, and read aliases pointing to all or some
other subset of collections. In this setup, newer collections could have larger
replicationFactor to support more queries. You could also reduce the number of
shards for older collections, merge collections, and mark the oldest
collections as "archive" so that they are loaded lazily, only on demand. People
do this already, and one could imagine built-in support for all the collection
creation and alias housekeeping.
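
To make the collection-level setup concrete, here is a minimal sketch of the housekeeping for one daily rollover. It only builds Collections API URLs (CREATE and CREATEALIAS) and sends nothing. The base URL, the "logs" prefix, the alias names, the shard and replica counts and the helper functions are all assumptions made up for illustration, not anything proposed in the issue.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Hypothetical values for illustration; adjust to your own cluster.
SOLR_BASE = "http://localhost:8983/solr"
PREFIX = "logs"


def collection_name(day: date) -> str:
    """Name of the (hypothetical) daily collection for `day`."""
    return f"{PREFIX}_{day.isoformat()}"


def collections_api_url(**params: str) -> str:
    """Build a Collections API URL. No request is sent."""
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


def rollover_urls(today: date, keep_days: int) -> list[str]:
    """URLs for one rollover: new collection, write alias, read alias."""
    newest = collection_name(today)
    # The read alias covers the newest `keep_days` daily collections.
    readable = [
        collection_name(today - timedelta(days=i)) for i in range(keep_days)
    ]
    return [
        # Newest collection gets a larger replicationFactor (assumed values).
        # Depending on the setup, a configset may also have to be named.
        collections_api_url(
            action="CREATE", name=newest, numShards="4", replicationFactor="3"
        ),
        # Re-point the write alias at the newest collection.
        collections_api_url(
            action="CREATEALIAS", name=f"{PREFIX}_write", collections=newest
        ),
        # Re-point the read alias at the chosen subset of collections.
        collections_api_url(
            action="CREATEALIAS",
            name=f"{PREFIX}_read",
            collections=",".join(readable),
        ),
    ]


if __name__ == "__main__":
    for url in rollover_urls(date.today(), keep_days=7):
        print(url)
```

Indexing clients would then always write to the write alias and search through the read alias, so only this housekeeping needs to know the individual collection names.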
> Create a new router to automate time-based sharding
> ---------------------------------------------------
>
> Key: SOLR-11166
> URL: https://issues.apache.org/jira/browse/SOLR-11166
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: Shawn Heisey
>
> There has been some interest in time-based sharding on the mailing list. It
> is disappointing to have to inform users that if they want to implement this,
> they will be entirely responsible for all shard management.
> I think that creating new shards and directing new documents to the newest
> shard will be relatively easy parts of this router. There are some
> additional features that I think would be really useful:
> * Automated merging of older shards into larger time periods.
> * Some kind of aliasing to allow searching today, yesterday, this week, last
> week, this month, last month, and so on, with the list of searched shards
> changing when a new shard is created. I don't know if this would be part of
> the router or implemented elsewhere by another issue.
> The router should have a configuration option to indicate how frequently new
> shards should be created. Valid values should probably be hourly, daily,
> weekly, and monthly. Multiples of those time periods would be a very good
> idea. Another idea to consider: completely custom time periods.
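
As a rough illustration of the period option described in the quoted issue, the sketch below maps a document timestamp to a shard name for the four suggested values. The `shard_name` function and the naming scheme are hypothetical, timestamps are assumed to be timezone-aware and are bucketed in UTC, and multiples of a period or custom periods are not covered.

```python
from datetime import datetime, timezone


def shard_name(ts: datetime, period: str) -> str:
    """Map a timezone-aware timestamp to a hypothetical shard name."""
    ts = ts.astimezone(timezone.utc)  # bucket in UTC (an assumption)
    if period == "hourly":
        return ts.strftime("shard_%Y-%m-%dT%H")
    if period == "daily":
        return ts.strftime("shard_%Y-%m-%d")
    if period == "weekly":
        # ISO weeks start on Monday and may belong to the adjacent year.
        year, week, _ = ts.isocalendar()
        return f"shard_{year}-W{week:02d}"
    if period == "monthly":
        return ts.strftime("shard_%Y-%m")
    raise ValueError(f"unsupported period: {period!r}")


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for period in ("hourly", "daily", "weekly", "monthly"):
        print(period, shard_name(now, period))
```

A new shard would be needed whenever this name changes from one incoming document to the next.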