[ https://issues.apache.org/jira/browse/SOLR-16712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patson Luk updated SOLR-16712:
------------------------------
    Description: 
The current implementation of PRS requires an extra param to the `DocCollection`, the `PrsSupplier`, which fetches the PRS states from ZK when `get` is called. The implementation of this supplier, `LazyPrsSupplier`, only fetches the state on the first call.

While this flow does work properly, it might introduce some unnecessary complexity:
# PRS entry fetching from ZK is done either during or after the `DocCollection` construction. This could be a bit inconsistent with the existing non-PRS `DocCollection` design, in which `DocCollection` is simply an immutable container that does not fetch data after its instantiation.
# The lazy fetching could introduce some uncertainty as to when exactly the fetching happens (and whether any ZooKeeper IO exceptions arise).

My guess is that the lazy loading was introduced in https://issues.apache.org/jira/browse/SOLR-16580 to avoid fetching the PRS states multiple times in the ctor of `DocCollection`. However, if we fetch the `PerReplicaStates` only once on update, before calling the `DocCollection` ctor, and pass the `PerReplicaStates` object to the `DocCollection` instead, we can probably achieve a similar result with reduced uncertainty after `DocCollection` construction.

There is another branch that experimented with making `DocCollection`, `Slice` and `Replica` immutable as well for PRS-enabled collections [https://github.com/cowpaths/fullstory-solr/pull/84], but that is beyond the scope of this Jira ticket.

  was:
The current implementation of PRS requires an extra param to the `DocCollection`, the `PrsSupplier`, which fetches the PRS states from ZK when `get` is called. The implementation of this supplier, `LazyPrsSupplier`, only fetches the state on the first call.

While this flow does work properly, it might introduce some unnecessary complexity:
# PRS entry fetching from ZK is done either during or after the `DocCollection` construction. This could be a bit inconsistent with the existing non-PRS `DocCollection` design, in which `DocCollection` is simply an immutable container that does not fetch data after its instantiation.
# The lazy fetching could introduce some uncertainty as to when exactly the fetching happens (and whether any ZooKeeper IO exceptions arise).

My guess is that the lazy loading was introduced in https://issues.apache.org/jira/browse/SOLR-16580 to avoid fetching the PRS states multiple times in the ctor of `DocCollection`. However, if we fetch the `PerReplicaStates` only once on update, before calling the `DocCollection` ctor, and pass the `PerReplicaStates` object to the `DocCollection` instead, we can probably achieve a similar result with reduced uncertainty after `DocCollection` construction.


> Simplify PerReplicaStates (PRS) logic in DocCollection, replace PrsSupplier with actual PerReplicaStates param
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-16712
>                 URL: https://issues.apache.org/jira/browse/SOLR-16712
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: main (10.0), 9.1.1
>            Reporter: Patson Luk
>            Priority: Major
>
> The current implementation of PRS requires an extra param to the
> `DocCollection`, the `PrsSupplier`, which fetches the PRS states from ZK
> when `get` is called. The implementation of this supplier,
> `LazyPrsSupplier`, only fetches the state on the first call.
>
> While this flow does work properly, it might introduce some unnecessary
> complexity:
> # PRS entry fetching from ZK is done either during or after the
> `DocCollection` construction. This could be a bit inconsistent with the
> existing non-PRS `DocCollection` design, in which `DocCollection` is simply
> an immutable container that does not fetch data after its instantiation.
> # The lazy fetching could introduce some uncertainty as to when exactly
> the fetching happens (and whether any ZooKeeper IO exceptions arise).
>
> My guess is that the lazy loading was introduced in
> https://issues.apache.org/jira/browse/SOLR-16580 to avoid fetching the PRS
> states multiple times in the ctor of `DocCollection`. However, if we fetch
> the `PerReplicaStates` only once on update, before calling the
> `DocCollection` ctor, and pass the `PerReplicaStates` object to the
> `DocCollection` instead, we can probably achieve a similar result with
> reduced uncertainty after `DocCollection` construction.
>
> There is another branch that experimented with making `DocCollection`,
> `Slice` and `Replica` immutable as well for PRS-enabled collections
> [https://github.com/cowpaths/fullstory-solr/pull/84], but that is beyond
> the scope of this Jira ticket.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
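
The contrast drawn in the description, between a lazily fetching supplier and a states object fetched once before construction, can be illustrated with a minimal sketch. This is not Solr code: `ReplicaStates`, `FakeZk`, `LazyCollection` and `EagerCollection` are hypothetical stand-ins for `PerReplicaStates`, the ZooKeeper read, and the two `DocCollection` designs, and the real classes have different constructors and fields. The sketch only shows where the fetch (and any exception it throws) would occur in each design.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class PrsSketch {

  /** Hypothetical stand-in for PerReplicaStates: an immutable snapshot. */
  static final class ReplicaStates {
    private final Map<String, String> stateByReplica;

    ReplicaStates(Map<String, String> stateByReplica) {
      this.stateByReplica = Map.copyOf(stateByReplica);
    }

    String get(String replica) {
      return stateByReplica.get(replica);
    }
  }

  /** Hypothetical stand-in for the ZooKeeper read; counts how often it runs. */
  static final class FakeZk {
    final AtomicInteger reads = new AtomicInteger();

    ReplicaStates fetch() {
      reads.incrementAndGet();
      return new ReplicaStates(Map.of("core_node1", "active"));
    }
  }

  /** Lazy design: the fetch happens on first access, after construction. */
  static final class LazyCollection {
    private final Supplier<ReplicaStates> supplier;
    private ReplicaStates cached;

    LazyCollection(Supplier<ReplicaStates> supplier) {
      this.supplier = supplier;
    }

    synchronized ReplicaStates states() {
      if (cached == null) {
        cached = supplier.get(); // an IO failure would surface here, at some later call site
      }
      return cached;
    }
  }

  /** Eager design: the caller fetches once and passes in the result. */
  static final class EagerCollection {
    private final ReplicaStates states;

    EagerCollection(ReplicaStates states) {
      this.states = states;
    }

    ReplicaStates states() {
      return states; // never performs IO
    }
  }

  public static void main(String[] args) {
    FakeZk zk = new FakeZk();

    LazyCollection lazy = new LazyCollection(zk::fetch);
    System.out.println("reads after lazy construction: " + zk.reads.get());
    lazy.states();
    lazy.states();
    System.out.println("reads after two lazy accesses: " + zk.reads.get());

    ReplicaStates fetched = zk.fetch(); // any IO failure would surface here, before construction
    EagerCollection eager = new EagerCollection(fetched);
    eager.states();
    System.out.println("reads after eager construction and access: " + zk.reads.get());
  }
}
```

In both variants the states are read at most once per collection instance. The difference is that the eager variant performs the read at a known point in the caller, so the constructed object stays a plain immutable container.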