If the true requirement is merely to process the document once, then I agree with Shawn's solution. You needn't concern yourself with knowing who the leader is. If it's somehow important that the code be guaranteed to execute on the leader in particular, then take inspiration from an existing update request processor (URP). I'm thinking of this one: https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/update/processor/SkipExistingDocumentsProcessorFactory.java#L217 (method isLeader). Also note that this particular URP implements RunAlways, so you can place it before or after DistributedUpdateProcessor (DURP) as you please.
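For reference, Shawn's suggestion (quoted below) would look roughly like this in solrconfig.xml. This is only a sketch: the chain name and com.example.MyOnceProcessorFactory are hypothetical placeholders for your own factory, and the rest of the chain is just the usual default ordering.

```xml
<updateRequestProcessorChain name="run-once-chain">
  <!-- Hypothetical custom URP: listed BEFORE the distributed processor,
       so it runs on the node that first receives the update rather than
       on every replica. -->
  <processor class="com.example.MyOnceProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- Declared explicitly so its position in the chain is under your control. -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Note that the node that first receives the update isn't necessarily the shard leader, which is why the isLeader approach only matters if the leader specifically is required.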
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Sun, May 9, 2021 at 12:23 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/8/2021 8:05 PM, lamine lamine wrote:
> > I only want the code be run once per shard. One way to guarantee that
> > is to do it in the leader, as there is always one leader per shard. I don't
> > want to run it in all the replicas. The code to run is "external" it
> > doesn't touch any document.
>
> What I am saying is that when you define the processor chain, include
> DistributedUpdateProcessor in it. It gets added by SolrCloud even if
> you don't include it, so it's better to have control over its placement.
> And then place your update processor in the list *BEFORE*
> DistributedUpdateProcessor.
>
> This should accomplish your goals automatically, with no code required.
>
> Thanks,
> Shawn