I haven't looked at the code, but judging by the number of digits, couldn't it
be a long wrapping around into negative territory?
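
To illustrate that guess, here is a minimal Java sketch. It assumes the leading digits are a signed 64-bit value (for example a ZooKeeper session ID, which is a long) turned into text with the usual long-to-string conversion. The class and method names are made up for illustration and are not taken from Solr or ZooKeeper.

```java
// Sketch only: assumes the digits before the node name are a signed 64-bit
// value rendered as decimal text. All names here are hypothetical.
final class SignedIdSketch {

    // Renders the ID the way string concatenation of a long would.
    static String signed(long id) {
        return Long.toString(id);
    }

    // Renders the same 64 bits as an unsigned decimal number.
    static String unsigned(long id) {
        return Long.toUnsignedString(id);
    }

    // True when the top bit is set, which is what makes a long print as negative.
    static boolean highBitSet(long id) {
        return (id >>> 63) == 1L;
    }

    public static void main(String[] args) {
        long fromListing = -5188057493699159958L; // first ID in the quoted listing
        long wrapped = Long.MAX_VALUE + 1;        // overflow wraps to Long.MIN_VALUE

        System.out.println(signed(fromListing));
        System.out.println(unsigned(fromListing));
        System.out.println(highBitSet(fromListing));
        System.out.println(signed(wrapped));
    }
}
```

If that assumption holds, the leading dash is just the sign of the long, and it shows up whenever the top bit of the value is set.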

On Tue 3 Dec 2024 at 02:55, Patrick Lok <patrick....@salesforce.com.invalid>
wrote:

> Hi,
>
> We are seeing some odd behavior with the Overseer ID that is causing
> overseer election problems in our cluster.
>
> Recently we noticed that one of our Solr 8 clusters is having trouble
> electing dedicated overseer hosts as leader. After some investigation, we
> found that we have "negative" Overseer IDs (Overseer IDs with a leading
> dash):
>
> [zk: localhost:2181(CONNECTED) 0] ls /overseer_elect/election
> [-5188057493699159958-1.1.1.15:8983_solr-n_0000192189,
> -5260098076001480373-1.1.1.19:8983_solr-n_0000192192,
> -5548288611309897871-1.1.1.28:8983_solr-n_0000192191,
> -6124715353171356222-1.1.1.18:8983_solr-n_0000192188,
> -6412935227404643144-1.1.1.22:8983_solr-n_0000192186,
> -6412935227404648050-1.1.1.89:8983_solr-n_0000192181,
> -6557083032988176767-1.1.1.105:8983_solr-n_0000192190,
> -6701159159471144532-1.1.1.219:8983_solr-n_0000192183]
>
>
> (The actual IP addresses are different from those pasted above.)
>
> Because of the leading dash in the Overseer ID,
> LeaderElector.getNodeName() returns
> "5188057493699159958-1.1.1.15:8983_solr" instead of "1.1.1.15:8983_solr",
> which causes quite a few issues.
>
> Does anyone know why we started seeing a leading dash on the initial set
> of digits in the Overseer ID? What generates those digits, Solr or
> ZooKeeper? Is there a way to fix it?
>
> A simple change to LeaderElector.NODE_NAME seems like an easy fix. But
> since there's no unit test around it, I'm a bit worried that it might break
> something elsewhere in the code.
>
> Thanks,
> Patrick
>
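
The parsing problem described in the quoted message can be reproduced in isolation. The sketch below is hedged. The `CURRENT` pattern is an assumption modeled on what LeaderElector.NODE_NAME looks like in Solr 8, written from memory, and should be checked against the actual source. `CANDIDATE` is a purely hypothetical alternative that allows an optional minus sign on a numeric prefix; it is not a tested Solr patch. The class and helper names are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: reproduces the reported mis-parse outside of Solr.
final class NodeNameSketch {

    // Assumption: modeled on Solr 8's LeaderElector.NODE_NAME, recalled from
    // memory. The lazy "(.*?-)" stops at the first dash it sees.
    static final Pattern CURRENT = Pattern.compile(".*?/?(.*?-)(.*?)-n_\\d+");

    // Hypothetical alternative: optional path, then a numeric prefix that may
    // carry a leading minus sign, then the node name.
    static final Pattern CANDIDATE = Pattern.compile("(?:.*/)?(-?\\d+-)(.*?)-n_\\d+");

    // Returns capture group 2, the part treated as the node name.
    static String nodeName(Pattern pattern, String electionNode) {
        Matcher m = pattern.matcher(electionNode);
        if (!m.matches()) {
            throw new IllegalStateException("Could not parse: " + electionNode);
        }
        return m.group(2);
    }

    public static void main(String[] args) {
        String negative = "-5188057493699159958-1.1.1.15:8983_solr-n_0000192189";
        String positive = "5188057493699159958-1.1.1.15:8983_solr-n_0000192189";

        System.out.println(nodeName(CURRENT, negative));   // digits leak into the name
        System.out.println(nodeName(CURRENT, positive));
        System.out.println(nodeName(CANDIDATE, negative));
        System.out.println(nodeName(CANDIDATE, positive));
    }
}
```

Checks like these could serve as a starting point for the missing unit test. Any real change would still need to be verified against every other place in Solr that parses election node names.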
