Thanks Mikhail, will try these approaches.

On Thu, Jun 15, 2023 at 5:40 PM Mikhail Khludnev <m...@apache.org> wrote:
> From the other POV, a node can be excluded from LB pool via balancer API
> before restart and brought back then.
>
> On Wed, Jun 14, 2023 at 6:09 PM Saksham Gupta
> <saksham.gu...@indiamart.com.invalid> wrote:
>
> > Have you configured an URL for health check? Which one?
> > The load balancer checks if the solr port is in use or not. If yes, then it
> > continues sending the search requests.
> >
> > Or you have something like rolling restart/recycle scenarios executed?
> > No, we don't have anything like that configured.
> > @ufuk We may have various scenarios where we have to restart a solr node,
> > like changing heap, gc settings, etc. How can I avoid getting 5xx in these
> > scenarios?
> >
> > On Wed, Jun 14, 2023 at 1:02 PM Mikhail Khludnev <m...@apache.org> wrote:
> >
> > > Saksham, can you comment on
> > > > if a certain port is up or not and based on that send the request to
> > > > that node.
> > >
> > > Have you configured an URL for health check? Which one?
> > >
> > > > the coordinator node goes down after a request is sent from lb.
> > > Do you mean nodes are failing more often than healthcheck occur?
> > >
> > > Or you have something like rolling restart/recycle scenarios executed?
> > >
> > > On Wed, Jun 14, 2023 at 8:52 AM Saksham Gupta
> > > <saksham.gu...@indiamart.com.invalid> wrote:
> > >
> > > > @Ufuk We are using a load balancer to avoid a single point of failure i.e.
> > > > if all the requests have a single coordinator node then it would be a major
> > > > issue if this solr node goes down.
> > > >
> > > > @Mikhail Khludnev We already have a health check configured on load
> > > > balancer, but the requests will fail if the coordinator node goes down
> > > > after request is sent from lb.
> > > > Further explaining, the load balancer will check if a certain port is up or
> > > > not and based on that send the request to that node.
> > > > The issue is observed for cases where the coordinator node goes down
> > > > after a request is sent from lb.
> > > >
> > > > Please let me know if I am missing something here. Any other suggestions?
> > > >
> > > > On Tue, Jun 13, 2023 at 7:12 PM Mikhail Khludnev <m...@apache.org> wrote:
> > > >
> > > > > Well, probably it's what Solr Operator can provide on Kubernetes.
> > > > >
> > > > > On Tue, Jun 13, 2023 at 10:47 AM ufuk yılmaz <uyil...@vivaldi.net.invalid>
> > > > > wrote:
> > > > >
> > > > > > Just wondered, solr cloud itself can handle node failings and load
> > > > > > balancing. Why use an external cloud load balancer?
> > > > > >
> > > > > > --ufuk yilmaz
> > > > > >
> > > > > > --
> > > > > >
> > > > > > > On 13 Jun 2023, at 10:28, Mikhail Khludnev <m...@apache.org> wrote:
> > > > > > >
> > > > > > > Hello
> > > > > > > You can configure healthcheck
> > > > > > > https://cloud.google.com/load-balancing/docs/health-check-concepts#criteria-protocol-http
> > > > > > > with Solr's ping request handler
> > > > > > > https://solr.apache.org/guide/solr/latest/deployment-guide/ping.html .
> > > > > > > Also, Google cloud has sophisticated Traffic Director, which can also
> > > > > > > suit for node failover.
> > > > > > >
> > > > > > > On Tue, Jun 13, 2023 at 9:13 AM Saksham Gupta
> > > > > > > <saksham.gu...@indiamart.com.invalid> wrote:
> > > > > > >
> > > > > > >> Hi team,
> > > > > > >> We need help with the strategy used to request data from solr cloud.
> > > > > > >>
> > > > > > >> *Current Searching Strategy:*
> > > > > > >> We are using solr cloud 8.10 having 8 nodes with data sharded on the
> > > > > > >> basis of an implicit route parameter. We send a search http request on
> > > > > > >> google's network load balancer which divides requests amongst the 8
> > > > > > >> solr nodes.
> > > > > > >>
> > > > > > >> *Problem with this strategy:*
> > > > > > >> If solr on any one of the nodes is down, the requests that come to this
> > > > > > >> node give 5xx.
> > > > > > >>
> > > > > > >> We are thinking of other strategies like
> > > > > > >> 1. adding 2 vanilla nodes to this cluster(which will contain no data)
> > > > > > >> which will be used for aggregating and serving requests i.e. instead of
> > > > > > >> sending requests from lb to the 8 nodes, we will be sending the requests
> > > > > > >> to the new nodes which will send internal requests on other nodes and
> > > > > > >> fetch required data.
> > > > > > >> 2. Instead of dividing requests using a load balancer, we can use
> > > > > > >> zookeeper to connect with solr cloud.
> > > > > > >>
> > > > > > >> Would these strategies work? Is there a more optimized way using which
> > > > > > >> we can request on solr?
> > > > > > >>
> > > > > > >
> > > > > > > --
> > > > > > > Sincerely yours
> > > > > > > Mikhail Khludnev
> > > > > >
> > > > >
> > > > > --
> > > > > Sincerely yours
> > > > > Mikhail Khludnev
> > > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
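
For reference, here is a rough sketch of the difference between the port check described in the thread and an HTTP health check against Solr's ping request handler. It is only an illustration under assumptions: the node URLs, the port 8983, the collection name `products`, and all function names are hypothetical and not taken from the thread. The `/solr/<collection>/admin/ping` path and the `"status": "OK"` field follow the ping handler documentation linked above; the actual load balancer health check would be configured on the Google Cloud side rather than in client code.

```python
import json
import urllib.request
from typing import Callable, Iterable, List


def ping_url(base_url: str, collection: str) -> str:
    """Build the ping handler URL for one node (collection name is an assumption)."""
    return f"{base_url.rstrip('/')}/solr/{collection}/admin/ping?wt=json"


def http_get(url: str, timeout: float = 2.0) -> str:
    """Fetch a URL and return the body as text; raises OSError on network failures."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_healthy(base_url: str, collection: str,
               fetch: Callable[[str], str] = http_get) -> bool:
    """A node counts as healthy only if the ping handler answers with status OK.

    An open port alone is not enough: connection errors, HTTP errors and
    non-JSON bodies are all treated as unhealthy.
    """
    try:
        body = json.loads(fetch(ping_url(base_url, collection)))
    except (OSError, ValueError):
        return False
    return isinstance(body, dict) and body.get("status") == "OK"


def healthy_nodes(nodes: Iterable[str], collection: str,
                  fetch: Callable[[str], str] = http_get) -> List[str]:
    """Return the subset of nodes that currently pass the ping check."""
    return [node for node in nodes if is_healthy(node, collection, fetch)]
```

Calling `healthy_nodes(["http://solr-1:8983", "http://solr-2:8983"], "products")` would return only the nodes whose ping handler responds with OK. This does not cover the case raised in the thread where a node goes down after a request has already been forwarded; for planned restarts, the suggestion above of removing the node from the LB pool first still applies.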