Hi Willy, I'm just checking in to see if there's anything left I can help address here.
Thank you, Anthony On Wed, Mar 13, 2024 at 4:59 PM Willy Tarreau <w...@1wt.eu> wrote: > > Hi Anthony, > > On Wed, Mar 13, 2024 at 12:33:54PM -0400, Anthony Deschamps wrote: > > Hi Willy, > > > > My original concern was that if two servers had values of lb_server_key > > that are close to each other, then there could be some overlap in their > > values of lb_server_key + node_index;, which is why I was looking for a > > reasonable hash function to combine them. > > I know but you can't really avoid it anyway, and that's why we place > many nodes. I tried an attempt in the past where nodes were evenly > spaced by dividing by the number of entries etc. It was a disaster. > As soon as a server went down, the next one would take its extra load > for example. So anything you try to do to avoid some corner cases create > other ones, while overall, a random distribution (like a hash does) > gives great results. > > > I based it on boost::hash_combine, > > but since writing that I came across https://stackoverflow.com/a/50978188 > > explaining why boost::hash_combine isn't necessarily a good hash function. > > Hehe. We've all written our own hash functions for a given purpose at one > point in time, and they're usually fine inside their context and bad once > used outside. That's also where XXH and such stuff shines, it brings > generic functions that are good all over the space. In our case, the > full_hash() is just an avalanche non-linear 1:1 mapping between the input > and the output which suffices to avoid resonance effects that come from > repeated additions, it's perfectly suited here I think. > > > I agree that it's not too critical to have a few collisions, but I'm > > reaching > > the limits of my knowledge here. Does this change address your concerns? > > I've attached an updated patch, and here's the diff between them: > > > > diff --git a/src/lb_chash.c b/src/lb_chash.c > > index c77222854..d7a1b2539 100644 > > --- a/src/lb_chash.c > > +++ b/src/lb_chash.c > > @@ -66,9 +66,7 @@ static inline void chash_dequeue_srv(struct server *s) > > */ > > static inline u32 chash_compute_node_key(struct server *s, unsigned > > node_index) > > { > > - u32 server_key = s->lb_server_key; > > - u32 index_key = full_hash(node_index); > > - return server_key ^= index_key + 0x9e3779b9 + (server_key << 6) + > > (server_key >> 2); > > + return full_hash(s->lb_server_key + node_index); > > } > > > > /* Compute the base server key that will be used to compute node keys. > > Servers > > @@ -112,8 +110,7 @@ static inline u32 chash_compute_server_key(struct > > server *s) > > case SRV_HASH_KEY_ADDR: > > switch (srv_addr.family) { > > case AF_INET: > > - u32 addr_key = full_hash(srv_addr.addr.v4.s_addr); > > - key ^= addr_key + 0x9e3779b9 + (key << 6) + (key >> 2); > > + key = full_hash(key + srv_addr.addr.v4.s_addr); > > break; > > case AF_INET6: > > key = XXH32(srv_addr.addr.v6.s6_addr, 16, key); > > Yeay I think it addresses everything. I'll re-test your updated patch > tomorrow hoping that this time I'll merge it :-) > > Thanks for your patience! > Willy