tveasey commented on PR #14420:
URL: https://github.com/apache/lucene/pull/14420#issuecomment-2761706401
I discussed this a bit with Mayya.
So the underlying issue is how we account for the gain for degree 1
vertices. We say that the gain contribution from any vertex is at least 2, i.e.
$\max\left(2, \frac{|N_s(v)|}{4}\right)$. This forces us to insert degree 1
vertices which was intentional. The problem is we can get at most +1
contribution from them to $g_{tot}$. Mayya tried it and when we correct the
gain to $\min\left(\max\left(2, \frac{|N_s(v)|}{4}\right), |N_s(v)|\right)$ we
terminate before exhausting the heap, as expected.
Arguably this is slightly more conceptually correct. However, it represents
a change in behaviour and invalidates the testing Mayya's done. The current
behaviour increases the relative size of $J$ for graphs with many degree one
vertices. This seems conceptually ok. In the case the heap is exhausted we've
just picked $J = V$ the set of all vertices and reverted to the original merge
procedure which is clearly harmless. However, this is going to be an extreme
edge case since we normally only have a small fraction of degree one vertices
and we'll nearly always terminate on $g_{tot}<g_{exit}$.
Therefore, my vote is to just go with Ben's fix and leave the $g_{tot}$
calculation as is.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]