I'm looking into this a bit further, thanks for bringing it up! Right now
the LSH implementation only uses OR-amplification. In practice this means
it selects too many candidates when doing approximate near neighbor search
and approximate similarity join. When we
add AND-ampl
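
To make the "too many candidates" point concrete, here is a rough sketch of the usual textbook amplification formula. It is not Spark's code. The function name and parameters are made up for illustration, and it assumes each hash function collides independently with probability p for a given pair.

```python
def candidate_probability(p: float, num_tables: int, hashes_per_table: int = 1) -> float:
    """Probability that a pair of points becomes a candidate.

    p                -- assumed collision probability of one hash function for this pair
    num_tables       -- OR-amplification: a collision in any table is enough
    hashes_per_table -- AND-amplification: all hashes within a table must collide
                        (1 means OR-amplification only)
    """
    # One table matches only if all of its hashes collide (AND).
    table_match = p ** hashes_per_table
    # The pair is a candidate if at least one table matches (OR).
    return 1.0 - (1.0 - table_match) ** num_tables


if __name__ == "__main__":
    # Compare a dissimilar pair (low p) with a similar pair (high p).
    for p in (0.2, 0.8):
        or_only = candidate_probability(p, num_tables=5)
        and_or = candidate_probability(p, num_tables=5, hashes_per_table=3)
        print(f"p={p}: OR only {or_only:.3f}, AND+OR {and_or:.3f}")
```

With OR only, a pair that rarely collides on a single hash still has a good chance of colliding in at least one of several tables, so it ends up in the candidate set. Requiring several hashes to agree within a table pushes that probability down much faster for dissimilar pairs than for similar ones.
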
In Spark.ML the coefficients are not "pivoted", meaning that Spark does not
set one of the coefficient sets equal to zero. You can read more about it here:
https://en.wikipedia.org/wiki/Multinomial_logistic_regression#As_a_set_of_independent_binary_regressions
You can translate your set of coefficient
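
As a rough illustration of why the two parameterizations are equivalent, the sketch below subtracts one class's coefficients and intercept from every class and checks that the predicted probabilities do not change. This is plain Python, not Spark code. The function names, the choice of class 0 as the reference, and the numbers are all made up for the example, and it assumes the model has no regularization (with regularization the unpivoted solution is no longer arbitrary).

```python
import math


def softmax(margins):
    """Turn per-class margins into probabilities (shifted by the max for stability)."""
    m = max(margins)
    exps = [math.exp(v - m) for v in margins]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(coefficients, intercepts, x):
    """coefficients holds one row of weights per class; x is a single feature vector."""
    margins = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(coefficients, intercepts)
    ]
    return softmax(margins)


def pivot(coefficients, intercepts, reference=0):
    """Subtract the reference class's parameters from every class.

    The reference class ends up with all-zero coefficients and a zero intercept.
    """
    ref_row = coefficients[reference]
    ref_b = intercepts[reference]
    pivoted = [[w - r for w, r in zip(row, ref_row)] for row in coefficients]
    pivoted_b = [b - ref_b for b in intercepts]
    return pivoted, pivoted_b


if __name__ == "__main__":
    # Hypothetical unpivoted model: 3 classes, 2 features.
    coefs = [[0.5, -1.0], [1.5, 0.25], [-2.0, 0.75]]
    icpts = [0.1, -0.3, 0.2]
    x = [1.0, 2.0]

    p_coefs, p_icpts = pivot(coefs, icpts, reference=0)
    print(p_coefs[0], p_icpts[0])                    # reference class is now all zeros
    print(class_probabilities(coefs, icpts, x))      # unpivoted probabilities
    print(class_probabilities(p_coefs, p_icpts, x))  # same values up to rounding
```

The probabilities match because adding the same constant to every class margin cancels out in the softmax, so shifting all coefficient sets by a common vector leaves the predictions untouched.
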