I'm climbing under the hood of BernoulliSampler for SPARK-3250, and I see this:

override def sample(items: Iterator[T]): Iterator[T] = {
  items.filter { item =>
    val x = rng.nextDouble()
    (x >= lb && x < ub) ^ complement
  }
}


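The clause (x >= lb && x < ub) accepts an item with the same probability as
(x < ub - lb), because x is drawn uniformly from [0, 1) (assuming
0 <= lb <= ub <= 1). The second form is faster and requires only one
parameter (the sampling fraction). Any caller asking for
BernoulliSampler(a, b) can equally well ask for BernoulliSampler(b - a).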
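
A quick way to sanity-check that claim is to compare acceptance rates
empirically. The sketch below is standalone and does not use Spark's classes:
SamplerCheck and acceptRate are hypothetical names made up for illustration,
the bounds 0.3 and 0.8 are arbitrary, and scala.util.Random stands in for
whatever rng the sampler actually holds.

```scala
import scala.util.Random

// Hypothetical standalone check; not part of Spark.
object SamplerCheck {
  // Fraction of n uniform draws from [0, 1) that satisfy the predicate.
  def acceptRate(seed: Long, n: Int)(accept: Double => Boolean): Double = {
    val rng = new Random(seed)
    Iterator.fill(n)(rng.nextDouble()).count(accept).toDouble / n
  }

  val lb = 0.3        // arbitrary example lower bound
  val ub = 0.8        // arbitrary example upper bound
  val n = 1000000     // number of simulated items

  // Two-sided range test, as in the sample() method quoted above.
  val rangeRate: Double = acceptRate(42L, n)(x => x >= lb && x < ub)

  // One-sided test against the sampling fraction ub - lb.
  val fractionRate: Double = acceptRate(42L, n)(x => x < ub - lb)

  def main(args: Array[String]): Unit = {
    println(s"range form:    $rangeRate")
    println(s"fraction form: $fractionRate")
  }
}
```

Both rates should land close to ub - lb. The check only compares rates: even
with the same seed, the two predicates accept different individual draws.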

Is there some angle I'm missing?
