Hi, Sean.
I've added a comment in the new class suggesting that Python users take a
look at Hyperopt etc.
Anyway I've created a pull request:
https://github.com/apache/spark/pull/31535
and all tests, style checks etc pass. Wish me luck :)
And thanks for the support :)
Phillip
On Mon,
It seems pretty reasonable to me. If it's a pull request we can code review
it.
My only question is whether it would be better to tell people to use
Hyperopt, and how much better this is than implementing randomization on
the grid.
But the API change isn't significant, so it's probably fine.
On Mon, Feb
Hi, Sean.
I don't think sampling from a grid is a good idea, as the optimum (min/max)
may lie between grid points. Unconstrained random sampling avoids this problem. To
this end, I have an implementation at:
https://github.com/apache/spark/compare/master...PhillHenry:master
It is unit tested and does not c
I was thinking ParamGridBuilder would have to change to accommodate a
continuous range of values, and that's not hard, though other code wouldn't
understand that type of value, like the existing simple grid builder.
It's all possible; I'm just wondering if simply randomly sampling the grid is
enough. Th
Hi, Sean.
Perhaps I don't understand. As I see it, ParamGridBuilder builds an
Array[ParamMap]. What I am proposing is a new class that also builds an
Array[ParamMap] via its build() method, so there would be no "change in the
APIs". This new class would, of course, have methods that defined the
se
I think that's a bit orthogonal - right now you can't specify continuous
spaces. The straightforward thing is to allow random sampling from a big
grid. You can create a geometric series of values to try, of course -
0.001, 0.01, 0.1, etc.
Yes I get that if you're randomly choosing, you can randomly
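To make the geometric series mentioned above concrete, here is a minimal sketch in plain Python. It is not Spark code, and the helper name geometric_grid is hypothetical; it only shows how a fixed list of candidate values such as 0.001, 0.01, 0.1 could be generated for a grid.

```python
def geometric_grid(start, ratio, count):
    """Return `count` values: start, start*ratio, start*ratio**2, ...

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return [start * ratio ** i for i in range(count)]


# Four candidate values, each ten times the previous one.
values = geometric_grid(0.001, 10, 4)
print(values)
```

Any optimum that falls between two neighbouring values of such a series is never tried, which is the concern raised elsewhere in this thread.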
Thanks, Sean! I hope to offer a PR next week.
Not sure about a dependency on the grid search, though - but happy to hear
your thoughts. I mean, you might want to explore logarithmic space evenly.
For example, something like "please search 1e-7 to 1e-4" leads to a
reasonably random sample being {3
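One way to read "explore logarithmic space evenly" is to draw the exponent uniformly and then exponentiate. The sketch below does that in plain Python for the 1e-7 to 1e-4 range from the message above. It is an assumption about what such sampling could look like, not the implementation from the pull request, and the function name sample_log_uniform is hypothetical.

```python
import math
import random


def sample_log_uniform(low, high, n, seed=None):
    """Draw n values whose base-10 logarithm is uniform on [log10(low), log10(high)].

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    if not 0 < low < high:
        raise ValueError("require 0 < low < high")
    rng = random.Random(seed)  # a local generator keeps runs reproducible
    log_low, log_high = math.log10(low), math.log10(high)
    return [10 ** rng.uniform(log_low, log_high) for _ in range(n)]


# Each decade (1e-7..1e-6, 1e-6..1e-5, 1e-5..1e-4) is equally likely to be hit.
samples = sample_log_uniform(1e-7, 1e-4, 5, seed=42)
print(samples)
```

Sampling uniformly on the raw values instead would put roughly 90% of the draws in the top decade, which is why the exponent is sampled here.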
I don't know of anyone working on that. Yes, I think it could be useful. I
think it might be easiest to implement by simply having some parameter to
the grid search process that says what fraction of all possible
combinations you want to randomly test.
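The "fraction of all possible combinations" idea can be sketched in plain Python as follows. This is an illustration only, not the Spark API: sample_grid is a hypothetical name, and the parameter names and values in the example grid are chosen purely for demonstration.

```python
import itertools
import random


def sample_grid(param_grid, fraction, seed=None):
    """Return a random subset of the full Cartesian grid.

    param_grid maps a parameter name to its list of candidate values.
    fraction is the share of all combinations to keep (0 < fraction <= 1).
    Hypothetical helper for illustration only.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    names = list(param_grid)
    # Enumerate every combination, one dict per grid point.
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*(param_grid[name] for name in names))
    ]
    k = max(1, round(fraction * len(combos)))
    # Sample without replacement so no combination is tested twice.
    return random.Random(seed).sample(combos, k)


# Illustrative grid: 3 x 4 = 12 combinations, of which a quarter are kept.
grid = {"regParam": [0.001, 0.01, 0.1], "maxIter": [10, 50, 100, 200]}
subset = sample_grid(grid, fraction=0.25, seed=0)
print(subset)
```

Note that every sampled point still lies on the grid, so this approach only reduces the number of evaluations; it does not explore values between grid points.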
On Fri, Jan 29, 2021 at 5:52 AM Phillip Henry