Hello Sean,
Thank you for the heads-up!
The Interaction transformer won't help for my use case, as it returns a
vector that I won't be able to hash.
I will definitely dig further into custom transformations though.
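To make the idea concrete, here is a minimal pure-Python sketch of what such a custom transformation might compute: cross the column values into a single string, then hash that string into a fixed number of buckets. It is not Spark code. The column names, the sample row, the bucket count, and the helper functions are all hypothetical, and zlib.crc32 merely stands in for the MurmurHash3 function that Spark's FeatureHasher uses.

```python
import zlib

NUM_BUCKETS = 16  # hypothetical size of the hashed feature space


def cross(row, columns):
    """Join the selected column values into one string, e.g. 'sex=female&pclass=1'."""
    return "&".join(f"{c}={row[c]}" for c in columns)


def hash_bucket(term, num_buckets=NUM_BUCKETS):
    """Map a string to a bucket index (crc32 is a stand-in, not Spark's hash)."""
    return zlib.crc32(term.encode("utf-8")) % num_buckets


def hashed_cross_features(row, column_pairs, num_buckets=NUM_BUCKETS):
    """Build a dense count vector, adding 1.0 to the bucket of each crossed pair."""
    vec = [0.0] * num_buckets
    for pair in column_pairs:
        vec[hash_bucket(cross(row, pair), num_buckets)] += 1.0
    return vec


# Made-up, already bucketized Titanic-like row (assumption for illustration).
row = {"sex": "female", "pclass": 1, "age_bucket": 2}
pairs = [("sex", "pclass"), ("sex", "age_bucket"), ("pclass", "age_bucket")]
features = hashed_cross_features(row, pairs)
print(features)
```

The point of crossing into a string first is that a string is hashable, whereas the vector returned by Interaction is not.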
Thanks!
David
On Fri, Oct 1, 2021 at 15:49, Sean Owen wrote:
Are you looking for
https://spark.apache.org/docs/latest/ml-features.html#interaction ? That's
the closest built-in thing I can think of. Otherwise you can make custom
transformations.
On Fri, Oct 1, 2021, 8:44 AM David Diebold wrote:
Hello everyone,
In MLlib, I’m trying to rely essentially on pipelines to create features
out of the Titanic dataset, and showcase the power of feature hashing. I
want to:
- Apply bucketization on some columns (QuantileDiscretizer is fine)
- Then I want to cross all my columns