RE: Ask about Pyspark ML interaction

2020-11-09 Thread Du, Yi
ing? Thanks,

From: Sean Owen [mailto:sro...@gmail.com]
Sent: Monday, November 9, 2020 9:58 AM
To: Du, Yi
Cc: user@spark.apache.org
Subject: Re: Ask about Pyspark ML interaction

I think you have this flipped around - you want to one-hot encode, then compute interactions.

Re: Ask about Pyspark ML interaction

2020-11-09 Thread Sean Owen
I think you have this flipped around - you want to one-hot encode first, then compute interactions. As it is, you are treating the product of {0,1,2,3,4} x {0,1,2,3,4} as if it were a categorical index. That product takes only 10 distinct values (0, 1, 2, 3, 4, 6, 8, 9, 12, 16), nowhere near 25, and is probably not what you intend.

On Mon, Nov 9, 2020 at 7
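
The point about ordering can be checked with a small sketch in plain Python (no Spark required). It assumes two categorical columns that have each been indexed to the levels 0..4, as in the thread; the helper names `one_hot` and `interact` are made up for illustration and are not PySpark APIs. Multiplying the raw indices collapses many category pairs onto the same number, while one-hot encoding first and then taking all pairwise products keeps every pair distinct.

```python
from itertools import product

# Assumed setup: two categorical columns, each already indexed to levels 0..4.
levels = range(5)

# Interaction applied directly to the indices: just multiply them.
raw_products = {a * b for a, b in product(levels, levels)}


def one_hot(i, n=5):
    """Hypothetical helper: length-n indicator vector for level i (all levels kept)."""
    return [1 if j == i else 0 for j in range(n)]


def interact(u, v):
    """Hypothetical helper: all pairwise products of two vectors, flattened."""
    return [x * y for x in u for y in v]


# One-hot encode first, then interact: one 25-slot vector per (a, b) pair.
encoded = {
    tuple(interact(one_hot(a), one_hot(b)))
    for a, b in product(levels, levels)
}

print(sorted(raw_products))             # [0, 1, 2, 3, 4, 6, 8, 9, 12, 16]
print(len(raw_products), len(encoded))  # 10 25
```

In PySpark the same ordering would presumably be a `StringIndexer`, then `OneHotEncoder`, then `Interaction` stage from `pyspark.ml.feature`. This sketch keeps all five levels per column, so the resulting dimensionality may differ from an encoder configured to drop one level.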