Hi list, I am looking for an efficient way to apply a training pipeline to each group produced by DataFrame.groupBy.
This is very easy with a pandas UDF (i.e. groupBy().apply()), but I am not able to find the equivalent for a Spark pipeline. The ultimate goal is to fit multiple models, one per group of data.

Thanks,
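
P.S. For reference, here is roughly what the pandas-based version looks like. This is only a minimal sketch under assumptions: the column names (group, x, y) and both function names are made up for illustration, np.polyfit stands in for a real model, and applyInPandas is the Spark 3.x spelling of the grouped pandas UDF that groupBy().apply() used to take.

```python
import numpy as np
import pandas as pd


def fit_group(pdf: pd.DataFrame) -> pd.DataFrame:
    """Fit y = slope * x + intercept on the rows of a single group.

    np.polyfit is only a stand-in here; any single-node model
    (e.g. scikit-learn) could be trained on `pdf` instead.
    """
    slope, intercept = np.polyfit(pdf["x"].to_numpy(), pdf["y"].to_numpy(), deg=1)
    return pd.DataFrame(
        {
            "group": [pdf["group"].iloc[0]],
            "slope": [float(slope)],
            "intercept": [float(intercept)],
        }
    )


def fit_all_groups(sdf):
    """Run fit_group once per group of a Spark DataFrame.

    `sdf` is assumed to be a Spark DataFrame with the hypothetical
    columns group (string), x (double) and y (double).
    """
    return sdf.groupBy("group").applyInPandas(
        fit_group,
        schema="group string, slope double, intercept double",
    )
```

Each group is handed to fit_group as a plain pandas DataFrame, so every group has to fit in the memory of one executor. What I am missing is the same per-group pattern with a Spark pipeline doing the fitting.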