You might be able to do this with multiple aggregations on avg(col("col1")
== "cat1") etc., but how about pivoting the DataFrame first so that you get
columns like "cat1" holding 1 or 0? You would end up with (columns x
categories) new columns if you want to count all categories in all cols. But
then it
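
A minimal sketch of the indicator idea, in plain Python rather than Spark so that it stays self-contained: each value is turned into 1 or 0 depending on whether it equals a given category, and the moving average of that indicator gives the share of the category within the window. The helper name moving_share and the window size of 2 are assumptions made for illustration; the rows are the first three col1 values from the sample data. In Spark, the comparison would likely need to be cast to a numeric type before averaging.

```python
from collections import deque

# First three (sqltimestamp, col1) pairs from the sample data in the question.
rows = [
    (1618574879, "cat1"),
    (1618574880, "cat1"),
    (1618574881, "cat5"),
]


def moving_share(values, category, window):
    """Trailing moving average of the 0/1 indicator (value == category).

    Hypothetical helper: the first few results use a shorter window
    because fewer than `window` values have been seen yet.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` indicators
    shares = []
    for value in values:
        recent.append(1 if value == category else 0)
        shares.append(sum(recent) / len(recent))
    return shares


col1 = [category for _, category in rows]

# Share of "cat1" in col1 over a trailing window of 2 rows (assumed size).
cat1_share = moving_share(col1, "cat1", window=2)
print(cat1_share)
```

Repeating the call for each category and each column produces the (columns x categories) set of series mentioned above.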
Hello everyone,
I am trying to apply a moving average to categorical data like the sample
below, which is synthetic data that I generated myself.
sqltimestamp,col1,col2,col3,col4,col5
1618574879,cat1,cat4,cat2,cat5,cat3
1618574880,cat1,cat3,cat4,cat2,cat5
1618574881,cat5,cat3,cat4,cat2,cat1
1618574882,cat2,