OK, on the number of partitions n: or, more to the point, the "optimum" number
of partitions depends on the size of your batch DataFrame, among other things,
and on the degree of parallelism at the endpoint where you will be writing to
the sink. If you require high parallelism because your tasks are fine grained,
then more, smaller partitions will keep the available cores busy; if your tasks
are coarse grained, fewer, larger partitions reduce scheduling overhead.
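As a rough sketch of that trade-off (the function name and the 128 MB target are my own hypothetical choices, not anything from the Spark API): size partitions to a target number of bytes each, but never go below the sink's write parallelism.

```python
import math

def suggest_num_partitions(data_size_bytes: int,
                           sink_parallelism: int,
                           target_partition_bytes: int = 128 * 1024 * 1024) -> int:
    """Hypothetical heuristic: roughly one partition per 128 MB of data,
    but at least as many partitions as the sink can write concurrently."""
    by_size = math.ceil(data_size_bytes / target_partition_bytes)
    return max(by_size, sink_parallelism)

# A 10 GB batch with a sink that accepts 16 concurrent writers
# gets 80 partitions (size-driven); a 1 MB batch still gets 16
# (parallelism-driven).
```

The right target size depends on your workload; the point is only that both the DataFrame size and the sink's parallelism should feed into the choice.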
Is this the point you are trying to implement?
I have a state data source which enables the state in SS (Structured
Streaming) to be rewritten, which enables repartitioning, schema
evolution, etc. via a batch query. The writer requires hash partitioning
against the group key, with the "desired number of partitions".
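For what the hash-partitioning requirement means in practice: every row with the same group key must land in the same partition, i.e. partition = hash(key) mod numPartitions. A plain-Python sketch of that assignment (simplified: CRC32 here stands in for Spark's Murmur3 hash, and the function name is my own):

```python
import zlib

def partition_for(group_key: str, num_partitions: int) -> int:
    """Assign a group key to a partition via hash(key) mod n.
    CRC32 is used here only as a deterministic stand-in for
    Spark's Murmur3 hash."""
    return zlib.crc32(group_key.encode("utf-8")) % num_partitions

# The assignment is deterministic, so all rows sharing a group key
# always map to the same partition.
```

In a Spark batch query this is what `df.repartition(numPartitions, col("group_key"))` gives you before the write, with numPartitions set to the desired number of partitions.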