You could create a custom accumulator backed by a list (a linked list or similar).
Some examples that could help:
https://towardsdatascience.com/custom-pyspark-accumulators-310f63ca3c8c
https://stackoverflow.com/questions/34798578/how-to-create-custom-list-accumulator-i-e-listint-int
On Tue, Aug 3, 2021 at 1:23 PM
Hi Team,
We are using rdd.foreach(lambda x : do_something(x))
Our use case requires collecting, in a list, the error messages that
come up in the exception block of the method do_something.
Since this will be running on the executors, a global list won't work here. As
the state needs to be sh