Re: Blacklisting in Spark Stateful Structured Streaming

2020-11-20 Thread Eric Beabes
Yes, I agree that the blacklisting structure can be put in the user-defined state, but the state would still remain open for a long time, right? Am I misunderstanding something? I like the idea of blacklisting in a "Broadcast" variable, but I can't figure out how to use the "Broadcast" variable in the 'm

Re: Blacklisting in Spark Stateful Structured Streaming

2020-11-16 Thread Yuanjian Li
If you use the `flatMap/mapGroupsWithState` API for a "stateful" Structured Streaming (SS) job, the blacklisting structure can be put into the user-defined state. Using a third-party cache should also be a good choice. On Wed, Nov 11, 2020 at 6:54 AM, Eric Beabes wrote: > Currently we’ve a “Stateful” Spark Structured Streaming job th
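
The suggestion above (keeping the blacklist inside the user-defined state) can be modelled roughly as follows. This is a minimal plain-Python sketch, not Spark code: `IdState`, `update_state`, `run_batches`, and `THRESHOLD` are hypothetical names, and a dict stands in for the per-ID state that `flatMap/mapGroupsWithState` would manage in a real job. It assumes the aggregate is a simple running sum.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional

THRESHOLD = 3  # hypothetical limit on the number of messages per ID


@dataclass
class IdState:
    count: int = 0             # messages seen so far for this ID
    total: float = 0.0         # example aggregate: running sum of values
    blacklisted: bool = False  # once True, no aggregate is kept for this ID


def update_state(state: Optional[IdState], values: Iterable[float]) -> IdState:
    """Model of the per-ID update function, called once per micro-batch."""
    state = state or IdState()
    if state.blacklisted:
        # Ignore every further message for a blacklisted ID.
        return state
    for v in values:
        state.count += 1
        state.total += v
    if state.count > THRESHOLD:
        # Drop the aggregate and keep only a small blacklist marker.
        return IdState(blacklisted=True)
    return state


def run_batches(batches: List[Dict[str, List[float]]]) -> Dict[str, IdState]:
    """Apply update_state to each ID in each micro-batch, in order."""
    store: Dict[str, IdState] = {}  # stands in for Spark's state store
    for batch in batches:
        for key, values in batch.items():
            store[key] = update_state(store.get(key), values)
    return store
```

For example, `run_batches([{"a": [1.0, 2.0]}, {"a": [3.0, 4.0]}])` leaves ID `"a"` blacklisted after its fourth message. Note that under this design a small marker entry still stays in state for every blacklisted ID; only the aggregate itself is discarded.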

Blacklisting in Spark Stateful Structured Streaming

2020-11-10 Thread Eric Beabes
Currently we have a “Stateful” Spark Structured Streaming job that computes aggregates for each ID. I need to implement a new requirement: if the number of incoming messages for a particular ID exceeds a certain value, add this ID to a blacklist and remove the state for it. Going forwar