Hi Roland,

Not much has been shared apart from "it's not working". The latest partition offset is used when the size of a TopicPartition is negative. You can verify this by checking for the following log entry in the logs (note it is logged at DEBUG level, so DEBUG logging has to be enabled for the Kafka source to see it):

logDebug(s"rateLimit $tp size is $size")

If you've double-checked and still think it's an issue, please file a JIRA and attach the Spark configuration + logs.
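For reference, here is a minimal sketch of where maxOffsetsPerTrigger is set on the Kafka source (topic, brokers and checkpoint path below are placeholders, not taken from your setup). The value is read from the source options when the query starts:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.Trigger

    object KafkaRateLimitSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("kafka-rate-limit-sketch")
          .getOrCreate()

        // Kafka source with a per-micro-batch cap on offsets.
        // Broker list and topic name are placeholders.
        val stream = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
          .option("subscribe", "events")
          .option("startingOffsets", "earliest")
          .option("maxOffsetsPerTrigger", "1000000")
          .load()

        // 5 minute trigger, matching the duration mentioned below.
        // Checkpoint path is a placeholder.
        val query = stream.writeStream
          .format("console")
          .option("checkpointLocation", "/tmp/checkpoints/kafka-rate-limit-sketch")
          .trigger(Trigger.ProcessingTime("5 minutes"))
          .start()

        query.awaitTermination()
      }
    }

Enabling DEBUG logging for the Kafka source classes (package org.apache.spark.sql.kafka010) should make the rateLimit entry above show up in the driver logs.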
logDebug(s"rateLimit $tp size is $size") If you've double checked and still think it's an issue please file a jira and attach Spark configuration + logs. BR, G On Wed, Nov 20, 2019 at 9:33 AM Roland Johann <roland.joh...@phenetic.io.invalid> wrote: > Hi All, > > changing maxOffsetsPerTrigger and restarting the job won’t apply to the > batch size. This is somehow bad as we currently use a trigger duration of > 5minutes which consumes only 100k messages with an offset lag in the > billions. Decreasing trigger duration affects also micro batch size - but > its then only a few hundreds. Spark version in use is 2.4.4. > > I assume that spark uses previous micro batch sizes and runtimes to > somehow calculate current batch sizes based on trigger durations. AFAIK > structured streaming isn’t back pressure aware, so this behavior is strange > on multiple levels. > > Any help appreciated. > > Kind Regards > Roland >