Re: Spark Structured streaming - Kafka - slowness with query 0

KhajaAsmath Mohammed Wed, 21 Oct 2020 02:36:39 -0700

Thanks. Is there an option to limit the number of records per batch, for 
example to process only 10,000 at a time, or a property we can pass? That way 
we could control how much data each batch handles.
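
For reference, the Kafka source in Structured Streaming has a 
`maxOffsetsPerTrigger` option that caps the total number of offsets read per 
micro-batch, split across the topic partitions. Below is a minimal PySpark 
sketch of how it might be set. The broker address, topic name, and the helper 
names `kafka_source_options` and `build_stream` are assumptions made for 
illustration, not anything taken from this thread, and the cap is a rate limit 
rather than an exact per-batch record count.

```python
def kafka_source_options(bootstrap_servers, topic,
                         max_offsets_per_trigger=10000,
                         starting_offsets="latest"):
    """Build the option map for the Structured Streaming Kafka source.

    Hypothetical helper: only the option keys are Spark's; the function
    name and defaults are illustrative.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # "latest" skips the existing backlog; "earliest" replays it.
        "startingOffsets": starting_offsets,
        # Upper bound on offsets consumed per trigger, across all partitions.
        "maxOffsetsPerTrigger": str(max_offsets_per_trigger),
    }


def build_stream(spark, options):
    """Apply the options to a Kafka readStream and return the DataFrame.

    `spark` is an existing SparkSession with the Kafka connector available.
    """
    reader = spark.readStream.format("kafka")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


# Assumed broker and topic names, for illustration only.
options = kafka_source_options("localhost:9092", "events")
```

With a running session, `build_stream(spark, options)` would return the 
streaming DataFrame. Lowering the limit gives smaller, faster batches at the 
cost of taking longer to catch up on a backlog.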


> On Oct 21, 2020, at 12:11 AM, lec ssmi <[email protected]> wrote:
> 
> 
>     Structured Streaming also uses a micro-batch mechanism under the hood. 
> The first batch seems to be slower than later ones; I often run into this 
> as well. It appears to be related to how the batches are divided. 
>    On the other hand, Spark's batch size is usually larger than Flume's 
> transaction batch size. 
> 
> 
> KhajaAsmath Mohammed <[email protected]> wrote on Wed, Oct 21, 2020 at 12:19 PM:
>> Yes. Changing back to latest worked, but I still see slowness compared to 
>> Flume. 
>> 
>>> On Oct 20, 2020, at 10:21 PM, lec ssmi <[email protected]> wrote:
>>> 
>>> 
>>> Does your application start by catching up on the earlier Kafka data? 
>>> 
>>> Lalwani, Jayesh <[email protected]> wrote on Wed, Oct 21, 2020 at 2:19 AM:
>>>> Are you getting any output? Streaming jobs typically run forever and keep 
>>>> processing data as it arrives on the input. If a streaming job is working 
>>>> well, it will typically generate output at a certain cadence.
>>>> 
>>>>  
>>>> 
>>>> From: KhajaAsmath Mohammed <[email protected]>
>>>> Date: Tuesday, October 20, 2020 at 1:23 PM
>>>> To: "user @spark" <[email protected]>
>>>> Subject: [EXTERNAL] Spark Structured streaming - Kafka - slowness with 
>>>> query 0
>>>> 
>>>>  
>>>> 
>>>> Hi,
>>>> 
>>>>  
>>>> 
>>>> I have started using Spark Structured Streaming to read data from Kafka, 
>>>> and the job is very slow. The number of output rows keeps increasing in 
>>>> query 0 and the job runs forever. Any suggestions, please? 
>>>> 
>>>>  
>>>> 
>>>> <image001.png>
>>>>  
>>>> 
>>>> Thanks,
>>>> 
>>>> Asmath
