Hi all,
These days I am reading the StreamExecution code.
In the method constructNextBatch (around line 365), I see the value of
latestOffsets change, but I cannot find where the s.getOffset of the
uniqueSources is changed.
Here is the code link:
https://github.com/apache/spark/blob/m
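For reference, a toy sketch of the polling pattern I am asking about
(illustrative classes only, not Spark's real Source/Offset types): getOffset
is implemented by each source itself, so the value it returns can change
between calls without StreamExecution assigning to it.

trait Source { def getOffset: Option[Long] }

class ToySource extends Source {
  private var available = 0L
  // e.g. the upstream system (Kafka, files, ...) receives more data
  def newDataArrived(n: Long): Unit = { available += n }
  override def getOffset: Option[Long] =
    if (available == 0) None else Some(available)
}

val uniqueSources = Seq(new ToySource)
// Polled fresh on every batch, so the result can differ each time.
val latestOffsets = uniqueSources.map(s => s -> s.getOffset)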
Hi Marco,
I am sincerely obliged for your kind time and response. Can you please try
the solution that you have so kindly suggested?
It would be a lot of help if you could kindly execute the code that I have
given; I don't think that anyone has yet.
There are lots of fine responses to my question
A ForkJoinPool with task support would help in this case. You can create a
thread pool with a configured number of threads (make sure you have enough
cores) and submit jobs, i.e. Spark actions, to the pool, as sketched below.
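A minimal sketch of that approach, assuming Spark 2.x with a local master and
illustrative RDDs (on Scala 2.11, use scala.concurrent.forkjoin.ForkJoinPool
instead of the java.util.concurrent one):

import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[4]").appName("parallel-actions").getOrCreate()
val sc = spark.sparkContext

// Three independent RDDs whose actions should run as concurrent Spark jobs.
val rdds = Seq(sc.parallelize(1 to 100),
               sc.parallelize(101 to 200),
               sc.parallelize(201 to 300))

// Back the parallel collection with a bounded pool: one thread per concurrent job.
val parRdds = rdds.par
parRdds.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(3))

// Each count() is submitted from its own pool thread; .seq waits for and
// collects the results once all three jobs have finished.
val counts = parRdds.map(_.count()).seq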
On Fri, Aug 4, 2017 at 8:54 AM Raghavendra Pandey <
raghavendra.pan...@gmail.com> wrote:
> Did you t
On Fri, Aug 4, 2017 at 4:36 PM, Jean Georges Perrin wrote:
> Thanks Daniel,
>
> I like your answer for #1. It makes sense.
>
> However, I don't get why you say that there are always pending
> transformations... After you call an action, you should be "clean" from
> pending transformations, no?
>
Thanks Daniel,
I like your answer for #1. It makes sense.
However, I don't get why you say that there are always pending
transformations... After you call an action, you should be "clean" from pending
transformations, no?
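For concreteness, a rough sketch of the case I have in mind (illustrative
names, nothing from this thread):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("lineage").getOrCreate()
val sc = spark.sparkContext

val data = sc.parallelize(1 to 1000000)
val transformed = data.map(_ * 2).filter(_ % 3 == 0) // lazy: nothing runs yet

transformed.count() // action: the transformations above execute now
transformed.count() // they run again: the lineage is still attached to the RDD

transformed.cache()
transformed.count() // materializes the cache so later actions reuse it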
> On Aug 3, 2017, at 5:53 AM, Daniel Darabos
> wrote:
>
>
> On Wed,
Did you try SparkContext.addSparkListener?
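A rough sketch of what that looks like (the listener callbacks are the public
SparkListener API; the logging and the assumption that sc is the live
SparkContext are mine):

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

sc.addSparkListener(new SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    println(s"Job ${jobStart.jobId} started with ${jobStart.stageInfos.size} stage(s)")

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    println(s"Job ${jobEnd.jobId} ended with result ${jobEnd.jobResult}")
})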
On Aug 3, 2017 1:54 AM, "Andrii Biletskyi"
wrote:
> Hi all,
>
> What is the correct way to schedule multiple jobs inside the foreachRDD method
> and, importantly, await their results to ensure those jobs have completed
> successfully?
> E.g.:
>
> kafkaDStream.f
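For the question quoted above, one possible pattern (a sketch only: I'm
assuming kafkaDStream is a DStream[String], and the two actions are purely
illustrative) is to submit each action as a Future inside foreachRDD and block
until all of them finish, so any failure fails the batch:

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

implicit val ec: ExecutionContext = ExecutionContext.global

kafkaDStream.foreachRDD { rdd =>
  // Each Future triggers one Spark job; the jobs are scheduled concurrently.
  val jobs = Seq(
    Future { rdd.filter(_.nonEmpty).count() },
    Future { rdd.map(_.length.toLong).fold(0L)(_ + _) }
  )
  // Block until every job has finished; an exception here fails the whole batch.
  Await.result(Future.sequence(jobs), 10.minutes)
}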
I use CIFS and it works reasonably well; it is easily cross-platform and well
documented...
> On Aug 4, 2017, at 6:50 AM, Steve Loughran wrote:
>
>
>> On 3 Aug 2017, at 19:59, Marco Mistroni wrote:
>>
>> Hello
>> my 2 cents here, hope it helps
>> If you want to just play around with Spark, I'd
> On 3 Aug 2017, at 19:59, Marco Mistroni wrote:
>
> Hello
> my 2 cents here, hope it helps
> If you want to just play around with Spark, I'd leave Hadoop out; it's an
> unnecessary dependency that you don't need for just running a Python script.
> Instead, do the following:
> - go to the roo
Hi,
We are running a Spark Streaming application with the Kafka Direct Stream on
Spark version 1.6.
It had run for a few days without any errors or failed tasks, and then there
was an error creating a directory on one machine, as follows:
Job aborted due to stage failure: Task 1 in stage 158757.0 fai