which is 128MB by default.
>
> Thanks,
> Manu Zhang
> On Feb 25, 2019, 2:58 AM +0800, Akshay Mendole wrote:
Hi,
We have dfs.blocksize configured to be 512MB, and we have some large
files in HDFS that we want to process with a Spark application. We want to
split the files into more, smaller splits to optimise for memory, but the
above-mentioned parameters are not working.
The max and min size params as below are con
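A minimal sketch of the knobs this usually comes down to, assuming the
DataFrame reader and the classic textFile API are in play; the 64MB figure
and the path are illustrative, not taken from the thread.
spark.sql.files.maxPartitionBytes is likely the 128MB default referred to
above.

import org.apache.spark.sql.SparkSession

// Sketch only: path and sizes are illustrative.
val spark = SparkSession.builder()
  .appName("small-splits-sketch")
  // DataFrame reader: cap each input partition at 64MB instead of 128MB.
  .config("spark.sql.files.maxPartitionBytes", 64L * 1024 * 1024)
  .getOrCreate()

// With dfs.blocksize = 512MB, each block should now map to ~8 partitions.
val df = spark.read.text("hdfs:///data/large.txt")
println(s"df partitions: ${df.rdd.getNumPartitions}")

// RDD API: textFile takes a minimum partition count directly, which is
// often a simpler lever than the Hadoop split-size parameters.
val rdd = spark.sparkContext.textFile("hdfs:///data/large.txt", 64)
println(s"rdd partitions: ${rdd.getNumPartitions}")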
, 11:28 pm, Ramandeep Singh Nanda wrote:
>
> Hi,
>
> Did you try increasing -XX:ConcGCThreads for the marking phase?
>
> System.gc() is not a good way to handle this, as it is not guaranteed to
> run and it triggers a high-pause, full GC.
>
> Regards,
> Ramandeep Singh
>
> On Tue, Dec 25, 2018, 07:0
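For reference, a hedged sketch of how the suggestion above would be
applied: -XX:ConcGCThreads is a standard HotSpot G1 flag whose default is
roughly a quarter of -XX:ParallelGCThreads; the value 8 is illustrative.
In practice these flags usually go on spark-submit via --conf; a SparkConf
is shown here for concreteness.

import org.apache.spark.SparkConf

// Sketch: raise the G1 concurrent-marking thread count so the marking
// cycle can keep up on large heaps. Value is illustrative.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
       "-XX:+UseG1GC -XX:ConcGCThreads=8")
  .set("spark.driver.extraJavaOptions",
       "-XX:+UseG1GC -XX:ConcGCThreads=8")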
:
> Do you have a lot of small files? Do you use S3 or similar? It could be
> that Spark is doing some IO-related tasks.
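A hedged sketch along the lines of this reply: with many small output
files, much of the time after the last job can go into the output
committer serially renaming files, especially on S3-like stores. The
settings below are standard Hadoop/Spark options, but whether they apply
here depends on the original poster's setup; note that commit algorithm v2
trades some atomicity for speed. Paths and counts are illustrative.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("commit-sketch").getOrCreate()

// Move file renames into task commit instead of one serial job-commit
// pass; v2 is faster, but the job commit is no longer atomic.
spark.sparkContext.hadoopConfiguration
  .setInt("mapreduce.fileoutputcommitter.algorithm.version", 2)

// Fewer output files also means fewer renames: coalesce before writing.
spark.read.text("hdfs:///data/input")
  .coalesce(32)
  .write.text("hdfs:///data/output")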
>
> > On 25.12.2018 at 12:51, Akshay Mendole wrote:
Hi,
As you can see in the picture below, the application's last job
finished at around 13:45, and I could see the output directory updated with
the results. Yet the application took a further 20 minutes to change its
status. What could be the reason for this? Is this a known behaviour? The
applica