> > in memory and there is no spill over to disk, it might not be a big issue
> > (of course there will still be memory, CPU and network overhead/latency).
> >
> > If you are looking at storing the data on disk (e.g. as part of a
> > checkpoint or explicit storage), then there can be substantial I/O
> > activity.
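To make the I/O point concrete, here is a minimal Scala sketch; it is not from the thread itself, and the local master, the /tmp checkpoint directory, and Spark 2.1+'s Dataset.checkpoint are assumptions made for illustration. Caching with MEMORY_ONLY keeps the data in RAM (no disk I/O unless it spills), while an explicit checkpoint writes the DataFrame out to the checkpoint directory and therefore does incur disk I/O.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    // Hypothetical session, for illustration only.
    val spark = SparkSession.builder()
      .appName("cache-vs-checkpoint-sketch")
      .master("local[*]")
      .getOrCreate()

    val df = spark.range(0, 1000000L).toDF("id")

    // In-memory cache: no disk I/O unless the cached data spills.
    df.persist(StorageLevel.MEMORY_ONLY)
    df.count()  // materializes the cache

    // Explicit checkpoint: truncates lineage but writes the data to disk,
    // so it adds real I/O on top of the memory/CPU/network overhead.
    spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")
    val checkpointed = df.checkpoint()  // eager by default (Spark 2.1+)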
>
> From: Xi Shen
> Date: Monday, October 17, 2016 at 2:54 AM
> To: Divya Gehlot, Mungeol Heo
> Cc: "user @spark"
> Subject: Re: Is spark a right tool for updating a dataframe repeatedly

I think most of the "big data" tools, like Spark and Hive, are not designed
to edit data. They are only designed to query data. I wonder in what
scenario you need to update a large volume of data repeatedly.
On Mon, Oct 17, 2016 at 2:00 PM Divya Gehlot wrote:
If my understanding of your query is correct:
In Spark, DataFrames are immutable; you can't update a DataFrame.
You have to create a new DataFrame to update the current one.
Thanks,
Divya
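As a small illustration of the point above, here is a hedged Scala sketch; the column names, sample data, and SparkSession setup are invented for the example and do not come from the thread. The withColumn call never modifies the original DataFrame; it returns a new one, which is how an "update" is expressed in Spark.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.when

    val spark = SparkSession.builder()
      .appName("update-by-deriving-a-new-dataframe")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "old"), (2, "old")).toDF("id", "status")

    // "Updating" row 1: this does not mutate df; it produces a new DataFrame.
    val updated = df.withColumn(
      "status",
      when($"id" === 1, "new").otherwise($"status"))

    updated.show()  // id 1 now reads "new"; df itself is unchanged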
On 17 October 2016 at 09:50, Mungeol Heo wrote:
> Hello, everyone.
>
> As I mentioned in the title,