Yes, true. I just wonder whether anybody has taken measurements across the
different types of problems in the Big Data area and produced a scientific
analysis of how much time is wasted on serialization/deserialization to
support the figure of 80% ;)
> On 24 Jun 2016, at 10:35, Jacek Laskowski wrote
Hello,
The question arose because DataFrames benefit from the Tungsten improvements
and the Catalyst optimizer. So Spark would have to do some additional work to
convert an RDD to a DataFrame in order to use the optimizations/improvements
available to DataFrames.
Regards,
Pranav
On Fri, Jun 24, 20
Hi Jorn,
You can measure the time for ser/deser yourself using the web UI or SparkListeners.
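
A minimal sketch of the SparkListener approach, assuming Spark 2.x with `SparkSession` and a local master. The names `SerDeTimingListener` and `SerDeTimingExample` are hypothetical. Note that the task metrics used here (`executorDeserializeTime`, `resultSerializationTime`) cover task deserialization and result serialization only, so they are a rough indicator and not a direct measure of RDD/DataFrame row conversion.

```scala
import java.util.concurrent.atomic.AtomicLong

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
import org.apache.spark.sql.SparkSession

// Hypothetical listener: sums per-task timings reported by Spark (milliseconds).
class SerDeTimingListener extends SparkListener {
  val deserializeMs = new AtomicLong(0L)     // time spent deserializing tasks on executors
  val resultSerializeMs = new AtomicLong(0L) // time spent serializing task results
  val runMs = new AtomicLong(0L)             // total executor run time, for comparison

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val metrics = taskEnd.taskMetrics
    if (metrics != null) { // metrics may be missing, e.g. for failed tasks
      deserializeMs.addAndGet(metrics.executorDeserializeTime)
      resultSerializeMs.addAndGet(metrics.resultSerializationTime)
      runMs.addAndGet(metrics.executorRunTime)
    }
  }
}

object SerDeTimingExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("serde-timing")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()
    import spark.implicits._

    val listener = new SerDeTimingListener
    spark.sparkContext.addSparkListener(listener)

    // Round trip: RDD -> DataFrame -> RDD, then an action to force execution.
    val rdd = spark.sparkContext.parallelize(1 to 1000000).map(i => (i, i.toString))
    val df = rdd.toDF("id", "label")
    df.rdd.map(_.getInt(0)).count()

    // Listener events arrive asynchronously, so read the totals after stopping.
    spark.stop()
    println(s"task deserialization: ${listener.deserializeMs.get} ms")
    println(s"result serialization: ${listener.resultSerializeMs.get} ms")
    println(s"executor run time:    ${listener.runMs.get} ms")
  }
}
```

The same per-task figures appear in the web UI on the stage detail page, so the listener is mainly useful when you want to aggregate them across jobs.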
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Jun 24, 2016 at 10:14
I would push the Spark people to provide equivalent functionality. In the end
it is a serialization/deserialization process, which should not be done back
and forth because it is one of the more costly aspects of processing. It needs
to convert Java objects to a binary representation. It is
Hi,
I do not profess at all that this reply has any correlation with the
advanced people :)
However, in general a DataFrame adds a two-dimensional (tabular) structure
to an RDD, which by itself is a construct that cannot be optimised because
it has no schema.
Now converting RDD to DF
Hi,
I've been asking a similar question myself too! Thanks for sending it to
the mailing list!
Going from an RDD to a Dataset triggers a job to calculate a schema (unless
the RDD is RDD[Row]).
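
A minimal sketch of the RDD[Row] case, assuming Spark 2.x with `SparkSession` and a local master. The object name, column names, and sample data are made up for illustration. Here the schema is supplied explicitly through `createDataFrame(rowRdd, schema)`, so Spark has nothing to work out from the data.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ExplicitSchemaExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("explicit-schema")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()

    // An RDD[Row] carries no type information of its own.
    val rowRdd = spark.sparkContext
      .parallelize(Seq((1, "a"), (2, "b")))
      .map { case (id, label) => Row(id, label) }

    // The schema is declared up front rather than derived from the data.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("label", StringType, nullable = true)
    ))

    val df = spark.createDataFrame(rowRdd, schema)
    df.printSchema()

    // Going back: df.rdd exposes the rows as an RDD[Row] again.
    val backToRdd = df.rdd
    println(backToRdd.count())

    spark.stop()
  }
}
```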
I *think* that transitioning from a Dataset to an RDD is almost a no-op
since a Dataset requires more to
Hello,
I am trying to understand the cost of converting an RDD to a DataFrame and
back. Would converting back and forth very frequently hurt performance?
I do observe that some operations like join are implemented very differently
for RDD (pair) and DataFrame, so I am trying to figure out the cost of