Hi,
I'm facing a very strange error that occurs halfway through long-running
Spark SQL jobs:
18/01/12 22:14:30 ERROR Utils: Aborting task
java.io.EOFException: reached end of stream after reading 0 bytes; 96 bytes
expected
at org.spark_project.guava.io.ByteStreams.readFully(ByteStreams.java:735)
at
Did you consider doing string processing to build the SQL expression, which
you can then execute with spark.sql(...)?
Some examples:
https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
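For instance, a minimal sketch of that approach (the table and column names
here, `events`, `user_id`, `latency`, are made up for illustration):

```python
# Build a Spark SQL expression as a plain string, then hand it to
# spark.sql(...). All table/column names below are hypothetical.
def build_agg_query(table, group_col, agg_cols):
    """Compose a GROUP BY query with one avg() per aggregate column."""
    aggs = ", ".join("avg({0}) AS avg_{0}".format(c) for c in agg_cols)
    return "SELECT {0}, {1} FROM {2} GROUP BY {0}".format(group_col, aggs, table)

query = build_agg_query("events", "user_id", ["latency"])
print(query)
# spark.sql(query)  # needs an active SparkSession with the table registered
```

The string-building part is ordinary Python, so you can unit-test the
generated SQL without a SparkSession; only the final spark.sql(...) call
needs a live cluster.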
Cheers
On 21 November 2017 at 03:27, Aakash Basu wrote:
> Hi all,
>
> Any help? PFB.
>
> Thanks
> …hat and then read it again and get
> your stats?
>
> On Fri, 17 Nov 2017, 10:03 Fernando Pereira wrote:
>
>> Dear Spark users
>>
>> Is it possible to take the output of a transformation (RDD/Dataframe) and
>> feed it to two independent transformations without recalculating the first
>> transformation and without caching the whole dataset?
Dear Spark users
Is it possible to take the output of a transformation (RDD/Dataframe) and
feed it to two independent transformations without recalculating the first
transformation and without caching the whole dataset?
Consider the case of a very large dataset (1+TB) which underwent several
trans