Hi all,
I am loading data from multiple CSV files into a DataFrame using a manually
defined schema.
Now, if certain records fail to match the provided schema, is there a way to
capture those rejected records while continuing to load the rest of the data
into the DataFrame?
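One approach that may help: Spark's CSV reader has a PERMISSIVE parse mode that
routes malformed rows into a designated corrupt-record column instead of failing
the load. A minimal sketch, assuming hypothetical paths and column names; note
that the corrupt-record column must be declared in your user-supplied schema
for it to be populated:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("csv-rejects-sketch").getOrCreate()

// Manually defined schema, plus an extra string column to hold malformed rows.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = true),
  StructField("name", StringType, nullable = true),
  StructField("_corrupt_record", StringType, nullable = true) // catches rejects
))

val df = spark.read
  .schema(schema)
  .option("header", "true")
  .option("mode", "PERMISSIVE")                           // keep going on bad rows
  .option("columnNameOfCorruptRecord", "_corrupt_record") // route them here
  .csv("hdfs:///data/input/*.csv")                        // hypothetical path

// Cache before splitting, so the files are not re-read for each filter.
df.cache()
val rejected = df.filter(df("_corrupt_record").isNotNull)
val clean    = df.filter(df("_corrupt_record").isNull).drop("_corrupt_record")
```

PERMISSIVE is the default mode; the alternatives are DROPMALFORMED, which
silently discards bad rows, and FAILFAST, which aborts on the first one, so the
corrupt-record column is the only way to actually keep the rejects.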
Has anyone come across an implementation of Depth First Search in Spark GraphX?
I am wondering whether it is possible with Spark GraphX. I searched a lot but
found only results for BFS. If anyone has an idea about it, please share it
with me. I would love to learn whether it is feasible in Spark GraphX.
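GraphX does not ship a DFS, and its Pregel abstraction is a poor fit for one,
because DFS is inherently sequential while Pregel parallelizes BFS-style
frontier expansion. One workaround, if the adjacency structure fits in driver
memory, is to build the neighbor lists with GraphX and run an ordinary
iterative DFS on the driver. A rough sketch under that assumption (the sample
graph and the dfs helper are illustrative, not a GraphX API):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeDirection, Graph, VertexId}
import scala.collection.mutable

// Iterative DFS over an adjacency map collected to the driver.
def dfs(adj: Map[VertexId, Seq[VertexId]], start: VertexId): Seq[VertexId] = {
  val visited = mutable.LinkedHashSet[VertexId]()
  val stack = mutable.Stack(start)
  while (stack.nonEmpty) {
    val v = stack.pop()
    if (!visited.contains(v)) {
      visited += v
      // Push neighbors in reverse so the first neighbor is explored first.
      adj.getOrElse(v, Seq.empty).reverse.foreach(stack.push)
    }
  }
  visited.toSeq
}

val sc = new SparkContext(new SparkConf().setAppName("dfs-sketch").setMaster("local[*]"))
val edges = sc.parallelize(Seq(Edge(1L, 2L, ()), Edge(1L, 3L, ()), Edge(2L, 4L, ())))
val graph = Graph.fromEdges(edges, defaultValue = ())

// collectNeighborIds gives each vertex its outgoing neighbor ids; collecting
// them to the driver only works while the adjacency fits in driver memory.
val adj = graph.collectNeighborIds(EdgeDirection.Out)
  .collect().map { case (v, ns) => v -> ns.toSeq }.toMap

println(dfs(adj, 1L)) // e.g. List(1, 2, 4, 3)
sc.stop()
```

For graphs too large for the driver you would have to encode the traversal
order into Pregel messages yourself, which is possible but awkward, since only
one vertex is effectively active at a time.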
Hello,
I am reading data from HDFS in a Spark application, and as far as I have read,
each HDFS block becomes one Spark partition by default. Is there any way to
select only one block from HDFS to read in my Spark application?
Thank you,
Thodoris
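Not through a public "read block N" API, as far as I know, but since each block
maps to one partition by default, you can approximate it by keeping a single
partition index. A rough sketch (the path and the index are hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("one-block-sketch"))

// textFile creates roughly one partition per HDFS block by default.
val rdd = sc.textFile("hdfs:///data/big.log") // hypothetical path
val wantedPartition = 0                       // assumption: the block you want

// Keep only the records from the chosen partition; other tasks emit nothing.
val oneBlock = rdd.mapPartitionsWithIndex { (idx, iter) =>
  if (idx == wantedPartition) iter else Iterator.empty
}
println(oneBlock.count())
```

Note that Spark still schedules a task for every partition; the non-matching
tasks just return empty iterators without consuming their input, so this avoids
most of the processing but not the scheduling overhead.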
You might want to check out the spark-on-k8s project,
or try using the Kubernetes support in the official Spark 2.3.0 release. (Yes, we
don't have an official Docker image yet, but you can build one with the script.)