Spark json data - avro schema validation

2017-11-11 Thread Barath Ramamoorthy
Hi, I have a Spark Streaming application that receives logs containing encoded JSON. The JSON complies with an Avro schema, and as part of the process I'm converting the JSON to a data class, which is a row in a Dataset. It's a nested object. In this scenario I'm looking to validate
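
As a rough illustration of the kind of check being asked about, here is a minimal sketch that validates a nested, plainly encoded JSON record against a small subset of Avro schema features (records, arrays, unions, and a few primitives). It uses only the Python standard library instead of an Avro library. The schema, the record name LogEvent, its field names, and the validate helper are all hypothetical and not taken from the original message.

```python
import json

# Avro primitive names mapped to the Python types that json.loads produces.
# Only a subset of Avro is covered here (no maps, enums, fixed, or bytes).
PRIMITIVES = {
    "null": type(None),
    "boolean": bool,
    "int": int,
    "long": int,
    "float": (int, float),
    "double": (int, float),
    "string": str,
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    # A list in schema position is an Avro union: any one branch may match.
    if isinstance(schema, list):
        if any(not validate(value, branch, path) for branch in schema):
            return []
        return [f"{path}: no union branch matches {value!r}"]
    # A bare string names a primitive type.
    if isinstance(schema, str):
        expected = PRIMITIVES.get(schema)
        if expected is None:
            return [f"{path}: unsupported type {schema!r}"]
        # bool is a subclass of int in Python, so reject it for other types.
        if isinstance(value, bool) and schema != "boolean":
            return [f"{path}: expected {schema}, got boolean"]
        if not isinstance(value, expected):
            return [f"{path}: expected {schema}, got {type(value).__name__}"]
        return []
    kind = schema.get("type")
    if kind == "record":
        if not isinstance(value, dict):
            return [f"{path}: expected record, got {type(value).__name__}"]
        errors = []
        for field in schema["fields"]:
            name = field["name"]
            if name not in value:
                # A field may be absent only if the schema declares a default.
                if "default" not in field:
                    errors.append(f"{path}.{name}: missing required field")
                continue
            errors.extend(validate(value[name], field["type"], f"{path}.{name}"))
        return errors
    if kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array, got {type(value).__name__}"]
        errors = []
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
        return errors
    # Handles the {"type": "string"} spelling of a primitive.
    if isinstance(kind, str):
        return validate(value, kind, path)
    return [f"{path}: unsupported schema {schema!r}"]


# Hypothetical schema and records, for illustration only.
schema = json.loads("""
{"type": "record", "name": "LogEvent", "fields": [
  {"name": "id", "type": "long"},
  {"name": "message", "type": ["null", "string"]},
  {"name": "source", "type": {"type": "record", "name": "Source", "fields": [
    {"name": "host", "type": "string"},
    {"name": "tags", "type": {"type": "array", "items": "string"}}
  ]}}
]}
""")

good = json.loads('{"id": 1, "message": null, "source": {"host": "a", "tags": ["x"]}}')
bad = json.loads('{"id": "1", "message": null, "source": {"host": "a", "tags": [1]}}')

good_errors = validate(good, schema)
bad_errors = validate(bad, schema)
```

Returning a list of errors with a path for each one, instead of failing on the first mismatch, makes it easier to see which nested field is malformed. The sketch assumes plain JSON, not the Avro JSON encoding in which union values are wrapped in a type tag. It also ignores integer range limits, named type references, and logical types.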

Re: is there a way for removing hadoop from spark

2017-11-11 Thread yohann jardin
Hey Cristian, You don’t need to remove anything. Spark has a standalone mode. Actually that’s the default. https://spark.apache.org/docs/latest/spark-standalone.html When building Spark (and you should build it yourself), just use the option that suits you: https://spark.apache.org/docs/latest/

is there a way for removing hadoop from spark

2017-11-11 Thread Cristian Lorenzetto
Given that I don't need HDFS, is there a way to remove Hadoop from Spark completely? Is YARN the only dependency in Spark? Is there no Java or Scala (JDK languages) YARN-like library to embed in a project instead of calling external servers? Is the YARN library difficult to customize? I made different ques

Re: how to replace hdfs with a custom distributed fs ?

2017-11-11 Thread Reynold Xin
You can implement the Hadoop FileSystem API for your distributed java fs and just plug into Spark using the Hadoop API. On Sat, Nov 11, 2017 at 9:37 AM, Cristian Lorenzetto < cristian.lorenze...@gmail.com> wrote: > hi i have my distributed java fs and i would like to implement my class > for sto

how to replace hdfs with a custom distributed fs ?

2017-11-11 Thread Cristian Lorenzetto
Hi, I have my own distributed Java file system and I would like to implement my own class for storing data in Spark. How can I do that? Is there an example of how to do it?

Re: Some Spark MLLIB tests failing due to some classes not being registered with Kryo

2017-11-11 Thread Jorge Sánchez
No luck running the full test suites with mvn test from the main folder or just mvn -pl mllib. Any other suggestion would be much appreciated. Thank you. 2017-11-11 12:46 GMT+00:00 Marco Gaido : > Hi Jorge, > > then try running the tests not from the mllib folder, but on Spark base > directory.

Re: Some Spark MLLIB tests failing due to some classes not being registered with Kryo

2017-11-11 Thread Marco Gaido
Hi Jorge, then try running the tests not from the mllib folder, but from the Spark base directory. If you want to run only the tests in mllib, you can specify the project using the -pl argument of mvn. Thanks, Marco 2017-11-11 13:37 GMT+01:00 Jorge Sánchez : > Hi Marco, > > Just mvn test from the m

Some Spark MLLIB tests failing due to some classes not being registered with Kryo

2017-11-11 Thread Jorge Sánchez
Hi Dev, I'm running the MLLIB tests in the current Master branch and the following Suites are failing due to some classes not being registered with Kryo: org.apache.spark.mllib.MatricesSuite org.apache.spark.mllib.VectorsSuite org.apache.spark.ml.InstanceSuite I can solve it by registering the f