Hi Anoop,
I don't see the exception you mentioned in the link. I can use spark-avro
to read the sample file users.avro in Spark successfully. Do you have the
details of the union issue?
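For reference, the spark-avro read path being described looks roughly like the sketch below, assuming Spark 1.6's sqlContext and the com.databricks:spark-avro package on the classpath; `load_avro` is a hypothetical helper name, not part of any API.

```python
# Sketch only: assumes a Spark 1.6 SQLContext and the com.databricks:spark-avro
# package on the driver classpath. `load_avro` is a made-up helper name.
def load_avro(sql_context, path):
    # spark-avro registers the "com.databricks.spark.avro" data source.
    return sql_context.read.format("com.databricks.spark.avro").load(path)

# In a pyspark shell this would be used as:
#   df = load_avro(sqlContext, "users.avro")
#   df.printSchema()
```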
On Sat, Feb 27, 2016 at 10:05 AM, Anoop Shiralige wrote:
> Hi Jeff,
>
> Thank you for looking into the post.
Hi Jeff,
Thank you for looking into the post.
I had explored the spark-avro option earlier. Since we have a union of
multiple complex data types in our Avro schema, we couldn't use it.
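For context, a schema with a union of complex types looks roughly like this (the record and field names here are made up for illustration; the actual schema is not shown in the thread):

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    {
      "name": "payload",
      "type": [
        "null",
        {"type": "record", "name": "Click", "fields": [{"name": "url", "type": "string"}]},
        {"type": "map", "values": "string"}
      ]
    }
  ]
}
```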
A couple of things I tried:
- https://stackoverflow.com/questions/31261376/how-to-read-pyspark-avro-file-and-ext
An Avro Record is not supported by the pickler; you would need to create a
custom pickler for it. But I don't think it's worth doing that. Actually,
you can use the spark-avro package to load the Avro data and then convert
it to an RDD if necessary.
https://github.com/databricks/spark-avro
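To illustrate what "a custom pickler" means in plain Python: the stdlib copyreg module lets you register a reduction function for a type pickle cannot otherwise handle. This is only an analogy, since Avro's GenericRecord lives on the JVM and the real issue is in the Spark SerDe layer; the Record class below is made up.

```python
import copyreg
import pickle

class Record:
    """Stand-in for a type pickle rejects out of the box."""
    def __init__(self, fields):
        self.fields = fields
        self._decode = lambda b: b.decode("utf-8")  # lambdas are not picklable

def _reduce_record(rec):
    # Tell pickle how to rebuild a Record: call Record(rec.fields),
    # skipping the unpicklable lambda entirely.
    return (Record, (rec.fields,))

# Registering the reduction is the "custom pickler" step.
copyreg.pickle(Record, _reduce_record)

roundtripped = pickle.loads(pickle.dumps(Record({"name": "alice"})))
```

Without the copyreg registration, pickling the instance would fail on the lambda in its `__dict__`; with it, pickle uses the reduction and the round trip succeeds.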
On Thu, Feb 11, 2016 at 10:38 PM,
Hi All,
I am working with Spark 1.6.0 and the pySpark shell specifically. I have a
JavaRDD[org.apache.avro.GenericRecord] which I have converted to a Python
RDD in the following way.
javaRDD = sc._jvm.java.package.loadJson("path to data", sc._jsc)  # custom JVM-side loader
javaPython = sc._jvm.SerDe.javaToPython(javaRDD)  # hand the JVM RDD to the Python side
from