Pre-register your classes:
```
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(Class.forName("[[B")) // byte[][]
    kryo.register(classOf[MyCustomClass]) // hypothetical placeholder; the original snippet was truncated here
  }
}
```
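Then, as a sketch, point Spark at the registrator (the class name below assumes MyKryoRegistrator lives in the default package; use the fully qualified name for your package):
```
import org.apache.spark.SparkConf

// Enable Kryo and the registrator defined above.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "MyKryoRegistrator")
  // Optional: fail fast when an unregistered class is serialized.
  .set("spark.kryo.registrationRequired", "true")
```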
You can do it with a custom RDD implementation.
You will mainly implement "getPartitions" (the logic to split your input
into partitions) and "compute" (to produce and return the values on the
executors).
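A minimal sketch of such an RDD follows; the partition class, offsets, and readBlock helper are hypothetical stand-ins for the poster's own block-reading library:
```
import org.apache.spark.rdd.RDD
import org.apache.spark.{Partition, SparkContext, TaskContext}

// Hypothetical partition describing one block of the input file.
case class BlockPartition(index: Int, start: Long, end: Long) extends Partition

// Minimal custom-RDD sketch: getPartitions splits the input into block
// partitions, compute reads one block on an executor.
class BlockRDD(sc: SparkContext, path: String, blockOffsets: Seq[(Long, Long)])
    extends RDD[String](sc, Nil) {

  override protected def getPartitions: Array[Partition] =
    blockOffsets.zipWithIndex.map { case ((start, end), i) =>
      BlockPartition(i, start, end)
    }.toArray

  override def compute(split: Partition, context: TaskContext): Iterator[String] = {
    val block = split.asInstanceOf[BlockPartition]
    readBlock(path, block.start, block.end)
  }

  // Placeholder: open `path`, seek to `start`, and yield the lines of this block.
  private def readBlock(path: String, start: Long, end: Long): Iterator[String] =
    Iterator.empty
}
```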
On Tue, 17 Sep 2019 at 08:47, Marcelo Valle wrote:
Just to be more clear about my requirements: what I have is actually a
custom format, with a header, a summary, and multi-line blocks. I want to
create tasks per block, not per line. I already have a library that reads an
InputStream and outputs an Iterator of Block, but now I need to integrate
this with Spark.
Hi,
I want to create a custom RDD which will read n lines in sequence from a
file (which I call a block), and each block should be converted to a Spark
DataFrame to be processed in parallel.
Question: do I have to implement a custom Hadoop InputFormat to achieve
this? Or is it possible to do it another way?
Hi,
If Spark applications write data into Alluxio, can the WriteType be configured?
Thanks,
Mark
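For reference, the Alluxio client exposes the write type through the property alluxio.user.file.writetype.default (values such as MUST_CACHE, CACHE_THROUGH, THROUGH, ASYNC_THROUGH); the sketch below assumes it is passed to Spark as a JVM option at submit time, which is how Alluxio's Spark integration is commonly configured; verify against your Alluxio client version:
```
// Typically set at submit time, e.g.:
//   spark-submit \
//     --conf 'spark.driver.extraJavaOptions=-Dalluxio.user.file.writetype.default=CACHE_THROUGH' \
//     --conf 'spark.executor.extraJavaOptions=-Dalluxio.user.file.writetype.default=CACHE_THROUGH' \
//     ...
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("alluxio-writetype-example").getOrCreate()

// Hypothetical Alluxio master host/port and output path; the write goes through
// Alluxio's Hadoop-compatible filesystem and uses the write type configured above.
spark.range(100).write.parquet("alluxio://alluxio-master:19998/out/example")
```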
Hi folks,
Posted this some time ago but the problem continues to bedevil us. I'm
including a (slightly edited) stack trace that results from this error. If
anyone can shed any light on what exactly is happening here and what we can
do to avoid it, that would be much appreciated.
org.apache.spark.
The Spark MLlib streaming training models work with DStreams. Is there any
way to use them with Spark Structured Streaming?
I am trying to integrate MLeap with Spark Structured Streaming, but I am
facing a problem: Spark Structured Streaming with Kafka works with
DataFrames, while MLeap requires a LeapFrame. So I tried to convert the
DataFrame to a LeapFrame using the MLeap Spark support library function
(toSparkLeapFrame)
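As a rough sketch of the conversion the poster describes (the import path and implicit syntax are assumptions about the MLeap spark-support module and may differ between MLeap versions):
```
// Assumed import path for MLeap's Spark support implicits; verify against your
// MLeap version.
import ml.combust.mleap.spark.SparkSupport._
import org.apache.spark.sql.DataFrame

// Converts a Spark DataFrame into MLeap's SparkLeapFrame representation,
// the toSparkLeapFrame call referenced by the poster.
def toLeap(df: DataFrame) = df.toSparkLeapFrame
```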
I want to parse the struct of the data dynamically, then write the data to
Delta Lake; I think it can automatically merge the schema.
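For what it's worth, a minimal sketch of opting into schema merging on a Delta Lake append (the path and DataFrame are hypothetical, and this assumes the Delta Lake connector is on the classpath):
```
import org.apache.spark.sql.DataFrame

// Append with schema merging enabled so new columns are added to the table schema.
def appendWithSchemaMerge(df: DataFrame, path: String): Unit = {
  df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save(path)
}
```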
2019-09-17
lk_spark
From: Tathagata Das
Sent: 2019-09-17 16:13
Subject: Re: how can I dynamic parse json in kafka when using Structured Streaming
To: "lk_spark"
Cc: "user.spar
You can use the *from_json* built-in SQL function to parse JSON.
https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#from_json-org.apache.spark.sql.Column-org.apache.spark.sql.Column-
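For example, a minimal sketch (the Kafka source DataFrame, column names, and schema fields below are hypothetical):
```
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// from_json needs the schema up front; the fields here are hypothetical.
val schema = StructType(Seq(
  StructField("id", StringType),
  StructField("payload", StringType)
))

// kafkaDf is assumed to be a streaming DataFrame read from Kafka.
val parsed = kafkaDf
  .selectExpr("CAST(value AS STRING) AS json")
  .select(from_json(col("json"), schema).alias("data"))
  .select("data.*")
```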
On Mon, Sep 16, 2019 at 7:39 PM lk_spark wrote:
> hi,all :
> I'm using Structured