Yeah, definitely not. The only requirement is that the DataReader/WriterFactory
must support at least one DataFormat.
> how are we going to express the capability of a given reader for its
> supported format(s), or specific support for each of "real-time data in row
> format, and history data in columnar form"?
Is it required for a DataReader to support all known DataFormats?
Hopefully not, as assumed by the 'throw' in the interface. Then, specifically,
how are we going to express the capability of a given reader for its supported
format(s), or specific support for each of "real-time data in row format, and
history data in columnar form"?
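To make the 'throw' pattern concrete, here is a hypothetical sketch (the trait and method names below are made up for illustration and do not match the actual DataSourceV2 interfaces): each format-specific factory method gets a default implementation that throws, and a concrete factory overrides at least one of them, which is how it advertises the format(s) it supports.

// Hypothetical illustration only; names are not the real proposal's.
trait DataReader[T] {
  def next(): Boolean
  def get(): T
}

trait DataReaderFactory[Row, Batch] {
  // Defaults throw; a concrete factory must override at least one.
  // A row-only source would override createRowReader and keep the columnar default.
  def createRowReader(): DataReader[Row] =
    throw new UnsupportedOperationException("row format not supported")
  def createColumnarReader(): DataReader[Batch] =
    throw new UnsupportedOperationException("columnar format not supported")
}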
Yes, it sounds good to me. We can upgrade both Parquet from 1.8.2 to 1.8.3 and
ORC from 1.4.1 to 1.4.3 in our upcoming Spark 2.3.1 release.
Thanks for your efforts! @Henry and @Dongjoon
Xiao
2018-04-16 14:41 GMT-07:00 Henry Robinson:
> Seems like there aren't any objections. I'll pick this thread back up when
> a Parquet maintenance release has happened.
Seems like there aren't any objections. I'll pick this thread back up when
a Parquet maintenance release has happened.
Henry
On 11 April 2018 at 14:00, Dongjoon Hyun wrote:
> Great.
>
> If we can upgrade the parquet dependency from 1.8.2 to 1.8.3 in Apache
> Spark 2.3.1, let's upgrade the orc dependency from 1.4.1 to 1.4.3 as well.
Hello,
Thank you very much for your response, Anastasie! Today I think I managed it
by dropping partitions in runJob or submitJob (I don't remember exactly which)
in DAGScheduler.
If it doesn’t work properly after some tests, I will follow your approach.
Thank you,
Thodoris
> On 16 Apr 2018, a
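For reference (not from the original thread), a similar effect can often be achieved without touching DAGScheduler: SparkContext.runJob has a public overload that takes the list of partitions to compute, so only those partitions are ever scheduled. A minimal sketch, assuming a running SparkContext named sc:

// Compute only the listed partitions; the others are never scheduled.
val rdd = sc.parallelize(0 until 1000, numSlices = 10)
val keepPartitions = Seq(0, 2, 5)
val sums: Array[Int] = sc.runJob(rdd, (it: Iterator[Int]) => it.sum, keepPartitions)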
Hi all,
I think this is doable using the mapPartitionsWithIndex method of RDD.
Example:
val partitionIndex = 0 // Your favorite partition index here
val rdd = spark.sparkContext.parallelize(Array.range(0, 1000))
// Replace the elements of partition `partitionIndex` with [-10, ..., 0]
val fixed = rdd.mapPartitionsWithIndex { (idx, iter) =>
  if (idx == partitionIndex) (-10 to 0).iterator else iter
}
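(Not part of the original message, but a quick way to check the result with the same RDD API:)

fixed.glom().collect()  // the partition at partitionIndex should now be Array(-10, ..., 0)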