Re: new datasource

2015-11-19 Thread Michael Armbrust
> From: Cheng, Hao [mailto:hao.ch...@intel.com]
> Sent: 19 November 2015 15:30
> To: Green, James (UK Guildford); dev@spark.apache.org
> Subject: RE: new datasource
>
> I think you probably need to write some code as you need to support the ES, there

RE: new datasource

2015-11-19 Thread james.gre...@baesystems.com
I think you probably need to write some code as you need to support the ES, there are 2 options per my understanding: Create a new Data Source from scratch, but you probably need to

RE: new datasource

2015-11-19 Thread Cheng, Hao
/sql/sources/interfaces.scala#L751 Or you can reuse most of the code in ParquetRelation in the new DataSource, but you also need to add your own logic; see https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRelation.scala#L285
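The "from scratch" option above amounts to implementing the traits in interfaces.scala. The sketch below shows the minimal shape, assuming the Spark 1.x data sources API (RelationProvider plus BaseRelation with PrunedFilteredScan); the class and package names (EsParquetRelation, example) and the schema are hypothetical placeholders, not part of the thread.

```scala
package example // hypothetical package name

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, EqualTo, Filter, PrunedFilteredScan, RelationProvider}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Entry point: lets the source be loaded via sqlContext.read.format("example")
class DefaultSource extends RelationProvider {
  override def createRelation(sqlContext: SQLContext,
                              parameters: Map[String, String]): BaseRelation =
    new EsParquetRelation(sqlContext)
}

// Hypothetical relation: push filters to ElasticSearch, read data from Parquet.
class EsParquetRelation(val sqlContext: SQLContext)
    extends BaseRelation with PrunedFilteredScan {

  override def schema: StructType =
    StructType(StructField("id", StringType) :: StructField("body", StringType) :: Nil)

  // Spark hands buildScan only the predicates it could translate into the
  // org.apache.spark.sql.sources.Filter hierarchy; everything else is
  // re-evaluated by Spark on top of the returned rows.
  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] = {
    val esQuery = filters.collect { case EqualTo(attr, value) => s"$attr:$value" }
    // ... send esQuery to ElasticSearch, then scan the matching Parquet data ...
    sqlContext.sparkContext.emptyRDD[Row] // placeholder body
  }
}
```

Because buildScan is only an optimization hint, returning extra rows is safe: Spark applies the original predicates again after the scan.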

new datasource

2015-11-19 Thread james.gre...@baesystems.com
We have written a new Spark DataSource that uses both Parquet and ElasticSearch. It is based on the existing Parquet DataSource. When I look at the filters being pushed down to buildScan I don’t get anything representing any filters based on UDFs – or for any fields generated by an explode
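This behaviour is expected: when planning a scan, Spark translates each Catalyst predicate into the public `org.apache.spark.sql.sources.Filter` hierarchy (EqualTo, GreaterThan, In, IsNull, And/Or/Not, ...), and only attribute-versus-literal comparisons have a representation there. A predicate over a UDF call, or over a column produced by `explode`, translates to nothing, so it never reaches buildScan and is instead applied by Spark after the scan. The following self-contained sketch (no Spark dependency; all type names here are illustrative stand-ins, not Spark's actual internals) models that translation rule:

```scala
// Toy model of Catalyst expressions, just enough to show the rule.
sealed trait Expr
case class Attr(name: String) extends Expr               // a plain column
case class Lit(value: Any) extends Expr                  // a literal
case class Equals(left: Expr, right: Expr) extends Expr  // a comparison
case class UdfCall(name: String, args: Seq[Expr]) extends Expr

// Toy model of the public source-filter API.
sealed trait SourceFilter
case class EqualTo(attribute: String, value: Any) extends SourceFilter

// Only attribute-vs-literal comparisons can be expressed as a source
// filter; anything involving a UDF (or a generated column) yields None
// and stays in Spark's own Filter operator above the scan.
def translate(e: Expr): Option[SourceFilter] = e match {
  case Equals(Attr(a), Lit(v)) => Some(EqualTo(a, v))
  case Equals(Lit(v), Attr(a)) => Some(EqualTo(a, v))
  case _                       => None
}
```

For example, `translate(Equals(Attr("id"), Lit("x")))` produces `Some(EqualTo("id", "x"))`, while `translate(Equals(UdfCall("myUdf", Seq(Attr("id"))), Lit(true)))` produces `None`, mirroring why UDF-based filters never appear in the array passed to buildScan.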