Re: External Data Source in Spark

2015-03-05 Thread Michael Armbrust
One other caveat: while writing up this example I realized that we made SparkPlan private, and we are already packaging 1.3-RC3... So you'll need a custom build of Spark for this to run. We'll fix this in the next release.

On Thu, Mar 5, 2015 at 5:26 PM, Michael Armbrust wrote:
> Currently we hav

Re: External Data Source in Spark

2015-03-05 Thread Michael Armbrust
> Currently we have implemented External Data Source API and are able to
> push filters and projections.
>
> Could you provide some info on how perhaps the joins could be pushed to
> the original Data Source if both the data sources are from same database.

First a disclaimer: This is an
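For context on why filters and projections push down but joins do not: in Spark 1.2/1.3 the public pushdown hook is per-relation. A `BaseRelation` mixing in `PrunedFilteredScan` receives the projected columns and the pushable filters for each scan, but `DataSourceStrategy` plans every scan independently, so a join spanning two relations is never offered to the source. A minimal sketch of that hook (the class name, table name, and schema are hypothetical, not from this thread):

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, EqualTo, Filter, PrunedFilteredScan}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical relation backed by an external database table.
class MyDatabaseRelation(table: String)(@transient val sqlContext: SQLContext)
  extends BaseRelation with PrunedFilteredScan {

  override def schema: StructType = StructType(Seq(
    StructField("id", IntegerType),
    StructField("name", StringType)))

  // Spark hands this relation its required columns and pushable filters;
  // there is no analogous callback for a join between two relations.
  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] = {
    // Translate the simple equality filters into a WHERE clause.
    val where = filters.collect { case EqualTo(attr, v) => s"$attr = '$v'" }
                       .mkString(" AND ")
    val query = s"SELECT ${requiredColumns.mkString(", ")} FROM $table" +
      (if (where.nonEmpty) s" WHERE $where" else "")
    // ... run `query` against the external database and wrap the results
    // in an RDD[Row]; elided here.
    sqlContext.sparkContext.emptyRDD[Row]
  }
}
```

Pushing a whole join would require intercepting the logical plan (e.g. with a custom strategy), which is what the private-API caveat later in the thread is about.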

Re: External Data Source in Spark

2015-03-02 Thread Akhil Das
Wouldn't it be possible with .saveAsNewAPIHadoopFile? How are you pushing the filters and projections currently?

Thanks
Best Regards

On Tue, Mar 3, 2015 at 1:11 AM, Addanki, Santosh Kumar <santosh.kumar.adda...@sap.com> wrote:
> Hi Colleagues,
>
> Currently we have implemented External Da

External Data Source in Spark

2015-03-02 Thread Addanki, Santosh Kumar
Hi Colleagues,

Currently we have implemented the External Data Source API and are able to push filters and projections. Could you provide some info on how the joins could be pushed to the original Data Source if both the data sources are from the same database? Briefly looked at DataSourceStra

Re: External Data Source in SPARK

2015-02-09 Thread Michael Armbrust
You need to pass the fully qualified class name as the argument to USING. Nothing special should be required to make it work for Python.

On Mon, Feb 9, 2015 at 10:21 AM, Addanki, Santosh Kumar <santosh.kumar.adda...@sap.com> wrote:
> Hi,
>
> We implemented an External Data Source by extendi
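As a hedged illustration of this answer: the string after USING must be the fully qualified name of the class that implements the data source's RelationProvider (often a class named DefaultSource). The class name `com.example.MySource` and the option below are made up for the sketch, standing in for the actual data source:

```scala
import org.apache.spark.sql.SQLContext

// `sc` is an existing SparkContext; the data source's jar must already be
// on the classpath of the driver and the executors.
val sqlContext = new SQLContext(sc)

// USING takes the fully qualified class name of the data source.
sqlContext.sql(
  """CREATE TEMPORARY TABLE my_table
    |USING com.example.MySource
    |OPTIONS (path 'some/path')
  """.stripMargin)

sqlContext.sql("SELECT * FROM my_table").show()
```

The same DDL string should work unchanged from PySpark, e.g. `sqlContext.sql("CREATE TEMPORARY TABLE my_table USING com.example.MySource OPTIONS (path 'some/path')")` in an IPython session, provided the jar is on the JVM classpath that PySpark launches.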

External Data Source in SPARK

2015-02-09 Thread Addanki, Santosh Kumar
Hi,

We implemented an External Data Source by extending TableScan. We added the classes to the classpath. The data source works fine when run in the Spark Shell, but currently we are unable to use this same data source in a Python environment. So when we execute the following below in an IPython