Thanks Hao. I have already made it extend HadoopFsRelation and it works.
I will create a JIRA for that.

Besides that, I noticed that in DataSourceStrategy, Spark builds the
physical plan by pattern matching on the traits of the BaseRelation (e.g.
CatalystScan, TableScan, HadoopFsRelation). That means the order of the
cases matters. I think this is risky, because a BaseRelation cannot safely
extend more than one of these traits: only the first matching case is
used. And it seems nothing prevents a relation from extending several of
them. Maybe these traits need to be cleaned up and reorganized; otherwise
users may hit some weird issues when developing a new DataSource.
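
The ordering concern above can be sketched as follows. This is only an
illustration: the traits here are hypothetical stand-ins, not the real
Spark definitions, and the case order is an assumption rather than the
actual order used in DataSourceStrategy.

```scala
// Hypothetical stand-ins for the Spark traits (assumption: not the real API).
trait BaseRelation
trait CatalystScan extends BaseRelation
trait TableScan extends BaseRelation

object MatchOrderSketch {
  // Mimics a strategy that picks a plan from the first matching trait.
  def plan(relation: BaseRelation): String = relation match {
    case _: CatalystScan => "CatalystScan path"
    case _: TableScan    => "TableScan path"
    case _               => "no plan"
  }

  // A relation that mixes in both traits; the compiler accepts this.
  class BothRelation extends CatalystScan with TableScan

  def main(args: Array[String]): Unit = {
    // Only the first matching case runs, so the TableScan
    // implementation of BothRelation is silently ignored.
    println(plan(new BothRelation))
  }
}
```

Swapping the first two cases would route the same relation to the
TableScan path, which is why the behavior depends on the match order.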



On Thu, Nov 5, 2015 at 1:16 PM, Cheng, Hao <hao.ch...@intel.com> wrote:

> Probably 2 reasons:
>
> 1.      HadoopFsRelation was introduced in 1.4, but it seems CsvRelation
> was created based on 1.3
>
> 2.      HadoopFsRelation introduces the concept of Partition, which is
> probably not necessary for LibSVMRelation.
>
>
>
> But I think it will be easy to change them to extend HadoopFsRelation.
>
>
>
> Hao
>
>
>
> *From:* Jeff Zhang [mailto:zjf...@gmail.com]
> *Sent:* Thursday, November 5, 2015 10:31 AM
> *To:* dev@spark.apache.org
> *Subject:* Why LibSVMRelation and CsvRelation don't extends
> HadoopFsRelation ?
>
>
>
>
>
> I am not sure of the reason. It seems LibSVMRelation and CsvRelation
> could extend HadoopFsRelation and leverage its features. Is there any
> other consideration behind that?
>
>
>
>
>
> --
>
> Best Regards
>
> Jeff Zhang
>



-- 
Best Regards

Jeff Zhang
