Thanks Hao. I have already made it extend HadoopFsRelation and it works. I will create a JIRA for that.

Besides that, I noticed that in DataSourceStrategy, Spark builds the physical plan by pattern matching on the traits of the BaseRelation (e.g. CatalystScan, TableScan, HadoopFsRelation). That means the order of the cases matters. I think this is risky, because a BaseRelation that extends two or more of these traits is only ever planned through the first case that matches. And there seems to be nothing that prevents a relation from extending two or more of them. Maybe these traits need to be cleaned up and reorganized; otherwise users may run into weird issues when developing a new DataSource.

On Thu, Nov 5, 2015 at 1:16 PM, Cheng, Hao <hao.ch...@intel.com> wrote:

> Probably 2 reasons:
>
> 1. HadoopFsRelation was introduced in 1.4, but it seems CsvRelation
> was created based on 1.3
>
> 2. HadoopFsRelation introduces the concept of Partition, which is
> probably not necessary for LibSVMRelation.
>
> But I think it will be easy to change it to extend HadoopFsRelation.
>
> Hao
>
> *From:* Jeff Zhang [mailto:zjf...@gmail.com]
> *Sent:* Thursday, November 5, 2015 10:31 AM
> *To:* dev@spark.apache.org
> *Subject:* Why LibSVMRelation and CsvRelation don't extends
> HadoopFsRelation ?
>
> Not sure of the reason, but it seems LibSVMRelation and CsvRelation
> could extend HadoopFsRelation and leverage its features. Is there any
> other consideration behind that?
>
> --
> Best Regards
>
> Jeff Zhang

--
Best Regards

Jeff Zhang
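
To make the ordering concern above concrete, here is a minimal sketch. It does not use Spark's real classes: BaseRelation, CatalystScan and TableScan below are simplified stand-in traits, and OrderSketch, plan and MixedRelation are hypothetical names made up for illustration. It only shows the general Scala behavior that the first matching case wins, so a relation mixing in two scan traits is silently handled by whichever case comes first.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's real definitions.
trait BaseRelation
trait CatalystScan extends BaseRelation
trait TableScan extends BaseRelation

object OrderSketch {
  // Hypothetical planner: as in any Scala match, cases are tried top to bottom
  // and the first one that matches is used.
  def plan(relation: BaseRelation): String = relation match {
    case _: CatalystScan => "planned as CatalystScan"
    case _: TableScan    => "planned as TableScan"
    case _               => "no scan trait matched"
  }

  // Hypothetical relation that mixes in two scan traits at once.
  class MixedRelation extends CatalystScan with TableScan

  def main(args: Array[String]): Unit = {
    // MixedRelation is also a TableScan, but the CatalystScan case comes first.
    println(plan(new MixedRelation))
    // A relation with only one scan trait is unaffected by the case order.
    println(plan(new TableScan {}))
  }
}
```

In this sketch, swapping the first two cases would change how MixedRelation is planned without any compile-time warning, which is the kind of surprise I am worried about.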