Re: reading ORC format on Spark-SQL

2016-02-11 Thread Philip Lee

Re: reading ORC format on Spark-SQL

2016-02-10 Thread Philip Lee
> -----Original Message-----
> From: Gopal Vijayaraghavan [mailto:go...@hortonworks.com] On Behalf Of Gopal Vijayaraghavan
> Sent: 10 February 2016 21:43
> To: user@hive.apache.org
> Subject: Re: reading ORC format on Spark-

RE: reading ORC format on Spark-SQL

2016-02-10 Thread Mich Talebzadeh
aghavan
Sent: 10 February 2016 21:43
To: user@hive.apache.org
Subject: Re: reading ORC format on Spark-SQL

> The reason why I am asking this kind of question is that the time to read a csv file on
> Spark increases linearly as the data size increases a bit, but reading
> ORC format on Spark-SQL

Re: reading ORC format on Spark-SQL

2016-02-10 Thread Gopal Vijayaraghavan
> The reason why I am asking this kind of question is that the time to read a csv file on
> Spark increases linearly as the data size increases a bit, but reading
> ORC format on Spark-SQL is still the same as the data size increases in
>.
...
> This cause is from (just property of reading ORC forma

RE: reading ORC format on Spark-SQL

2016-02-10 Thread Mich Talebzadeh
From: Philip Lee [mailto:philjj...@gmail.com]
Sent: 10 February 2016 20:39
To: user@hive.apache.org
Subject: reading ORC format on Spark-SQL

What kinds of steps exist when reading ORC format on Spark-SQL? I mean, reading a csv file usually just reads the dataset directly into memory.

reading ORC format on Spark-SQL

2016-02-10 Thread Philip Lee
What kinds of steps exist when reading ORC format on Spark-SQL? I mean, reading a csv file usually just reads the dataset directly into memory. But I feel like Spark-SQL has some steps when reading ORC format. For example, do they have to create a table to insert the dataset? And then they insert the