Re: reading ORC format on Spark-SQL

2016-02-11 Thread Philip Lee

Re: reading ORC format on Spark-SQL

2016-02-10 Thread Philip Lee
> -----Original Message-----
> From: Gopal Vijayaraghavan [mailto:go...@hortonworks.com] On Behalf Of Gopal Vijayaraghavan
> Sent: 10 February 2016 21:43
> To: user@hive.apache.org
> Subject: Re: reading ORC format on Spark-

RE: reading ORC format on Spark-SQL

2016-02-10 Thread Mich Talebzadeh
aghavan
Sent: 10 February 2016 21:43
To: user@hive.apache.org
Subject: Re: reading ORC format on Spark-SQL

> The reason why I am asking this kind of question is reading csv file on
> Spark is linearly increasing as the data size increase a bit, but reading
> ORC format on Spark-SQL

Re: reading ORC format on Spark-SQL

2016-02-10 Thread Gopal Vijayaraghavan
> The reason why I am asking this kind of question is reading csv file on
> Spark is linearly increasing as the data size increase a bit, but reading
> ORC format on Spark-SQL is still same as the data size increases in
> .
...
> This cause is from (just property of reading ORC format) or (creating
> t

RE: reading ORC format on Spark-SQL

2016-02-10 Thread Mich Talebzadeh
Hi,

Are you encountering an issue with an ORC file in Spark-SQL, as opposed to reading the same ORC file with Hive on the Spark engine? The only difference would be the Spark optimizer (AKA Catalyst) handling the ORC file, compared to the Hive optimiser doing the same thing. Please clarify the underly