Re: Hive Cli ORC table read error with limit option

2016-03-06 Thread Biswajit Nayak
Hi Gopal, I had already pasted the table format in this thread. Will repeat it again. hive> desc formatted testdb.table_orc; OK # col_name data_type comment row_id bigint a int

Re: Updating column in table throws error

2016-03-06 Thread Mich Talebzadeh
Hi, This update will throw an error because any column used for bucketing (that is, hash partitioning) cannot be updated: it is used for the physical ordering of rows in the table. HTH

Re: Updating column in table throws error

2016-03-06 Thread Marcin Tustin
Don't bucket on columns you expect to update. Potentially you could delete the whole row and reinsert it. On Sunday, March 6, 2016, Ashok Kumar wrote: > Hi gurus, > > I have an ORC table bucketed on invoicenumber with "transactional"="true" > > I am trying to update invoicenumber column used fo
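
The two replies above can be illustrated with a small sketch. It is a simplified model, not Hive's actual implementation: the bucket count, the `bucket_for` and `delete_and_reinsert` helpers, and the sample invoice numbers are all hypothetical, and the hash is assumed to be the integer key masked to a non-negative value, modulo the number of buckets. It shows that changing the bucketing key can change the bucket a row belongs to, which is why an in-place update is refused and a delete followed by a reinsert is suggested instead.

```python
# Simplified model of hash bucketing (assumption: not Hive's real code).
NUM_BUCKETS = 4  # hypothetical bucket count


def bucket_for(key: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an integer key to a bucket: mask to non-negative, then take the modulo."""
    return (key & 0x7FFFFFFF) % num_buckets


def delete_and_reinsert(buckets: list, old_key: int, new_key: int) -> None:
    """Remove the row stored under old_key and store it again under new_key."""
    row = buckets[bucket_for(old_key, len(buckets))].pop(old_key)
    row["invoicenumber"] = new_key
    buckets[bucket_for(new_key, len(buckets))][new_key] = row


# One dict per bucket, keyed by the bucketing column.
buckets = [dict() for _ in range(NUM_BUCKETS)]
buckets[bucket_for(1001)][1001] = {"invoicenumber": 1001, "amount": 250}

# Changing the key from 1001 to 1002 moves the row to a different bucket.
delete_and_reinsert(buckets, 1001, 1002)
```

After the call, the row is stored in the bucket computed for the new key and no longer in the bucket computed for the old one.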

Updating column in table throws error

2016-03-06 Thread Ashok Kumar
Hi gurus, I have an ORC table bucketed on invoicenumber with "transactional"="true" I am trying to update invoicenumber column used for bucketing this table but it comes back with Error: Error while compiling statement: FAILED: SemanticException [Error 10302]: Updating values of bucketing column

Re: Parquet versus ORC

2016-03-06 Thread Marcin Tustin
If you google, you'll find benchmarks showing each to be faster than the other. In so far as there's any reality to which is faster in any given comparison, it seems to be a result of each incorporating ideas from the other, or at least going through development cycles to beat each other. ORC is v

Re: Parquet versus ORC

2016-03-06 Thread Mich Talebzadeh
Hi, Thanks for that link. It appears that the main advantage of Parquet is stated as follows, and I quote: "Parquet is built to be used by anyone. The Hadoop ecosystem is rich with data processing frameworks, and we are not interested in playing favorites. We believe that an efficient, well-implemented

Re: Parquet versus ORC

2016-03-06 Thread Uli Bethke
Curious why you think that Parquet does not have metadata at file, row group or column level. Please refer to the docs for the types of metadata that Parquet supports: http://parquet.apache.org/documentation/latest/ On 06/03/2016 15:26, Mich Talebzadeh wrote: Hi. I have been hearing a fair bi

Parquet versus ORC

2016-03-06 Thread Mich Talebzadeh
Hi. I have been hearing a fair bit about Parquet versus ORC tables. In a nutshell I can say that Parquet is a predecessor to ORC (both provide columnar storage), but I notice that it is still being used, especially by Spark users. In mitigation, it appears that Spark users are reluctant to u

Re: Which one should i use for benchmark tasks in hive & hadoop

2016-03-06 Thread dhruv kapatel
Thank you very much.

Re: Which one should i use for benchmark tasks in hive & hadoop

2016-03-06 Thread Jiacai Liu
I have answered this question on Stack Overflow. ☺ On Sun, Mar 6, 2016 at 1:47 PM, dhruv kapatel wrote: > > > Hi > > I am comparing performance of pig and hive for weblog data. > I was reading this pig and hive benchmarks. In which one statement written > on page 10 that "The CPU time > required b

Re: Problems with building hive from source code

2016-03-06 Thread Jiacai Liu
When I compile a project, errors happen now and then; most of the time I just recompile it and everything is OK. Also, JDK 1.8 may not work well with the Hadoop ecosystem, so I advise trying JDK 1.7 instead. On Sun, Mar 6, 2016 at 6:31 PM, Isuru Sankalpa wrote: > When i build hive according to instruct

Problems with building hive from source code

2016-03-06 Thread Isuru Sankalpa
When I build Hive according to the instructions from https://lens.apache.org/lenshome/install-and-run.html it gives errors when building the project: [INFO] Hive HCatalog Server Extensions SUCCESS [ 2.592 s] [INFO] Hive HCatalog Webhcat Java Client .. SUCCESS [ 1