Re: ORC with Map Column Type using Hive 0.11.0-RC1

2013-05-03 Thread Andrew Psaltis
Owen, Thanks for the quick response, filing the Jira, and explaining the way the reading of the columns works. RE: > Ironically, doing select * from the table works fine. I just retested this to make sure I was not mistaken and indeed it works. The only thing I can gather is there is a differen

Re: complex types and ORC

2013-05-03 Thread Owen O'Malley
On Mon, Apr 29, 2013 at 4:26 PM, Sean McNamara wrote: > If I create a table that has a map field, will ORC files > columnarize by the keys in the map? Or will all the pairs in the map be > grouped together? > It will break the map keys into one sub-column and the map values into a separate sub-
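
As a rough illustration of the split described above, the following Python sketch flattens a map column into one stream of keys and one stream of values. It is a conceptual model only, not ORC's actual on-disk encoding; the function name `split_map_column` and the per-row `lengths` list (used here to mark where each row's entries end) are assumptions made for the example.

```python
def split_map_column(rows):
    """Flatten a column of maps into per-row lengths, a key stream, and a value stream.

    Conceptual sketch only: the names and layout here are illustrative
    assumptions, not the ORC file format itself.
    """
    lengths, keys, values = [], [], []
    for row_map in rows:
        # Record how many entries this row contributes so rows can be rebuilt.
        lengths.append(len(row_map))
        for key, value in row_map.items():
            keys.append(key)      # all keys go into one sub-column
            values.append(value)  # all values go into a separate sub-column
    return lengths, keys, values


lengths, keys, values = split_map_column([{"a": 1, "b": 2}, {}, {"c": 3}])
```

Under this model, a reader interested only in the keys could scan the key stream without touching the values.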

Re: ORC with Map Column Type using Hive 0.11.0-RC1

2013-05-03 Thread Owen O'Malley
On Fri, May 3, 2013 at 10:20 AM, Andrew Psaltis < andrew.psal...@webtrends.com> wrote: > Hello, > I am trying to evaluate Hive 0.11.0-RC1, in particular I am very > interested in the ORC storage mechanism. We have a need to have one column > be a Map in a table and from what I have read this is >

Re: hive cli escaping TAB and NEW LINE Characters.

2013-05-03 Thread Sanjay Subramanian
+1 to Stephen's suggestion… From: Stephen Sprague <sprag...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Friday, May 3, 2013 11:29 AM To: "user@hive.apache.org" <user@hive.apache.org> Subjec

Re: hive cli escaping TAB and NEW LINE Characters.

2013-05-03 Thread Stephen Sprague
hate to sound like a broken record but when all else fails think about the transform() function. The notion here is of encoding your tabs and newlines to something like '\t' and '\n' (literally) for instance. If those aren't unique enough use '<>' and "<>' (you get the idea) then having your app d
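
A minimal Python sketch of the encode/decode idea in this message: tabs and newlines are turned into the literal two-character sequences `\t` and `\n` before output, and the consuming application reverses the mapping. The function names `encode_field` and `decode_field` are hypothetical, and escaping the backslash itself is an added assumption (it keeps the round trip unambiguous); how the encoder is wired into transform() is not shown here.

```python
def encode_field(text):
    """Replace backslash, TAB and NEWLINE with literal escape sequences."""
    # Escape the backslash first so existing backslashes stay distinguishable.
    return text.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


def decode_field(text):
    """Reverse encode_field by scanning for backslash escapes."""
    mapping = {"t": "\t", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            # Unknown escapes are passed through unchanged.
            out.append(mapping.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


raw = "line one\twith tab\nline two"
encoded = encode_field(raw)
```

After encoding, a field contains no raw TAB or NEWLINE characters, so a parser that splits rows on newlines and columns on tabs is no longer confused by the data.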

Re: Hive struct

2013-05-03 Thread Stephen Sprague
ok. i'll bite. when you say "struct" do you mean that literally? if so then you know the first and last elements _by definition_. so that's easy and i presume you can solve that. other than that if you mean "map" or "array" i think you're not going to do it in native hive. For this I'd person

ORC with Map Column Type using Hive 0.11.0-RC1

2013-05-03 Thread Andrew Psaltis
Hello, I am trying to evaluate Hive 0.11.0-RC1, in particular I am very interested in the ORC storage mechanism. We have a need to have one column be a Map in a table and from what I have read this is supported with the ORC format, however when trying to do a select on a table with a Map column

hive cli escaping TAB and NEW LINE Characters.

2013-05-03 Thread Valluri, Sathish
Hi All, We have an application which parses Hive CLI output and displays results. I have an external table with data in Avro format; the contents of this Avro file have TABs and NEW LINEs in the Avro data part. Since Hive CLI output rows are delimited by NEWLINES and columns are delimited by

Re: Parallely Load Data into Two partitions of a Hive Table

2013-05-03 Thread Sanjay Subramanian
Why are you using LOAD DATA syntax? Are these Hive-managed tables? LOAD DATA will actually copy files into HDFS. I would recommend using an EXTERNAL table and ALTER TABLE ADD PARTITION(logdate='2013-04-01') LOCATION '/logs/processed/2013-04-01'. This just makes entries in MySQL and is a lot faste
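
A minimal Python sketch of how the ADD PARTITION statements suggested here could be generated for a run of dates. The `logdate` partition column and the `/logs/processed/<date>` layout come from the message; the table name `logs`, the function name `add_partition_statements`, and the assumption that every day follows the same directory layout are hypothetical.

```python
from datetime import date, timedelta


def add_partition_statements(table, start, days, base="/logs/processed"):
    """Build one ALTER TABLE ... ADD PARTITION statement per day.

    Assumes (hypothetically) that each day's files live under base/<YYYY-MM-DD>.
    """
    statements = []
    for offset in range(days):
        day = (start + timedelta(days=offset)).isoformat()
        statements.append(
            f"ALTER TABLE {table} ADD PARTITION (logdate='{day}') "
            f"LOCATION '{base}/{day}';"
        )
    return statements


# 'logs' is a placeholder table name used only for illustration.
for statement in add_partition_statements("logs", date(2013, 4, 1), 3):
    print(statement)
```

The generated statements can then be saved to a script file and run through the Hive CLI in one pass.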

Re: Parallely Load Data into Two partitions of a Hive Table

2013-05-03 Thread selva
The only thing I was confused about was whether the MySQL metadata update might take a lock and cause data loss. Now I am clear on this. Thanks a lot, Nitin. On Fri, May 3, 2013 at 1:06 PM, Nitin Pawar wrote: > Why don't you load all of your data into a temporary table and then from > there to your current tabl

Re: Parallely Load Data into Two partitions of a Hive Table

2013-05-03 Thread Nitin Pawar
Why don't you load all of your data into a temporary table and then from there into your current tables? Hive will take care of adding dynamic partitions, and that will remove the overhead from you. To answer your question, you can always load data into different partitions in parallel as long as you have

Parallely Load Data into Two partitions of a Hive Table

2013-05-03 Thread selva
Hi All, I need to load a month's worth of processed data into a Hive table. The table has 10 partitions. Each day has many files to load, and each file consistently takes two seconds (I have ~3000 files). So it will take days to complete for 30 days' worth of data. I planned to load every day dat