RE: Table vs View

2013-05-06 Thread Connell, Chuck
I am not sure about speed, but my understanding is that tables are real things: they exist as files on disk and can be reused. Views are not materialized; no data is written to disk, and the view's defining query is re-evaluated each time the view is used. Chuck From: Peter Chu [mailto:pete@outlook.com] Sent: Monday, May 06, 2013 2:4

RE: Hive Problems Reading Avro+Snappy Data

2013-04-07 Thread Connell, Chuck
When you do SELECT *, Hive does not run a real MapReduce job, so it is not a good test. Something is wrong with SerDe or InputFormat. Chuck From: Thomas, Matthew [mailto:mtho...@verisign.com] Sent: Sunday, April 07, 2013 5:41 PM To: user@hive.apache.org Subject: Hive Problems Reading Avro+Snapp

RE: Hive sample test

2013-03-05 Thread Connell, Chuck
Using the Hive sampling feature would also help. This is exactly what that feature is designed for. Chuck From: Kyle B [mailto:kbi...@gmail.com] Sent: Tuesday, March 05, 2013 1:45 PM To: user@hive.apache.org Subject: Hive sample test Hello, I was wondering if there is a way to quick-verify a

RE: CHAN (Comprehensive Hive Archive Network) (Was Re: Using Reflect: A thread for ideas)

2013-02-14 Thread Connell, Chuck
+1, great idea! Chuck Connell, Nuance From: Robin Morris [mailto:r...@baynote.com] Sent: Thursday, February 14, 2013 1:59 AM To: user@hive.apache.org Subject: CHAN (Comprehensive Hive Archive Network) (Was Re: Using Reflect: A thread for ideas) I think we need to think a little bigger than this

RE: reg : getting table values in inputFormat in serde

2012-12-21 Thread Connell, Chuck
So you have an XML serde? Did you write it? How can others download it? Chuck From: Mohit Chaudhary01 [mohit_chaudhar...@infosys.com] Sent: Friday, December 21, 2012 6:39 AM To: user@hive.apache.org Subject: RE: reg : getting table values in inputFormat in serde

RE: FROM INSERT after ADD COLUMN

2012-12-09 Thread Connell, Chuck
I don't think you can do this. Populating new columns is the same as "row level updates" which Hive does not do. AFAIK, your only option is to write a new table, by reading the old table, selecting all of it, appending new values to each row, then writing the longer rows to a new table. Chuck
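The rewrite-the-table workaround described above can be sketched as a plain file transform. This is a minimal sketch in Python, not Chuck's actual code; the tab-separated layout, paths, and function name are illustrative assumptions, and in practice the files would be pulled from the old table's HDFS directory and loaded into the new, wider table.

```python
def append_column(src_path, dst_path, new_value):
    """Rewrite a tab-separated table file, appending one new field to
    every row -- the 'write a new, wider table' workaround, since Hive
    has no row-level updates. Paths and layout are illustrative."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(line.rstrip("\n") + "\t" + new_value + "\n")
```

The same pattern scales to a MapReduce or streaming job when the table is too large for a single script.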

RE: BINARY column type

2012-12-02 Thread Connell, Chuck
it to STDOUT. My Insert statement took the results of the pcap parsing script (including the hexed data) and then unhexed it at insert. There may be a better way to do this, but for me it works well. *shrug* On Sun, Dec 2, 2012 at 9:00 AM, Connell, Chuck mailto:chuck.conn...@nuance.com>
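The hex/unhex trick described in this thread can be sketched in Python. This is a hedged illustration, not the poster's actual script: hex-encoding guarantees the text contains only `0-9a-f` characters, so a binary payload with embedded newlines survives Hive's newline-delimited text format, and is unhexed back to real bytes at insert time.

```python
import binascii

def hex_for_hive(raw: bytes) -> str:
    """Hex-encode arbitrary binary data so it can travel through
    newline-delimited Hive text files without being split mid-record."""
    return binascii.hexlify(raw).decode("ascii")

def unhex_from_hive(text: str) -> bytes:
    """Reverse step, analogous to unhexing at INSERT time."""
    return binascii.unhexlify(text)

blob = b"line1\nline2\x00\xff"      # binary payload containing a newline
encoded = hex_for_hive(blob)
assert "\n" not in encoded          # hex chars only, safe as one text record
assert unhex_from_hive(encoded) == blob
```

The cost is doubling the stored size; a base64 variant would be more compact under the same idea.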

RE: BINARY column type

2012-12-02 Thread Connell, Chuck
insert >statement. That allowed me to move the data with newline around easily, but on >the final step (on insert) it would unhex it and put it in as actual binary, >no bytes were harmed in the hexing (or unhexing) of my data. On Sat, Dec 1, 2012 at 4:11 PM, Connell, Chuck mail

RE: BINARY column type

2012-12-01 Thread Connell, Chuck
ll. On Sat, Dec 1, 2012 at 10:50 AM, Connell, Chuck mailto:chuck.conn...@nuance.com>> wrote: I am trying to use BINARY columns and believe I have the perfect use-case for it, but I am missing something. Has anyone used this for true binary data (which may contain newlines)? Here is the

BINARY column type

2012-12-01 Thread Connell, Chuck
I am trying to use BINARY columns and believe I have the perfect use-case for it, but I am missing something. Has anyone used this for true binary data (which may contain newlines)? Here is the background... I have some files that each contain just one logical field, which is a binary object.

RE: Starting with Hive - writing custom SerDe

2012-11-29 Thread Connell, Chuck
I meant PLAIN tab-separated text. From: Connell, Chuck [mailto:chuck.conn...@nuance.com] Sent: Thursday, November 29, 2012 9:51 AM To: user@hive.apache.org Subject: RE: Starting with Hive - writing custom SerDe You might save yourself a lot of work by pre-processing the data, before putting it

RE: Starting with Hive - writing custom SerDe

2012-11-29 Thread Connell, Chuck
You might save yourself a lot of work by pre-processing the data, before putting it into Hive. A Python script should be able to find all the fields, and change the data to plan tab-separated text. This will load directly into Hive, and removes the need to write a custom SerDe. Chuck Connell Nu

RE: Hive - Load file with delimiters

2012-11-28 Thread Connell, Chuck
Plain old tab-separated text is very easy to import to Hive. So I would pre-process the CSV you have now to change it to tab-separated (with no quotes). Chuck Connell Nuance R&D Data Team Burlington, MA From: Mark Grover [mailto:grover.markgro...@gmail.com] Sent: Wednesday, November 28, 2012 11:

RE: Setting auxpath in Hue/Beeswax

2012-11-26 Thread Connell, Chuck
I believe there are some free Cloudera discussion boards/lists. Standard Cloudera support is a paid service. Chuck Connell Nuance R&D Data Team Burlington, MA From: Sadananda Hegde [mailto:saduhe...@gmail.com] Sent: Monday, November 26, 2012 2:42 PM To: user@hive.apache.org Subject: Re: Setting

RE: Do I need any Pig knowledge to learn Hive?

2012-11-16 Thread Connell, Chuck
The Hive query language (HiveQL) is completely different from Pig. Chuck Connell Nuance R&D Data Team From: Majid Azimi [mailto:majid.merk...@gmail.com] Sent: Friday, November 16, 2012 4:27 AM To: user@hive.apache.org Subject: Do I need any Pig knowledge to learn Hive? hi guys, Do I need any P

RE: hive under cygwin

2012-11-13 Thread Connell, Chuck
easily 10 patches post hive 0.9.0X that are for making things that only worked on linux work natively on windows. Baby steps. Edward On Tue, Nov 13, 2012 at 4:44 PM, Connell, Chuck wrote: > This same question comes up about once per week on this list. Anyone who > knows how to type "

RE: hive under cygwin

2012-11-13 Thread Connell, Chuck
This same question comes up about once per week on this list. Anyone who knows how to type "hive cygwin" into a search bar will pretty quickly find that Hadoop is made for native Linux. Is it possible to make Hadoop (and Hive) run under Cygwin? Apparently so, with lots of kludges and jumping t

RE: Not able to run queries in Hive

2012-10-29 Thread Connell, Chuck
ndows From: "Connell, Chuck" To: "user@hive.apache.org" ; "v.balakrish...@tcs.com" Sent: Monday, 29 October 2012 6:59 PM Subject: RE: Not able to run queries in Hive Cygwin is not Linux. It does not run native Linux code. __

RE: Not able to run queries in Hive

2012-10-29 Thread Connell, Chuck
Cygwin is not Linux. It does not run native Linux code. From: yogesh.kuma...@wipro.com [yogesh.kuma...@wipro.com] Sent: Monday, October 29, 2012 9:20 AM To: user@hive.apache.org; v.balakrish...@tcs.com Subject: RE: Not able to run queries in Hive Bala, Then I t

RE: Hive installation

2012-10-24 Thread Connell, Chuck
I would just install Cloudera CDH4 and you don't have to do any config at all, super easy. Chuck From: Artem Ervits [are9...@nyp.org] Sent: Wednesday, October 24, 2012 4:41 PM To: user@hive.apache.org Subject: RE: Hive installation Whew, sorry for all the spam,

RE: Writing Custom Serdes for Hive

2012-10-16 Thread Connell, Chuck
A serde is actually used the other way around... Hive parses the query, writes MapReduce code to solve the query, and the generated code uses the serde for field access. Standard way to write a serde is to start from the trunk regex serde, then modify as needed... http://svn.apache.org/viewvc/

RE: Any advice about complex Hive tables?

2012-10-13 Thread Connell, Chuck
your help. Sadu On Fri, Oct 12, 2012 at 9:17 AM, Connell, Chuck mailto:chuck.conn...@nuance.com>> wrote: Sadu, I am using JSON as the input format, with the JSON SerDe from https://github.com/rcongiu/Hive-JSON-Serde. A sample JSON record is: (in actual use each JSON record must be on one

RE: Any advice about complex Hive tables?

2012-10-13 Thread Connell, Chuck
er? Thanks for your help. Sadu On Fri, Oct 12, 2012 at 9:17 AM, Connell, Chuck mailto:chuck.conn...@nuance.com>> wrote: Sadu, I am using JSON as the input format, with the JSON SerDe from https://github.com/rcongiu/Hive-JSON-Serde. A sample JSON record is: (in actual use each JSON record

RE: Any advice about complex Hive tables?

2012-10-12 Thread Connell, Chuck
ail.com] Sent: Thursday, October 11, 2012 11:47 PM To: user@hive.apache.org Subject: Re: Any advice about complex Hive tables? Hi Chuck, I have a similar complex hive tables with many fields and some are nested like array of structs (but only upto 3 levels). How did you define you ROW FORMAT

Any advice about complex Hive tables?

2012-10-08 Thread Connell, Chuck
since everything else about Hive is just what we want. Any suggestions? Workarounds? Thanks, Chuck From: Connell, Chuck Sent: Thursday, October 04, 2012 4:31 PM To: user@hive.apache.org Subject: RE: Limit to columns or nesting of Hive table? The issue appar

RE: Limit to columns or nesting of Hive table?

2012-10-04 Thread Connell, Chuck
-Original Message- From: Connell, Chuck [mailto:chuck.conn...@nuance.com] Sent: Thursday, October 04, 2012 12:09 PM To: user@hive.apache.org Subject: RE: Limit to columns or nesting of Hive table? Thanks. So is the nesting limit 10 now? Does your 2nd paragraph mean that this limit cannot easily

RE: Limit to columns or nesting of Hive table?

2012-10-04 Thread Connell, Chuck
s can cause issues. Edward On Thu, Oct 4, 2012 at 11:48 AM, Connell, Chuck wrote: > I am trying to create a large Hive table, with many columns and deeply > nested structs. It is failing with java.lang.ArrayIndexOutOfBoundsException: > 10. > > > > Before I spend a lot of time

Limit to columns or nesting of Hive table?

2012-10-04 Thread Connell, Chuck
I am trying to create a large Hive table, with many columns and deeply nested structs. It is failing with java.lang.ArrayIndexOutOfBoundsException: 10. Before I spend a lot of time debugging my table declaration, is there some limit here I should know about? Max number of columns? Max depth of s

RE: Hive does not run - Typical NoSuchFieldError

2012-10-02 Thread Connell, Chuck
o be a > more explicit declaration to be made but I'm giving up if I can't resolve > this today. > > Anthony > > > On Tue, Oct 2, 2012 at 10:26 AM, Connell, Chuck > mailto:chuck.conn...@nuance.com>> > wrote: >> >> Try the easy way... Cloude

RE: Hive does not run - Typical NoSuchFieldError

2012-10-02 Thread Connell, Chuck
Try the easy way... Cloudera CDH4 running on Centos 5.8. Can install everything on one machine. Chuck From: Anthony Ikeda [mailto:anthony.ikeda@gmail.com] Sent: Tuesday, October 02, 2012 1:23 PM To: user@hive.apache.org Subject: Hive does not run - Typical NoSuchFieldError I've tried diffe

RE: zip file or tar file cosumption

2012-09-30 Thread Connell, Chuck
Yeah, it seems like you can’t do that; see: http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%3CCAENxBwxkF--3PzCkpz1HX21=gb9yvasr2jl0u3yul2tfgu0...@mail.gmail.com%3E But you can always compress your files in gzip format and they should be good to go. Richin From: ext Connell, Chuck [mailto:c

RE: zip file or tar file cosumption

2012-09-26 Thread Connell, Chuck
Another solution would be a shell script that does the following: 1. unzip the txt files, 2. merge those 50 (or N) text files one by one into a single text file, 3. then zip/tar that bigger text file, 4. then that big zip/tar file can be uploaded into Hive. Keshav C S

RE: zip file or tar file cosumption

2012-09-26 Thread Connell, Chuck
This could be a problem. Hive uses newline as the record separator, and a ZIP file will certainly contain newline characters, so I doubt this is possible. BUT, I would like to hear from anyone who has solved the "newline is always a record separator" problem, because we ran into it for another type of comp
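The gzip suggestion later in this thread works because Hadoop's compression codec decompresses a `.gz` text file *before* the record reader splits on newlines, so newline bytes inside the compressed stream never reach the line parser; a raw ZIP/TAR fed to a text table has no such codec step. A small sketch of the round trip (an illustration of the principle, not Hive's internals):

```python
import gzip

# Multi-line tab-separated records, as a Hive text table would expect
# after decompression. The compressed bytes are arbitrary binary and
# may well contain 0x0a -- which is exactly why treating an archive as
# newline-delimited text breaks.
rows = "r1c1\tr1c2\nr2c1\tr2c2\n"
compressed = gzip.compress(rows.encode("utf-8"))
assert gzip.decompress(compressed).decode("utf-8") == rows
```

So gzipping each text file (rather than zipping many files into one archive) keeps the data loadable as an ordinary text table.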

RE: Hive File Sizes, Merging, and Splits

2012-09-25 Thread Connell, Chuck
ep 25, 2012 at 2:35 PM, Connell, Chuck mailto:chuck.conn...@nuance.com>> wrote: Why do you think the current generated code is inefficient? From: John Omernik [mailto:j...@omernik.com<mailto:j...@omernik.com>] Sent: Tuesday, September 25, 2012 2:57 PM To: user@hive.apache.org<mailto:

RE: Hive File Sizes, Merging, and Splits

2012-09-25 Thread Connell, Chuck
Why do you think the current generated code is inefficient? From: John Omernik [mailto:j...@omernik.com] Sent: Tuesday, September 25, 2012 2:57 PM To: user@hive.apache.org Subject: Hive File Sizes, Merging, and Splits I am really struggling trying to make heads or tails out of how to optimize t

RE: Issue when trying to debug hive

2012-09-22 Thread Connell, Chuck
This question might be better on dev@hive mailing list... From: Kasun Weranga [kas...@wso2.com] Sent: Saturday, September 22, 2012 11:28 AM To: user@hive.apache.org Subject: Issue when trying to debug hive Hi all, I built hive trunk and tried to debug the hive c

RE: Hive custom inputfomat error.

2012-09-20 Thread Connell, Chuck
file or run this command on Hive shell. OR I found this in a previous thread Add following property to your hive-site.xml hive.aux.jars.path file:///home/me/my.jar,file:///home/you/your.jar,file:///home/us/our.jar> Hope this helps. Richin From: ext Connell, Chuck [mailto:chuck.conn...@nua

RE: Hive custom inputfomat error.

2012-09-20 Thread Connell, Chuck
You might try adding “ --auxpath /path/to/jar/dir “ to the Hive command line. Chuck Connell Nuance R&D Data Team Burlington, MA From: Manish [mailto:manishbh...@rocketmail.com] Sent: Thursday, September 20, 2012 11:10 AM To: user Cc: u...@hadoop.apache.org Subject: Hive custom inputfomat error.

RE: warning message while connecting Hive shell

2012-09-20 Thread Connell, Chuck
2012/9/17 Connell, Chuck > >> I get the same warning. Could you please be more specific about "set

RE: hive json serde

2012-09-18 Thread Connell, Chuck
something like select * from my_serde_table where get_json_object(C, "$.D") ilike "chuck" I like serde table because it is much cleaner than create a table with (value string) and then doing get_json_object or json_tuple and extract all the columns out. I'm e

RE: hive json serde

2012-09-17 Thread Connell, Chuck
5:58 PM To: Connell, Chuck Cc: user@hive.apache.org Subject: Re: hive json serde It works now. Looks like there is a bug in the code. if you do hive --auxpath ./serde then I get an error but if I get the full path as hive --auxpath /var/lib/hdfs/serde/ then get_json_object() works. Thanks for

RE: hive json serde

2012-09-17 Thread Connell, Chuck
I used his pre-built jar. No need to compile anything. Be sure to add " --auxpath /path/to/jar/dir " to the Hive command line. Chuck From: Connell, Chuck [mailto:chuck.conn...@nuance.com] Sent: Monday, September 17, 2012 3:06 PM To: user@hive.apache.org Subject: RE: hive json ser

RE: hive json serde

2012-09-17 Thread Connell, Chuck
I just finished testing this one. No problems found. The developer is also quite responsive to issues raised. I encouraged him to submit it to the Hive dev team as core code. https://github.com/rcongiu/Hive-JSON-Serde/ Chuck Connell Nuance R&D Data Team Burlington, MA From: Mark Golden [mail

RE: warning message while connecting Hive shell

2012-09-17 Thread Connell, Chuck
I get the same warning. Could you please be more specific about "set the class path wrt log4J property file"? Exactly what should we do? Chuck Connell Nuance R&D Data Team Burlington, MA -Original Message- From: ashok.sa...@wipro.com [mailto:ashok.sa...@wipro.com] Sent: Monday, Septemb

RE: How to update and delete a row in hive

2012-09-11 Thread Connell, Chuck
Hive does not support row-level (or field-level) updates. It is designed as a WORM (write once read many) data warehouse. You can of course code your own row updates by reading an entire Hive file, modifying a row, then writing the file back to Hive. Chuck Connell Nuance R&D Data Team Burling
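The "read an entire Hive file, modify a row, write it back" approach can be sketched as a plain read-modify-rewrite over a tab-separated file. A minimal sketch under stated assumptions: the function name, key-in-first-field convention, and layout are all illustrative, not anything Hive provides.

```python
def update_row(src_path, dst_path, key, new_fields):
    """Simulate a row-level update on a WORM store: stream every row,
    swap out the one whose first field matches `key`, and write the
    result to a new file that replaces the old one in the table."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if fields[0] == key:
                fields = new_fields
            dst.write("\t".join(fields) + "\n")
```

The whole file is rewritten even to change one row, which is why this is a workaround rather than a substitute for a transactional store.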

RE: Handling arrays returned by json_tuple ??

2012-09-08 Thread Connell, Chuck
: "jones", "array1" : [{json-object},{json-object}]} I could extract only the top level value array1, but could not "open up" that array to do anything with its embedded elements which are valid json objects! Is this true? Chuck ___

RE: How to load csv data into HIVE

2012-09-08 Thread Connell, Chuck
be faster for such large files. Regards, Mohammad Tariq On Fri, Sep 7, 2012 at 8:27 PM, Connell, Chuck mailto:chuck.conn...@nuance.com>> wrote: I cannot promise which is faster. A lot depends on how clever your scripts are. From: Sandeep Reddy P [mailto:sandeepreddy.3...@gma

Handling arrays returned by json_tuple ??

2012-09-07 Thread Connell, Chuck
I am using the json_tuple lateral view function. It works fine. But I am wondering how to select individual elements from a returned array. Here is an example... $ cat array1.json {"text1" : "smith", "array1" : [6,5,4]} {"text1" : "jones", "array1" : [1,2,3]} {"text1" : "white", "array1" : [9,8
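What the question is after — one element out of the returned array — can be shown on the sample records above with ordinary JSON parsing. A hedged Python sketch of the desired result (roughly what a per-record path like `$.array1[0]` would select; this is an illustration of the goal, not a Hive query):

```python
import json

# Sample records from the message above, one JSON object per line.
lines = [
    '{"text1" : "smith", "array1" : [6,5,4]}',
    '{"text1" : "jones", "array1" : [1,2,3]}',
]
# Desired projection: text1 plus the first element of array1.
rows = [(r["text1"], r["array1"][0]) for r in map(json.loads, lines)]
assert rows == [("smith", 6), ("jones", 1)]
```

If Hive-side extraction proves awkward, the same logic works as a preprocessing step before loading.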

RE: How to load csv data into HIVE

2012-09-07 Thread Connell, Chuck
but when i run that script on a 12GB csv its taking more time. If i run a python script will that be faster? On Fri, Sep 7, 2012 at 10:39 AM, Connell, Chuck mailto:chuck.conn...@nuance.com>> wrote: How about a Python script that changes it into plain tab-separated text? So it would look lik

RE: How to load csv data into HIVE

2012-09-07 Thread Connell, Chuck
How about a Python script that changes it into plain tab-separated text? So it would look like this... 174969274	14-mar-2006	3522876 ... 14-mar-2006	50308651 etc... Tab-separated with newlines is easy to read and works perfectly on import. Chuck Connell Nuance R&D Data Team Burlington, MA 781-565
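The suggested CSV-to-TSV conversion can be sketched in a few lines with the standard `csv` module. This is a minimal sketch, not the script the thread's poster wrote; it assumes fields contain no tabs or embedded newlines (those would need extra handling before a Hive text import).

```python
import csv

def csv_to_tsv(csv_lines, out):
    """Convert quoted CSV records to plain tab-separated text, which
    loads directly into a Hive text table with no custom SerDe.
    The csv module strips the quotes and handles embedded commas."""
    for row in csv.reader(csv_lines):
        out.write("\t".join(row) + "\n")
```

Typical use (hypothetical filenames): `csv_to_tsv(open("data.csv"), open("data.tsv", "w"))`. Because it streams line by line, memory stays flat even on the 12 GB file mentioned in the thread.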

RE: Hive directory permissions

2012-08-16 Thread Connell, Chuck
directory permissions We usually start the shell thru sudo,otherwise we get a "Permission denied" while creating Hive tables. But this is a good point, any suggestions/best practices from the user community ? Thanks On Thu, Aug 16, 2012 at 9:37 AM, Connell, Chuck mailto:chuck.conn...@

RE: Hive directory permissions

2012-08-16 Thread Connell, Chuck
I have run into similar problems. Thanks for the suggestions. One concern... Isn't hdfs a highly privileged user within the Hadoop cluster? So do we really want it to be standard practice for all Hive users to su to hdfs? Chuck Connell Nuance R&D Data Team Burlington, MA From: Himanish Kushary

RE: mapper is slower than hive' mapper

2012-08-01 Thread Connell, Chuck
This is actually not surprising. Hive is essentially a MapReduce compiler. It is common for regular compilers (C, C#, Fortran) to emit faster assembly code than you would write yourself. Compilers know the tricks of their target language. Chuck Connell Nuance R&D Data Team Burlington, MA -Origi

RE: Data Loaded but Select returns nothing!

2012-07-30 Thread Connell, Chuck
An idea, in addition to what others have said... You are starting with a complex table schema. Start simpler, test it as you go, and then work up to partitions and buckets. That way you make sure you have the basic features working correctly. Chuck Connell Nuance R&D Data Team Burlington, MA

RE: HIVE AND HBASE

2012-07-27 Thread Connell, Chuck
, "Connell, Chuck" wrote: > I recommend the Cloudera CDH release of Hadoop and their auto-install tool. > It saves a lot of config headaches. > > Chuck Connell > Nuance R&D Data Team > Burlington, MA > > > > -Original Message- > From: abhiTowso

RE: Performance Issues in Hive with S3 and Partitions

2012-07-27 Thread Connell, Chuck
What about making your small files bigger, by ZIPping them together? Of course, you have to think about this carefully, so MapReduce can efficiently retrieve the files it needs without unzipping everything every time. Chuck From: richin.j...@nokia.com [mailto:richin.j...@nokia.com] Sent: Frida

RE: HIVE AND HBASE

2012-07-27 Thread Connell, Chuck
I recommend the Cloudera CDH release of Hadoop and their auto-install tool. It saves a lot of config headaches. Chuck Connell Nuance R&D Data Team Burlington, MA -Original Message- From: abhiTowson cal [mailto:abhishek.dod...@gmail.com] Sent: Friday, July 27, 2012 12:31 PM To: user@hi

RE: Creating Hive table by pulling data from mainFrames

2012-07-26 Thread Connell, Chuck
Can you export from DB2 to a plain text tab-separated file? You can certainly import that to Hive. Chuck Connell Nuance R&D Data Team Burlington, MA From: Siddharth Tiwari [mailto:siddharth.tiw...@live.com] Sent: Thursday, July 26, 2012 2:33 PM To: hive user list Subject: Creating Hive table by

RE: Hive 0.9 and Indexing

2012-07-26 Thread Connell, Chuck
I do not have answers to any of your questions, but I appreciate you raising them. My team is very interested in Hive indexing as well, so I look forward to this discussion. Chuck Connell Nuance R&D Data Team Burlington, MA From: John Omernik [mailto:j...@omernik.com] Sent: Thursday, July 26,

RE: Problem replacing existing Hive file with modified copy

2012-07-25 Thread Connell, Chuck
x27;ll file a jira for this issue and update the same here. Regards Bejoy KS From: "Connell, Chuck" mailto:chuck.conn...@nuance.com>> To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Sent: We

Problem replacing existing Hive file with modified copy

2012-07-25 Thread Connell, Chuck
I created a Hive table that consists of two files, names1.txt and names2.txt. The table works correctly and answers all queries etc. I want to REPLACE names2.txt with a modified version. I copied the new version of names2.txt to the /tmp/input folder within HDFS. Then I tried the command: hive