I am not sure about speed, but my understanding is that tables are real things,
they exist as files on disk and can be reused. Views are temporary entities
that are created on-the-fly and cannot be reused later.
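For illustration, a minimal HiveQL sketch (table and view names are hypothetical):

CREATE TABLE raw_logs (ts STRING, msg STRING);   -- materialized as files on disk

CREATE VIEW error_logs AS                        -- stores only the query definition
SELECT ts, msg FROM raw_logs WHERE msg LIKE '%ERROR%';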
Chuck
From: Peter Chu [mailto:pete@outlook.com]
Sent: Monday, May 06, 2013 2:4
When you do SELECT *, Hive does not run a real MapReduce job, so it is not a
good test. Something is wrong with SerDe or InputFormat.
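For a quick check, an aggregate forces a real MapReduce job (hypothetical table name):

SELECT COUNT(*) FROM my_avro_table;   -- exercises the SerDe/InputFormat via MapReduce,
                                      -- unlike SELECT *, which just streams the files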
Chuck
From: Thomas, Matthew [mailto:mtho...@verisign.com]
Sent: Sunday, April 07, 2013 5:41 PM
To: user@hive.apache.org
Subject: Hive Problems Reading Avro+Snappy
Using the Hive sampling feature would also help. This is exactly what that
feature is designed for.
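A minimal sketch of that feature (hypothetical table; ON rand() samples without requiring bucketed data, at the cost of a full scan):

SELECT * FROM my_table TABLESAMPLE(BUCKET 1 OUT OF 100 ON rand()) s;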
Chuck
From: Kyle B [mailto:kbi...@gmail.com]
Sent: Tuesday, March 05, 2013 1:45 PM
To: user@hive.apache.org
Subject: Hive sample test
Hello,
I was wondering if there is a way to quick-verify a
+1, great idea!
Chuck Connell, Nuance
From: Robin Morris [mailto:r...@baynote.com]
Sent: Thursday, February 14, 2013 1:59 AM
To: user@hive.apache.org
Subject: CHAN (Comprehensive Hive Archive Network) (Was Re: Using Reflect: A
thread for ideas)
I think we need to think a little bigger than this.
So you have an XML serde? Did you write it? How can others download it?
Chuck
From: Mohit Chaudhary01 [mohit_chaudhar...@infosys.com]
Sent: Friday, December 21, 2012 6:39 AM
To: user@hive.apache.org
Subject: RE: reg : getting table values in inputFormat in serde
I don't think you can do this. Populating new columns is the same as "row level
updates", which Hive does not do. AFAIK, your only option is to write a new
table: read the old table, select all of it, append the new values to each row,
and write the longer rows to a new table.
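A sketch of that rewrite, with hypothetical names:

CREATE TABLE new_table AS
SELECT t.*, 'default_value' AS extra_col   -- the appended column
FROM old_table t;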
Chuck
it to STDOUT. My Insert statement took the results of the pcap
parsing script (including the hexed data) and then unhexed it at insert. There
may be a better way to do this, but for me it works well. *shrug*
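A sketch of that insert-time unhex pattern, with hypothetical names (hex/unhex are built-in Hive functions):

INSERT OVERWRITE TABLE binary_table
SELECT unhex(hex_payload)   -- decode back to raw bytes at insert time
FROM staging_table;         -- staging data is hex-encoded, so no embedded newlines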
On Sun, Dec 2, 2012 at 9:00 AM, Connell, Chuck
<chuck.conn...@nuance.com> wrote:
> insert statement. That allowed me to move the data with newline around
> easily, but on the final step (on insert) it would unhex it and put it in as
> actual binary; no bytes were harmed in the hexing (or unhexing) of my data.
On Sat, Dec 1, 2012 at 10:50 AM, Connell, Chuck
<chuck.conn...@nuance.com> wrote:
I am trying to use BINARY columns and believe I have the perfect use-case for
it, but I am missing something. Has anyone used this for true binary data
(which may contain newlines)?
Here is the background... I have some files that each contain just one logical
field, which is a binary object.
I meant PLAIN tab-separated text.
From: Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Thursday, November 29, 2012 9:51 AM
To: user@hive.apache.org
Subject: RE: Starting with Hive - writing custom SerDe
You might save yourself a lot of work by pre-processing the data before
putting it into Hive. A Python script should be able to find all the fields
and change the data to plain tab-separated text. This will load directly into
Hive, and removes the need to write a custom SerDe.
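For example, plain tab-separated text maps directly onto a table declaration like this (hypothetical names):

CREATE TABLE events (id STRING, name STRING, score INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/events.tsv' INTO TABLE events;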
Chuck Connell
Nuance R&D Data Team
Plain old tab-separated text is very easy to import to Hive. So I would
pre-process the CSV you have now to change it to tab-separated (with no quotes).
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Mark Grover [mailto:grover.markgro...@gmail.com]
Sent: Wednesday, November 28, 2012 11:
I believe there are some free Cloudera discussion boards/lists. Standard
Cloudera support is a paid service.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Sadananda Hegde [mailto:saduhe...@gmail.com]
Sent: Monday, November 26, 2012 2:42 PM
To: user@hive.apache.org
Subject: Re: Setting
The Hive query language (HiveQL) is completely different from Pig.
Chuck Connell
Nuance R&D Data Team
From: Majid Azimi [mailto:majid.merk...@gmail.com]
Sent: Friday, November 16, 2012 4:27 AM
To: user@hive.apache.org
Subject: Do I need any Pig knowledge to learn Hive?
hi guys,
Do I need any P
There are easily 10 patches post Hive 0.9.0 that are for making things that
only worked on Linux work natively on Windows.
Baby steps.
Edward
On Tue, Nov 13, 2012 at 4:44 PM, Connell, Chuck
wrote:
This same question comes up about once per week on this list. Anyone who knows
how to type "hive cygwin" into a search bar will pretty quickly find that
Hadoop is made for native Linux.
Is it possible to make Hadoop (and Hive) run under Cygwin? Apparently so, with
lots of kludges and jumping through hoops.
From: "Connell, Chuck"
To: "user@hive.apache.org" ; "v.balakrish...@tcs.com"
Sent: Monday, 29 October 2012 6:59 PM
Subject: RE: Not able to run queries in Hive
Cygwin is not Linux. It does not run native Linux code.
From: yogesh.kuma...@wipro.com [yogesh.kuma...@wipro.com]
Sent: Monday, October 29, 2012 9:20 AM
To: user@hive.apache.org; v.balakrish...@tcs.com
Subject: RE: Not able to run queries in Hive
Bala,
Then I t
I would just install Cloudera CDH4 and you don't have to do any config at all,
super easy.
Chuck
From: Artem Ervits [are9...@nyp.org]
Sent: Wednesday, October 24, 2012 4:41 PM
To: user@hive.apache.org
Subject: RE: Hive installation
Whew, sorry for all the spam,
A serde is actually used the other way around... Hive parses the query, writes
MapReduce code to solve the query, and the generated code uses the serde for
field access.
Standard way to write a serde is to start from the trunk regex serde, then
modify as needed...
http://svn.apache.org/viewvc/
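For reference, a minimal sketch of that approach (using the contrib RegexSerDe class; table and regex are hypothetical):

CREATE TABLE apache_log (host STRING, request STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "([^ ]*) (.*)")   -- one capture group per column
STORED AS TEXTFILE;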
Thanks for your help.
Sadu
On Fri, Oct 12, 2012 at 9:17 AM, Connell, Chuck
<chuck.conn...@nuance.com> wrote:
Sadu,
I am using JSON as the input format, with the JSON SerDe from
https://github.com/rcongiu/Hive-JSON-Serde.
A sample JSON record is: (in actual use each JSON record must be on one
Sent: Thursday, October 11, 2012 11:47 PM
To: user@hive.apache.org
Subject: Re: Any advice about complex Hive tables?
Hi Chuck,
I have a similar complex Hive table with many fields, and some are nested, like
arrays of structs (but only up to 3 levels). How did you define your ROW FORMAT
since everything else about Hive is just what we want.
Any suggestions? Workarounds?
Thanks,
Chuck
From: Connell, Chuck
Sent: Thursday, October 04, 2012 4:31 PM
To: user@hive.apache.org
Subject: RE: Limit to columns or nesting of Hive table?
The issue appar
-Original Message-
From: Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Thursday, October 04, 2012 12:09 PM
To: user@hive.apache.org
Subject: RE: Limit to columns or nesting of Hive table?
Thanks. So is the nesting limit 10 now? Does your 2nd paragraph mean that this
limit cannot easily
s can cause issues.
Edward
On Thu, Oct 4, 2012 at 11:48 AM, Connell, Chuck
wrote:
I am trying to create a large Hive table, with many columns and deeply nested
structs. It is failing with java.lang.ArrayIndexOutOfBoundsException: 10.
Before I spend a lot of time debugging my table declaration, is there some
limit here I should know about? Max number of columns? Max depth of s
> o be a more explicit declaration to be made, but I'm giving up if I can't
> resolve this today.
>
> Anthony
>
>
> On Tue, Oct 2, 2012 at 10:26 AM, Connell, Chuck
> <chuck.conn...@nuance.com> wrote:
Try the easy way... Cloudera CDH4 running on CentOS 5.8. You can install
everything on one machine.
Chuck
From: Anthony Ikeda [mailto:anthony.ikeda@gmail.com]
Sent: Tuesday, October 02, 2012 1:23 PM
To: user@hive.apache.org
Subject: Hive does not run - Typical NoSuchFieldError
I've tried diffe
Yeah, seems like you can't do that. See:
http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%3CCAENxBwxkF--3PzCkpz1HX21=gb9yvasr2jl0u3yul2tfgu0...@mail.gmail.com%3E
But you can always compress your files in gzip format and they should be good
to go.
Richin
From: ext Connell, Chuck [mailto:c
Another solution would be:
Using a shell script, do the following:
1. unzip the txt files,
2. one by one, merge those 50 (or N number of) text files into one text file,
3. then zip/tar that bigger text file,
4. then that big zip/tar file can be uploaded into Hive.
Keshav C S
This could be a problem. Hive uses newline as the record separator, and a ZIP
file will certainly contain newline characters. So I doubt this is possible.
BUT, I would like to hear from anyone who has solved the "newline is always a
record separator" problem, because we ran into it for another type of
comp
Why do you think the current generated code is inefficient?
From: John Omernik [mailto:j...@omernik.com]
Sent: Tuesday, September 25, 2012 2:57 PM
To: user@hive.apache.org
Subject: Hive File Sizes, Merging, and Splits
I am really struggling trying to make heads or tails out of how to optimize t
This question might be better on the dev@hive mailing list...
From: Kasun Weranga [kas...@wso2.com]
Sent: Saturday, September 22, 2012 11:28 AM
To: user@hive.apache.org
Subject: Issue when trying to debug hive
Hi all,
I built hive trunk and tried to debug the hive c
file or run this command on the Hive shell.
OR
I found this in a previous thread. Add the following property to your
hive-site.xml:
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/me/my.jar,file:///home/you/your.jar,file:///home/us/our.jar</value>
</property>
Hope this helps.
Richin
From: ext Connell, Chuck [mailto:chuck.conn...@nua
You might try adding " --auxpath /path/to/jar/dir " to the Hive command line.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Manish [mailto:manishbh...@rocketmail.com]
Sent: Thursday, September 20, 2012 11:10 AM
To: user
Cc: u...@hadoop.apache.org
Subject: Hive custom inputfomat error.
something like
select *
from my_serde_table
where get_json_object(C, "$.D") ilike "chuck"
I like the serde table because it is much cleaner than creating a table with
(value string) and then using get_json_object or json_tuple to extract all the
columns.
5:58 PM
To: Connell, Chuck
Cc: user@hive.apache.org
Subject: Re: hive json serde
It works now. Looks like there is a bug in the code.
If I do hive --auxpath ./serde then I get an error, but if I give the full path,
as in hive --auxpath /var/lib/hdfs/serde/, then get_json_object() works.
Thanks for
I used his pre-built jar. No need to compile anything.
Be sure to add " --auxpath /path/to/jar/dir " to the Hive command line.
Chuck
From: Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Monday, September 17, 2012 3:06 PM
To: user@hive.apache.org
Subject: RE: hive json ser
I just finished testing this one. No problems found. The developer is also
quite responsive to issues raised. I encouraged him to submit it to the Hive
dev team as core code.
https://github.com/rcongiu/Hive-JSON-Serde/
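For reference, a minimal sketch of using it (assuming that project's SerDe class, org.openx.data.jsonserde.JsonSerDe; table and fields are hypothetical):

CREATE TABLE json_events (text1 STRING, array1 ARRAY<INT>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE;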
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Mark Golden [mail
I get the same warning. Could you please be more specific about "set the class
path wrt log4J property file"? Exactly what should we do?
Chuck Connell
Nuance R&D Data Team
Burlington, MA
-Original Message-
From: ashok.sa...@wipro.com [mailto:ashok.sa...@wipro.com]
Sent: Monday, Septemb
Hive does not support row-level (or field-level) updates. It is designed as a
WORM (write once read many) data warehouse.
You can of course code your own row updates by reading an entire Hive file,
modifying a row, then writing the file back to Hive.
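A sketch of that read-modify-rewrite pattern, with hypothetical names (Hive stages the output, so a table can overwrite itself):

INSERT OVERWRITE TABLE people
SELECT id,
       CASE WHEN id = 42 THEN 'new_name' ELSE name END AS name   -- the "update"
FROM people;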
Chuck Connell
Nuance R&D Data Team
Burlington, MA
: "jones", "array1" : [{json-object},{json-object}]}
I could extract only the top-level value array1, but could not "open up" that
array to do anything with its embedded elements, which are valid JSON objects!
Is this true?
Chuck
be faster for
such large files.
Regards,
Mohammad Tariq
On Fri, Sep 7, 2012 at 8:27 PM, Connell, Chuck
<chuck.conn...@nuance.com> wrote:
I cannot promise which is faster. A lot depends on how clever your scripts are.
From: Sandeep Reddy P
[mailto:sandeepreddy.3...@gma
I am using the json_tuple lateral view function. It works fine. But I am
wondering how to select individual elements from a returned array.
Here is an example...
$ cat array1.json
{"text1" : "smith", "array1" : [6,5,4]}
{"text1" : "jones", "array1" : [1,2,3]}
{"text1" : "white", "array1" : [9,8
but when I run that script on a 12GB CSV
it takes more time. If I run a Python script, will that be faster?
On Fri, Sep 7, 2012 at 10:39 AM, Connell, Chuck
<chuck.conn...@nuance.com> wrote:
How about a Python script that changes it into plain tab-separated text? So it
would look like this...
174969274	14-mar-2006	3522876
14-mar-2006	50308651
etc...
Tab-separated with newlines is easy to read and works perfectly on import.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
781-565
directory permissions
We usually start the shell through sudo, otherwise we get a "Permission denied"
while creating Hive tables.
But this is a good point, any suggestions/best practices from the user
community ?
Thanks
On Thu, Aug 16, 2012 at 9:37 AM, Connell, Chuck
mailto:chuck.conn...@
I have run into similar problems. Thanks for the suggestions. One concern...
Isn't hdfs a highly privileged user within the Hadoop cluster? So do we really
want it to be standard practice for all Hive users to su to hdfs?
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Himanish Kushary
This is actually not surprising. Hive is essentially a MapReduce compiler. It
is common for regular compilers (C, C#, Fortran) to emit faster assembler code
than you write yourself. Compilers know the tricks of their target language.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
-Origi
An idea, in addition to what others have said... You are starting with a
complex table schema. Start simpler, test it as you go, and then work up to
partitions and buckets. That way you make sure you have the basic features
working correctly.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
, "Connell, Chuck" wrote:
> I recommend the Cloudera CDH release of Hadoop and their auto-install tool.
> It saves a lot of config headaches.
>
> Chuck Connell
> Nuance R&D Data Team
> Burlington, MA
>
>
>
> -Original Message-
> From: abhiTowso
What about making your small files bigger, by ZIPping them together? Of course,
you have to think about this carefully, so MapReduce can efficiently retrieve
the files it needs without unzipping everything every time.
Chuck
From: richin.j...@nokia.com [mailto:richin.j...@nokia.com]
Sent: Frida
I recommend the Cloudera CDH release of Hadoop and their auto-install tool. It
saves a lot of config headaches.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
-Original Message-
From: abhiTowson cal [mailto:abhishek.dod...@gmail.com]
Sent: Friday, July 27, 2012 12:31 PM
To: user@hi
Can you export from DB2 to a plain text tab-separated file? You can certainly
import that to Hive.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Siddharth Tiwari [mailto:siddharth.tiw...@live.com]
Sent: Thursday, July 26, 2012 2:33 PM
To: hive user list
Subject: Creating Hive table by
I do not have answers to any of your questions, but I appreciate you raising
them. My team is very interested in Hive indexing as well, so I look forward to
this discussion.
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: John Omernik [mailto:j...@omernik.com]
Sent: Thursday, July 26,
I'll file a jira for this issue and update the same here.
Regards
Bejoy KS
From: "Connell, Chuck"
mailto:chuck.conn...@nuance.com>>
To: "user@hive.apache.org<mailto:user@hive.apache.org>"
mailto:user@hive.apache.org>>
Sent: We
I created a Hive table that consists of two files, names1.txt and names2.txt.
The table works correctly and answers all queries etc.
I want to REPLACE names2.txt with a modified version. I copied the new version
of names2.txt to the /tmp/input folder within HDFS. Then I tried the command:
hive
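For reference, the usual load commands (a sketch; note that OVERWRITE replaces every file in the table, not just one):

LOAD DATA INPATH '/tmp/input/names2.txt' INTO TABLE names;   -- adds a file
LOAD DATA INPATH '/tmp/input/' OVERWRITE INTO TABLE names;   -- replaces ALL files
-- to swap a single file, replace it directly in the table's HDFS directory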