Figured it out myself by running with '-hiveconf
hive.root.logger=INFO,console'; it was binlog
---
Hey Guys,
I've been playing with Hive for a while now, but somehow I've run into this
error all of a sudden when setting up my production cluster.
$ hive -e 'LOAD DATA INPATH "/tmp/members_map_2012-06-30.map" OVERWRITE INTO
TABLE members_map_full;'
Loading data to table hyves_goldmine.members_map_
Sorry, can't help you with your specific problem, but in case you're really
stuck:
I used the JSON SerDe (https://github.com/rcongiu/Hive-JSON-Serde, this one is
better than the default one) and it converts nested arrays into maps perfectly.
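For anyone following along, wiring that SerDe in looks roughly like this; the jar path, table, and column names below are my own illustration, not from this thread:

```sql
-- Assumed jar location and illustrative schema; the SerDe class name is the
-- one shipped with rcongiu/Hive-JSON-Serde.
ADD JAR /path/to/json-serde.jar;

CREATE TABLE events_json (
  member_id  INT,
  properties MAP<STRING, STRING>  -- nested JSON object exposed as a map
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
```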
Going to bump this one since I hope to be able to contribute some (worth a bump
:P)
-Original Message-
From: Ruben de Vries [mailto:ruben.devr...@hyves.nl]
Sent: Friday, June 15, 2012 11:59 AM
To: user@hive.apache.org
Subject: Hive-0.8.1 PHP Thrift client broken?
Hey Guys,
I've been slamming my head against a wall on this issue before, but now that
I'm a bit more familiar with Hive and Thrift (I got the Python version
working) I figured I should try fixing the problem, or find out more about it
so I can contribute something to the project too :)
The PHP Thrift client
on completion of the job.
Though failed / killed jobs leave data there, which needs to be removed
manually.
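A manual cleanup along those lines can be scripted; this is only a sketch against a throwaway directory standing in for the real scratch dir (/tmp/hive-<user>, per hive.exec.scratchdir):

```shell
# Sketch only: clean leftover scratch dirs from failed/killed Hive jobs.
# Uses a throwaway directory as a stand-in for /tmp/hive-hduser.
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/hive_2012-06-15_dead-job"   # simulate leftover job data
# Remove leftover hive_* job dirs; in real use consider adding '-mtime +1'
# so data from still-running jobs is kept.
find "$SCRATCH" -mindepth 1 -maxdepth 1 -name 'hive_*' -exec rm -rf {} +
ls -A "$SCRATCH"   # nothing left
rmdir "$SCRATCH"
```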
Thanks,
Vinod
http://blog.vinodsingh.com/
On Fri, Jun 1, 2012 at 1:58 PM, Ruben de Vries wrote:
Hey Hivers,
I'm almost ready to replace our old Hadoop implementation with an
implementation using Hive.
Now I've run into (hopefully) my last problem: my /tmp/hive-hduser dir is
getting kinda big!
It doesn't seem to clean up these tmp files; googling for it I ran into some
tickets about a cleanup
Partitioning can greatly increase performance for WHERE clauses, since Hive
can skip reading the data in partitions which do not meet the requirement.
For example, if you partition by date (I do it with an INT dateint, which I
set to YYYYMMDD) and you do WHERE dateint >= 20120101, only the matching
partitions are scanned.
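A minimal sketch of that setup (table and column names are illustrative, not from this thread):

```sql
-- Illustrative: partition visit stats by an INT dateint in YYYYMMDD form.
CREATE TABLE visit_stats (
  member_id INT,
  visits    INT
)
PARTITIONED BY (dateint INT);

-- Partitions with dateint < 20120101 are pruned and never read.
SELECT member_id, SUM(visits)
FROM visit_stats
WHERE dateint >= 20120101
GROUP BY member_id;
```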
Hey,
We use hadoop/hive for processing our access logs and we run a daily cronjob
(python script) which does the parsing jobs and some partitioning etc.
The results from those jobs are then queried on by other jobs which generate
the results the management team wants to see :-)
https://gist.github.com/2499658
I've also created a ticket in JIRA but it doesn't seem to get any attention at
all: https://issues.apache.org/jira/browse/HIVE-2992
Greetz, Ruben de Vries
I really do feel like this isn't as intended; should I make a ticket in JIRA?
-Original Message-
From: Ruben de Vries [mailto:ruben.devr...@hyves.nl]
Sent: Thursday, April 26, 2012 3:37 PM
To: user@hive.apache.org
Subject: RE: JOIN + LATERAL VIEW + MAPJOIN = no output?!
https://gist.github.com/2499658
and this is the plan.xml it's using
-Original Message-
From: Ruben de Vries [mailto:ruben.devr...@hyves.nl]
Sent: Thursday, April 26, 2012 3:17 PM
To: user@hive.apache.org
Subject: JOIN + LATERAL VIEW + MAPJOIN = no output?!
Okay, first off: JOIN + LATERAL VIEW together isn't working, so I moved my
JOIN into a subquery and that makes the query work properly.
However, when I add a MAPJOIN hint for the JOIN in the subquery, it also stops
running the reducer for the main query!
This only happens when there's a LATERAL VIEW
from 350sec to 110sec when being able to
MAPJOIN(), gotta love that speed if it works!
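A hedged reconstruction of the workaround being described: do the (map) join in a subquery, then apply the LATERAL VIEW in the outer query. Column names and the `stats` map are illustrative, based only on the tables mentioned in this thread:

```sql
-- Join in a subquery (with the MAPJOIN hint on the small table's alias),
-- then explode the map in the outer query via LATERAL VIEW.
SELECT t.member_id, stat.name, stat.cnt
FROM (
  SELECT /*+ MAPJOIN(m) */ v.member_id, v.stats
  FROM visit_stats v
  JOIN members_map m ON v.member_id = m.member_id
) t
LATERAL VIEW explode(t.stats) stat AS name, cnt;
```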
-Original Message-
From: Ruben de Vries [mailto:ruben.devr...@hyves.nl]
Sent: Thursday, April 26, 2012 9:16 AM
To: user@hive.apache.org; gemini5201...@gmail.com; mgro...@oanda.com
Subject: RE: When/how to use partitions and buckets usefully?
OANDA Corporation
www: oanda.com www: fxtrade.com
e: mgro...@oanda.com
"Best Trading Platform" - World Finance's Forex Awards 2009.
"The One to Watch" - Treasury Today's Adam Smith Awards 2009.
- Original Message -
From: "Ruben de Vries"
To: user@hive.apache.org
Try 25M for the small table: copy your hive-default.xml to hive-site.xml and
set hive.mapjoin.smalltable.filesize=3
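The override would go in hive-site.xml along these lines; the exact byte value in this thread was cut off, so the figure below is only an example (the property is in bytes):

```xml
<!-- Goes inside <configuration> in hive-site.xml. The 25 MB value is an
     example matching the "25M" suggestion above, not a recommendation. -->
<property>
  <name>hive.mapjoin.smalltable.filesize</name>
  <value>25000000</value>
</property>
```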
On Apr 25, 2012 at 12:09 AM, Ruben de Vries wrote:
I got the (rather big) log here in a github gist:
https://gist.github.com/2480893
And I also attached the plan.xml it was using to the gist.
- Original Message -
From: "Ruben de Vries"
To: user@hive.apache.org
Sent: Monday, April 23, 2012 9:08:16 AM
S
logging and post the console output to get a better picture of
why it consumes this much memory.
Start your hive shell as
hive -hiveconf hive.root.logger=ALL,console;
Regards
Bejoy KS
____
From: Ruben de Vries
To: "user@hive.apache.org"
Sent: Tuesday, April 24,
splitting to mappers is out of the question.
Can you do a dfs count on the members_map table's HDFS location and tell us
the result?
On Tue, Apr 24, 2012 at 2:06 PM, Ruben de Vries wrote:
Hmm I must be doing something wrong, the members_map table is 300ish MB.
When I execute the following query:
S
46 PM
Subject: Re: When/how to use partitions and buckets usefully?
If you are doing a map-side join, make sure the table members_map is small
enough to hold in memory.
On 4/24/12, Ruben de Vries wrote:
Wow thanks everyone for the nice feedback!
I can force a map-side join by doing /*+ STREAMTABLE(members_map) */ right?
Cheers,
Ruben de Vries
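For reference, the hint that actually requests a map-side join in Hive is MAPJOIN; STREAMTABLE is a different hint (it marks which table to stream through the reducers in a common join). A sketch using the table names from this thread, with illustrative columns:

```sql
-- MAPJOIN (not STREAMTABLE) forces the map-side join; members_map must then
-- fit in memory on every mapper. Column names here are assumptions.
SELECT /*+ MAPJOIN(members_map) */ v.member_id, v.visits
FROM visit_stats v
JOIN members_map ON v.member_id = members_map.member_id;
```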
-Original Message-
From: Mark Grover [mailto:mgro...@oanda.com]
Sent: Tuesday, April 24, 2012 3:17 AM
To: user@hive.apache.org; bejoy ks
Cc
joins would offer you much performance improvement.
Regards
Bejoy KS
Sent from handheld, please excuse typos.
________
From: Ruben de Vries <ruben.devr...@hyves.nl>
Date: Mon, 23 Apr 2012 17:38:20 +0200
To: user@hive.apache.org
avoid having as many rows from
visit_stats compared to each member_id for joins.
Matt Tucker
From: Ruben de Vries
[mailto:ruben.devr...@hyves.nl]
Sent: Monday, April 23, 2012 11:19 AM
To: user@hive.apache.org
Subject:
It seems there's enough information to be found on how to set up and use
partitions and buckets.
But I'm more interested in how to figure out when and what columns you should
be partitioning and bucketing to increase performance?!
In my case I got 2 tables, 1 visit_stats (member_id, date and some
It's a bit of a weird case but I thought I might share it and hopefully find
someone who can confirm this to be a bug or tell me I should do things
differently!
Here you can find a pastie with the full create and select queries:
http://pastie.org/3838924
I've got two tables:
`visit_stats` with
INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
Dilip
On Thu, Apr 19, 2012 at 6:09 AM, Owen O'Malley <omal...@apache.org> wrote:
On Thu, Apr 19, 2012 at 3:07 AM, Ruben de Vries <ruben.devr...@hyves.nl> wrote:
> I'm trying to migrate a part of our current hadoop j
and there's already code for working with Avro in MR as input.)
On Apr 19, 2012, at 6:15 AM, madhu phatak wrote:
A SerDe will allow you to create custom data from your sequence file:
https://cwiki.apache.org/confluence/display/Hive/SerDe
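Putting the two pieces together, a table that reads those sequence files might be declared roughly like this; the SerDe class and columns are hypothetical, only the InputFormat class name comes from this thread:

```sql
-- Hedged sketch: custom SequenceFile InputFormat plus a custom SerDe.
CREATE TABLE member_events (
  member_id INT,
  payload   STRING
)
ROW FORMAT SERDE 'com.mycompany.hive.MemberEventSerDe'
STORED AS
  INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```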
On Thu, Apr 19, 2012 at 3:37 PM, Ruben de Vries <ruben.devr...@hyves.nl> wrote:
I'm trying to migrate a part of our current Hadoop jobs from normal MapReduce
jobs to Hive.
Previously the data was stored in se
omSeqRecordReader extends
SequenceFileRecordReader implements RecordReader {
Hope someone has a snippet or can help me out; would really love to be able to
switch part of our jobs to Hive.
Ruben de Vries