Let's see the minimal query that shows your problem, with some comments about
the cardinality of the tables in the join. Maybe there could be a crude
workaround using a temp table or some such device if nothing jumps out at
us.
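In the meantime, here's the crude kind of thing I have in mind (a sketch
only; table, column and key names below are made up):

    -- one knob worth trying first; Hive can split skewed keys itself:
    set hive.optimize.skewjoin=true;

    -- or peel the hot keys into a temp table by hand:
    CREATE TABLE a_hot AS
    SELECT * FROM big_a WHERE join_key IN ('hot1', 'hot2');

    CREATE TABLE a_rest AS
    SELECT * FROM big_a WHERE join_key NOT IN ('hot1', 'hot2');

    -- then join each piece against the other table separately and
    -- UNION ALL the results, so the two struggling reducers stop
    -- receiving all of the hot rows.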
On Thu, Jan 30, 2014 at 4:07 PM, Guy Doulberg wrote:
>
> hi guys
>
> I a
OK, here are the problem(s). Thrift has frame size limits, and thrift has to
buffer rows into memory.
The Hive thrift server has a heap size; it needs to be big in this case.
Your client needs a big heap size as well.
The way to do this query, if it is possible, may be turning rows lateral,
potentially by treating it
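For example, something along these lines might work (a sketch only; assumes
a made-up table wide_t with a large array column vals):

    SELECT id, v
    FROM wide_t
    LATERAL VIEW explode(vals) x AS v;

That keeps each individual row small, so thrift does not have to buffer one
enormous row in memory.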
hi guys
I am trying to optimize a Hive join query. I have a join of two big tables.
The join between them is taking too long; no matter how many reducers I set,
there are always two reducers struggling to finish at the end of the job.
The job does not always end; sometimes it fails with memory problems.
Thanks Roberto. Will try that out.
regards
Sunita
On Thu, Jan 30, 2014 at 10:14 AM, Roberto Congiu
wrote:
> Hi Sunita,
> yes, it's definitely possible and you should use Generic UDFs.
> I wrote one UDF that takes n arrays (each one with the same number of
> elements) and returns an array of str
Oh, thinking some more about this, I forgot to ask some other basic
questions.
a) What storage format are you using for the table (text, sequence, rcfile,
orc, or custom)? "show create table <table_name>" would yield that.
b) What command is causing the stack trace?
My thinking here is rcfile and orc are co
Thanks for the information. Up-to-date Hive. Cluster on the smallish side.
And, well, it sure looks like a memory issue :) rather than an inherent Hive
limitation, that is.
So, I can only speak as a user (i.e. not a Hive developer), but what I'd be
interested in knowing next is: is this via running hi
Hi Sunita,
yes, it's definitely possible and you should use Generic UDFs.
I wrote one UDF that takes n arrays (each one with the same number of
elements) and returns an array of structs, which is usually used in a
lateral view.
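For example, usage ends up looking roughly like this (the zip_arrays UDF and
the table and column names below are made up for illustration):

    SELECT t.id, x.pair.k, x.pair.v
    FROM some_table t
    LATERAL VIEW explode(zip_arrays(t.keys, t.vals)) x AS pair;

where zip_arrays takes two arrays and returns an array of structs, one
struct per position.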
A good article on how to write a generic UDF is this one:
http://www.ba
Hi,
It looks like you are passing 'dbType' as derby, but the Metastore
connection URL configured in hive-site.xml is for mysql. Both Hive and
schemaTool will use the metastore URL and driver configured in
hive-site.xml to connect to the database. If you intend to use derby as a
backend, please
Thanks. But if I assign a group of users to /apps/hive/warehouse, then
they can still create internal tables, which is what I am trying to prevent.
I am on version 0.12.0.2.0.6.0.
On Thu, Jan 30, 2014 at 11:55 AM, Peyman Mohajerian wrote:
> This is a known issue, it still will write somethin
This is a known issue; it still will write something at '/apps/hive/warehouse'.
It's best to assign a common group to your hive and hdfs users and assign
that group to both of these directories. I heard this issue is fixed in 0.12
or 0.13; others can confirm.
On Thu, Jan 30, 2014 at 8:27 AM, Alex N
Hi,
I am trying to enforce that all Hive tables are created as EXTERNAL. The way
I am doing this is by giving the location of the warehouse
(/apps/hive/warehouse in my case) permissions 000 (completely
inaccessible).
But then when I try to create an external table, I see that it still trie
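For reference, this is the kind of statement I am running (the table name
and location here are made up):

    CREATE EXTERNAL TABLE events (id BIGINT, payload STRING)
    LOCATION '/data/events';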
Can someone please suggest if this is doable or not? Is a generic UDF the
only option? How would using a generic vs. a simple UDF make any difference,
since I would be returning the same object either way?
Thank you
Sunita
-- Forwarded message --
From: Sunita Arvind
Date: Wednesday, J
Hi all!
I'm having a performance problem with querying data from HBase using Hive. I
use CDH 4.5 (hbase-0.94.6, hive-0.10.0 and hadoop-yarn-2.0.0) on a cluster of
10 hosts. Right now it stores 3 TB of data in an HBase table which now
consists of 1000+ regions. One record in it looks like this:
hba
We are using Hive 0.12.0, but it doesn't work any better on Hive 0.11.0 or
Hive 0.10.0.
Our Hadoop version is 1.1.2.
Our cluster is 1 master + 4 slaves, each with 1 dual-core Xeon CPU (with
hyperthreading, so 4 cores per machine) + 16 GB RAM.
The error message I get is:
2014-01-29 12:41:09,086 ERROR
Hi Nitin,
Thanks a ton for the quick response.
Could you please share the SQL syntax for this?
Thanks,
Raj.
On Thu, Jan 30, 2014 at 3:29 PM, Nitin Pawar wrote:
> The easiest way to do it is to write the rows into a temp table, then
> select the distinct rows and write them back to the real table.
>
>
> On Thu, Jan 30
The easiest way to do it is to write the rows into a temp table, then
select the distinct rows and write them back to the real table.
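Something like this, for example (an untested sketch; the table names are
made up):

    -- stage the distinct rows in a temp table
    CREATE TABLE tmp_t AS
    SELECT DISTINCT * FROM real_t;

    -- overwrite the real table with the deduped rows
    INSERT OVERWRITE TABLE real_t
    SELECT * FROM tmp_t;

    DROP TABLE tmp_t;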
On Thu, Jan 30, 2014 at 3:19 PM, Raj hadoop wrote:
> Hi,
>
> Can someone help me with how to delete duplicate records in a Hive table?
>
> I know that delete and update are not supported by h
Hi,
Can someone help me with how to delete duplicate records in a Hive table?
I know that delete and update are not supported by Hive, but still,
if someone knows an alternative it would help me.
Thanks,
Raj.
Hive metastore schematool is using MySQL instead of Derby.
schematool -dbType derby -initSchema
Metastore connection URL: jdbc:mysql://localhost/metastore
Metastore Connection Driver: com.mysql.jdbc.Driver
Metastore connection User: hive
schematool -dbType derby -info
Metastore connection URL: