Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Hi, Yang, you're running your mapreduce jobs in Hadoop's local mode, and in that mode all the Hive MR logging is handled through log4j on your local machine, which is what this log file is about. The log location and naming is controlled by the property log4j.appender.FA.File in the Hive log4j pr
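A rough illustration of how a log4j-style `${...}` property such as log4j.appender.FA.File could expand into a per-query log path like the one discussed in this thread. The property values and the runtime values below are assumptions made for the sketch (they are not copied from any particular Hive install), and the resolver is a simplified stand-in for log4j's own substitution, so check your own properties file before relying on it.

```python
import re

# Assumed property values for illustration only; your Hive log4j
# properties file may define these differently.
props = {
    "hive.log.dir": "${java.io.tmpdir}/${user.name}",
    "hive.query.id": "hadoop",
    "hive.log.file": "${hive.query.id}.log",
    "log4j.appender.FA.File": "${hive.log.dir}/${hive.log.file}",
}

# Hypothetical runtime values standing in for JVM system properties.
system = {
    "java.io.tmpdir": "/tmp",
    "user.name": "myuser",
    "hive.query.id": "myuser_20140716143232_d76043ed",
}

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(key, props, system, depth=0):
    """Expand ${name} references, preferring system values over file values."""
    if depth > 10:
        raise ValueError("placeholder nesting too deep: " + key)
    value = system.get(key, props.get(key))
    if value is None:
        raise KeyError(key)
    # Recursively expand any placeholders found inside the value.
    return PLACEHOLDER.sub(
        lambda m: resolve(m.group(1), props, system, depth + 1), value
    )


log_path = resolve("log4j.appender.FA.File", props, system)
print(log_path)
```

Under these assumed values the directory comes from hive.log.dir and the file name from the query id, which would explain a log named after the user plus a timestamp and UUID rather than hive.log.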

random NPE in HiveInputFormat.init() ??

2014-07-18 Thread Yang
we are getting a random (happening about 20% of the time, if we repeatedly run the same query) error with hive 0.13.0.2 java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255) at org.apache.hadoop.hive.ql.io.HiveInputFormat.get

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
it's in /tmp/my_user/. The funny thing is that I already have a hive.log there. On Fri, Jul 18, 2014 at 6:01 PM, Andre Araujo wrote: > and where is it located? > > > On 19 July 2014 10:58, Andre Araujo wrote: > >> Can you give us an excerpt of the contents of this log? >> >> >> On 19 July 201

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
2014-07-18 15:03:37,774 INFO mr.ExecDriver (SessionState.java:printInfo(537)) - Execution log at: /tmp/myuser/myuser_20140718150303_56bf6bb0-db30-4dbc-807c-9023ce4103f4.log 2014-07-18 15:03:37,864 WARN conf.Configuration (Configuration.java:loadProperty(2358)) - file:/tmp/myuser/hive_2014-07-18_

Re: Hive huge 'startup time'

2014-07-18 Thread Db-Blog
Hello everyone, Thanks for sharing valuable inputs. I am working on a similar kind of task; it would be really helpful if you could share the command for increasing the heap size of the hive-cli/launching process. Thanks, Saurabh Sent from my iPhone, please avoid typos. > On 18-Jul-2014, at 8:23 pm

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
and where is it located? On 19 July 2014 10:58, Andre Araujo wrote: > Can you give us an excerpt of the contents of this log? > > > On 19 July 2014 04:38, Yang wrote: > >> thanks guys. anybody knows what generates the log like " >> myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.l

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Can you give us an excerpt of the contents of this log? On 19 July 2014 04:38, Yang wrote: > thanks guys. anybody knows what generates the log like " > myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.log" ? I > checked our application code, it doesn't generate this, looks from hive

Re: how to control hive log location on 0.13?

2014-07-18 Thread Lefty Leverenz
Thanks André, I've added the sticky bit advice to Error Logs. -- Lefty On Fri, Jul 18, 2014 at 2:38 PM, Yang wrote: > thanks guys. anybody knows what generates the log like " > myuser_20140716143232_

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
thanks guys. anybody knows what generates the log like " myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.log" ? I checked our application code, it doesn't generate this, looks from hive. On Fri, Jul 18, 2014 at 12:28 AM, Andre Araujo wrote: > Make sure the directory you specify has

Hive support for filtering Unicode data

2014-07-18 Thread Duc le anh
Hello Hive, I posted the below question on Stackoverflow

RE: Hive Join Running Out of Memory

2014-07-18 Thread Clay McDonald
I changed the hive.auto.convert.join.noconditionaltask = false in the hive site and that seemed to do the trick. Thanks! From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, July 18, 2014 10:57 AM To: user@hive.apache.org Subject: Re: Hive Join Running Out of Memory I believe th

Re: Hive Join Running Out of Memory

2014-07-18 Thread Edward Capriolo
I believe that would be the one. On Fri, Jul 18, 2014 at 10:54 AM, Clay McDonald < stuart.mcdon...@bateswhite.com> wrote: > Thank you. Would it be acceptable to use the following? > > SET hive.exec.mode.local.auto=false; > > > From: Edward Capriolo [mailto:edlinuxg...@gmail.com] > Sent: Friday,

RE: Hive Join Running Out of Memory

2014-07-18 Thread Clay McDonald
Thank you. Would it be acceptable to use the following? SET hive.exec.mode.local.auto=false; From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, July 18, 2014 10:45 AM To: user@hive.apache.org Subject: Re: Hive Join Running Out of Memory This is a failed optimization hive is try

Re: Hive huge 'startup time'

2014-07-18 Thread Edward Capriolo
Unleash ze file crusha! https://github.com/edwardcapriolo/filecrush On Fri, Jul 18, 2014 at 10:51 AM, diogo wrote: > Sweet, great answers, thanks. > > Indeed, I have a small number of partitions, but lots of small files, > ~20MB each. I'll make sure to combine them. Also, increasing the heap s

Re: Hive huge 'startup time'

2014-07-18 Thread diogo
Sweet, great answers, thanks. Indeed, I have a small number of partitions, but lots of small files, ~20MB each. I'll make sure to combine them. Also, increasing the heap size of the cli process already helped speed it up. Thanks, again. On Fri, Jul 18, 2014 at 10:26 AM, Edward Capriolo wrote:

Re: Hive Join Running Out of Memory

2014-07-18 Thread Edward Capriolo
This is a failed optimization: hive is trying to build the lookup table locally, put it in the distributed cache, and then do a map join. Look through your hive site for the configuration to turn these auto-map joins off. Based on your version the variables changed names / were deprecated etc so

Hive Join Running Out of Memory

2014-07-18 Thread Clay McDonald
Hello everyone. I need some assistance. I have a join that fails with return code 3. The query is: SELECT B.CARD_NBR AS CNT FROM TENDER_TABLE A JOIN LOYALTY_CARDS B ON A.CARD_NBR = B.CARD_NBR LIMIT 10; -- Row Counts -- LOYALTY_CARDS = 43,876,938 -- TENDER_TABLE = 1,412,228,333 The query exe

Re: Hive huge 'startup time'

2014-07-18 Thread Edward Capriolo
The planning phase needs to do work for every hive partition and every hadoop file. If you have a lot of 'small' files or many partitions this can take a long time. Also the planning phase that happens on the job tracker is single threaded. Also the new yarn stuff requires back and forth to alloca
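Since planning cost grows with the number of files and partitions, a quick per-directory count of small files can show where combining would help most. The sketch below is only an illustration: it walks a local directory tree (a table stored in HDFS would need an HDFS listing instead), and the function name, the threshold, and the demo partition layout are all hypothetical.

```python
import os
import tempfile


def small_file_report(root, threshold_bytes):
    """Return {relative_dir: (file_count, small_file_count)} under root."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if not filenames:
            continue  # skip directories that only contain subdirectories
        sizes = [os.path.getsize(os.path.join(dirpath, name)) for name in filenames]
        small = sum(1 for size in sizes if size < threshold_bytes)
        report[os.path.relpath(dirpath, root)] = (len(sizes), small)
    return report


# Tiny demo on a throwaway directory laid out like one partition.
with tempfile.TemporaryDirectory() as root:
    part = os.path.join(root, "dt=2014-07-18")
    os.makedirs(part)
    for i, size in enumerate([10, 20, 5000]):
        with open(os.path.join(part, f"part-{i:05d}"), "wb") as fh:
            fh.write(b"x" * size)
    demo = small_file_report(root, threshold_bytes=1000)
    print(demo)
```

Directories where most files fall under the threshold are the natural candidates for merging before querying.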

Re: Hive huge 'startup time'

2014-07-18 Thread Prem Yadav
Maybe you can post your partition structure and the query. Over-partitioning data is one of the reasons this happens. On Fri, Jul 18, 2014 at 2:36 PM, diogo wrote: > This is probably a simple question, but I'm noticing that for queries that > run on 1+TB of data, it can take Hive up to 30 minute

Hive huge 'startup time'

2014-07-18 Thread diogo
This is probably a simple question, but I'm noticing that for queries that run on 1+TB of data, it can take Hive up to 30 minutes to actually start the first map-reduce stage. What is it doing? I imagine it's gathering information about the data somehow, this 'startup' time is clearly a function of

ERROR in JDBC

2014-07-18 Thread CHEBARO Abdallah
Hello Hive Community, I am trying to run the JDBC (from cwiki.apache.org), using HiveServer2. Everything in the Java code (attached above) runs well except for the last query: sql = "select * from " + tableName; Attached is the complete log file of several runs. I have noticed the following er

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Make sure the directory you specify has the sticky bit set, otherwise users will have permission problems: chmod 1777 On 18 July 2014 14:19, Satish Mittal wrote: > You can configure the following property in > $HIVE_HOME/conf/hive-log4j.properties: > > hive.log.dir= > > The default value of t
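For anyone scripting the setup, here is a minimal sketch of creating a shared log directory with the sticky bit, equivalent in intent to the chmod 1777 advice above. The helper name and the temporary path are hypothetical, and the sketch assumes a POSIX filesystem where the caller owns the directory.

```python
import os
import stat
import tempfile


def make_shared_log_dir(path):
    """Create path if needed, set mode 1777, and return the resulting mode bits."""
    os.makedirs(path, exist_ok=True)
    # 0o1777 = world-writable plus the sticky bit, so users can create
    # their own log files but cannot delete files owned by other users.
    os.chmod(path, 0o1777)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demo in a throwaway location; substitute the directory you configured.
base = tempfile.mkdtemp()
log_dir = os.path.join(base, "hive-logs")
mode = make_shared_log_dir(log_dir)
has_sticky = bool(mode & stat.S_ISVTX)
print(oct(mode), has_sticky)
```

Using os.chmod after os.makedirs, rather than passing the mode to makedirs, avoids the process umask silently stripping the write bits.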