unsubscribe

2015-09-02 Thread Sasha Ostrikov
unsubscribe

Re: Disabling local mode optimization

2015-09-02 Thread Daniel Haviv
Exactly the info I needed. Thanks Daniel > On 3 Sep 2015, at 09:02, sreebalineni . wrote: > > Hi, > > Is not it that you should set it true, by default it is disabled which is > false. > Hive analyzes the size of each map-reduce job in a query and may run it > locally if the following thre

Re: Disabling local mode optimization

2015-09-02 Thread sreebalineni .
Hi, Is not it that you should set it true, by default it is disabled which is false. Hive analyzes the size of each map-reduce job in a query and may run it locally if the following thresholds are satisfied: - The total input size of the job is lower than: hive.exec.mode.local.auto.inputby

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
Also, if I am walking this correctly writer.addRow(struct) may trigger my current thread to flush all the state for other writers running in different threads. This state isn't updated by the same lock, so my thread won't see the same state, which would explain the NPE. Another issue is that est

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
Walking the MemoryManager, and I have a few questions: # statements Every time you create a writer for a given thread (assuming the thread local version), you just update MemoryManager with the stripe size. The scale is just %heap / (#writer * stripe (assuming equal stripe size)). Periodically O
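
The scale described in the message above can be sketched in a few lines. This is only an illustration of the arithmetic as stated there (heap fraction divided by writers times stripe size), not the ORC implementation; the function name `memory_scale`, its parameters, and the clamp to 1.0 when the writers fit in the pool are assumptions made for this example.

```python
def memory_scale(heap_bytes, pool_fraction, writers, stripe_bytes):
    """Illustrative scale factor for a shared writer memory pool.

    All names here are hypothetical; this mirrors the description in the
    message, assuming every writer uses the same stripe size.
    """
    pool = heap_bytes * pool_fraction      # share of the heap given to writers
    requested = writers * stripe_bytes     # what all writers would like to buffer
    if requested <= pool:
        # Assumption: no scaling is needed while everything fits in the pool.
        return 1.0
    return pool / requested


# 4 GiB heap, 50% pool, 50 writers with 64 MiB stripes:
# pool = 2048 MiB, requested = 3200 MiB, so each writer gets 64% of its stripe.
scale = memory_scale(4 * 1024**3, 0.5, 50, 64 * 1024**2)
print(round(scale, 2))  # 0.64
```

Under this reading, adding a writer in any thread lowers the scale for every writer that shares the same pool.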

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
So, very quickly looked at the JIRA and I had the following question; if you have a pool per thread rather than global, then assuming 50% heap will cause writer to OOM with multiple threads, which is different than older (0.14) ORC, correct? https://github.com/apache/hive/blob/master/ql/src/java/o

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
Thanks for the jira, will see if that works for us. On Sep 2, 2015 7:11 PM, "Prasanth Jayachandran" < pjayachand...@hortonworks.com> wrote: > Memory manager is made thread local > https://issues.apache.org/jira/browse/HIVE-10191 > > Can you try the patch from HIVE-10191 and see if that helps? > >

Re: ORC NPE while writing stats

2015-09-02 Thread Prasanth Jayachandran
Memory manager is made thread local https://issues.apache.org/jira/browse/HIVE-10191 Can you try the patch from HIVE-10191 and see if that helps? On Sep 2, 2015, at 8:58 PM, David Capwell <dcapw...@gmail.com> wrote: I'll try that out and see if it goes away (not seen this in the past 24

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
I'll try that out and see if it goes away (not seen this in the past 24 hours, no code change). Doing this now means that I can't share the memory, so will prob go with a thread local and allocate fixed sizes to the pool per thread (50% heap / 50 threads). Will most likely be awhile before I can

Re: Wrong results from join query in Hive 0.13 and also 1.0 with reproduce.

2015-09-02 Thread Jim Green
Thanks Ashutosh. Actually for this kind of query, if I put the 2 filters in the WHERE clause instead of the ON clause, the query result is correct. Do you suggest we put all filters into the WHERE or the ON clause? And why? On Wed, Sep 2, 2015 at 3:13 PM, Ashutosh Chauhan wrote: > It indeed is. Title of bug is

Re: Wrong results from join query in Hive 0.13 and also 1.0 with reproduce.

2015-09-02 Thread Ashutosh Chauhan
It indeed is. Title of bug is symptom of problem and doesn't accurately describe the problem. Bug will be triggered if following conditions are met: If query contains 3 or more joins AND joins are merged (i.e. tables participating in two of those joins are joined on same keys) AND these merged joi

Request for write access to the Hive wiki

2015-09-02 Thread Aswathy C.S
Hi, I would like to get write access to the Hive wiki. My Confluence username: asreekumar. Thanks, Aswathy

Re: ORC NPE while writing stats

2015-09-02 Thread Owen O'Malley
(Dropping dev) Well, that explains the non-determinism, because the MemoryManager will be shared across threads and thus the stripes will get flushed at effectively random times. Can you try giving each writer a unique MemoryManager? You'll need to put a class into the org.apache.hadoop.hive.ql.i

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
Also, the data put in are primitives, structs (list), and arrays (list); we don't use any of the boxed writables (like text). On Sep 2, 2015 12:57 PM, "David Capwell" wrote: > We have multiple threads writing, but each thread works on one file, so > orc writer is only touched by one thread (never

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
We have multiple threads writing, but each thread works on one file, so orc writer is only touched by one thread (never cross threads) On Sep 2, 2015 11:18 AM, "Owen O'Malley" wrote: > I don't see how it would get there. That implies that minimum was null, > but the count was non-zero. > > The Co

Re: ORC NPE while writing stats

2015-09-02 Thread Owen O'Malley
I don't see how it would get there. That implies that minimum was null, but the count was non-zero. The ColumnStatisticsImpl$StringStatisticsImpl.serialize looks like: @Override OrcProto.ColumnStatistics.Builder serialize() { OrcProto.ColumnStatistics.Builder result = super.serialize(); OrcPr

Testing a HiveStoragePredicateHandler

2015-09-02 Thread Luke Lovett
I'm writing a HiveStoragePredicateHandler, and I'm trying to figure out the most appropriate way to write unit tests for the decomposePredicate method. I'm seeking advice for the best way to do this. The way I see it, there seem to be two obvious approaches: 1. Write a query as a string. Run it th

Re: Wrong results from join query in Hive 0.13 and also 1.0 with reproduce.

2015-09-02 Thread Jim Green
Hi Ashutosh, Is HIVE-10841 related? From the title of that JIRA, it says “where col is not null” caused the issue; however, the above reproduce did not have that clause. On Wed, Sep 2, 2015 at 2:24 AM, Ashutosh Chauhan wrote: > https://issues.apache.org/jira/browse/HIVE-10841 > > Thanks, > Ashutosh

RE: can we add column type in where clause in a hive query?

2015-09-02 Thread Ryan Harris
The fact that you have other data in the column (like letters) implies that you have the column stored as a string, so use a regex. SELECT CAST(mycol as BIGINT) WHERE mycol RLIKE '^-?[0-9.]+$' From: Mohit Durgapal [mailto:durgapalmo...@gmail.com] Sent: Wednesday, September 02, 2015 5:09 AM To
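
To see which values the pattern in the reply above lets through, here is a minimal Python sketch. It assumes Python's `re` module treats this particular pattern the same way as the Java regex engine behind Hive's RLIKE, which should hold for an expression this simple. The helper names `looks_numeric` and `digits_only` are made up for this example.

```python
import re

# Pattern copied from the reply: optional minus sign, then digits and/or dots.
# The ^ and $ anchors make the whole value match, not just a substring.
NUMERIC = re.compile(r"^-?[0-9.]+$")

# Hypothetical stricter variant without the dot, for integer-only columns.
INTEGER = re.compile(r"^-?[0-9]+$")


def looks_numeric(value):
    """True if the value passes the pattern used in the reply."""
    return NUMERIC.search(value) is not None


def digits_only(value):
    """True if the value is an optional minus sign followed by digits only."""
    return INTEGER.search(value) is not None


for sample in ["123", "-42", "12ab", "1.2.3"]:
    print(sample, looks_numeric(sample), digits_only(sample))
```

Because the character class includes `.`, the original pattern also accepts strings such as `1.2.3`. If the column should hold integers only, dropping the dot from the class may be the safer filter.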

Hive - Serializing Query Plans

2015-09-02 Thread Raajay
Hi, From the documents and code I realize that after Semantic Analysis, QueryPlan.java can be serialized to disk using the Thrift (toBinaryString()) method. Now if I want to execute the serialized query plan (say, on Tez), what should I do? By de-serializing the string, I can get back the api.Query ob

Disabling local mode optimization

2015-09-02 Thread Daniel Haviv
Hi, I would like to disable the optimization where a query that just selects data is running without mapreduce (local mode). hive.exec.mode.local.auto is set to false but hive still runs in local mode for some queries. How can I disable local mode completely? Thank you. Daniel

can we add column type in where clause in a hive query?

2015-09-02 Thread Mohit Durgapal
I would like to query a hive table only for those rows that have column1 as integer only. Due to some data corruption, without this check I am getting a lot of junk data (mixed integers & letters). I would like to get rid of that data by applying something like "where column1 is INT" kind of condition

Re: Wrong results from join query in Hive 0.13 and also 1.0 with reproduce.

2015-09-02 Thread Ashutosh Chauhan
https://issues.apache.org/jira/browse/HIVE-10841 Thanks, Ashutosh On Tue, Sep 1, 2015 at 6:00 PM, Jim Green wrote: > Seems Hive 1.2 fixed this issue. But not sure what is the JIRA related and > the possibility to backport this fix into Hive 0.13? > > > On Tue, Sep 1, 2015 at 5:35 PM, Jim Green