Re: Question: Hive String Concatenation with NULLs - Behavior and Rationale

2025-05-19 Thread Stamatis Zampetakis
Hi Sadegh, If I recall correctly the behavior of the string concatenation is defined by the SQL standard (ISO/IEC 9075). Section 6.29: General rules 2.b.i: If at least one of S1 and S2 is the null value, then the result of the <concatenation> is the null value. Hive as well as many other DBMS systems strive to rema
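
The NULL-propagation rule quoted above can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption made only so the example is self-contained); SQLite's `||` operator follows the same rule from the standard. The COALESCE guard is one common way to treat NULL as an empty string and is not taken from the thread.

```python
import sqlite3

# Assumption: SQLite stands in for Hive here. Its || operator also
# returns NULL as soon as one operand is NULL.
conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a query that yields a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


plain = scalar("SELECT 'Hive' || ' rocks'")            # both operands non-NULL
with_null = scalar("SELECT 'Hive' || NULL")            # one operand is NULL
guarded = scalar("SELECT 'Hive' || COALESCE(NULL, '')")  # NULL replaced by ''

print(plain, with_null, guarded)
```

In Python, SQL NULL comes back as `None`, so `with_null` is `None` while `guarded` keeps the non-NULL operand.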

Re: Question on Apache Hive + AWS Glue Data Catalog

2025-04-28 Thread David Novogrodsky
Unsubscribe David Novogrodsky david.novogrod...@gmail.com http://www.linkedin.com/in/davidnovogrodsky On Fri, Apr 25, 2025 at 9:34 AM Sungwoo Park wrote: > Hello, > > I am wondering if anyone uses Apache Hive 3 or 4 with AWS Glue Data > Catalog. There is a git repository for this purpose: > >

Re: Question: Disabling HMS S3 access when running as Spark sidecar

2025-01-30 Thread Mich Talebzadeh
Check your hive-site.xml. What is this set to? <property> <name>hive.metastore.uris</name> <value>thrift://rhes75:9083</value> <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description> </property> If you find any S3-related configurations (like fs.defaultFS pointing to S3) in hive-site.xml, tha

Re: Question related to reuse of BytesColumnVector.vector[][].

2024-10-02 Thread Stamatis Zampetakis
Hello, The error looks like a bug and is data/query specific thus I assume reproducible. I would suggest filing a JIRA ticket with as many details as possible (query, DDLs, logs, plans, data) to reproduce the issue. Best, Stamatis On Sun, Sep 29, 2024 at 7:51 AM lisoda wrote: > > Currently, whe

Re: Question on Hive Metastore catalog support

2023-11-16 Thread Butao Zhang
Hi, maybe you can check this ticket https://issues.apache.org/jira/browse/HIVE-26227 Thanks, Butao Zhang Replied Message | From | Flavio Junqueira | | Date | 11/15/2023 17:26 | | To | | | Subject | Question on Hive Metastore catalog support | Hello there, I'm interested in underst

Re: question about a beeline variable

2022-02-27 Thread Bitfox
I got the idea it's the null value in Hive. 0: jdbc:hive2://localhost:1/default> select size(null); +--+ | _c0 | +--+ | -1 | +--+ Thanks On Sun, Feb 27, 2022 at 4:02 PM Bitfox wrote: > what does this -1 value mean? > > > set mapred.reduce.tasks; > > +--

Re: Question regarding lock manager

2021-09-07 Thread Antoine DUBOIS
lege.synchronizer however this change seems very recent and there's no reference to such key in documentation either. Thank you for the tips ;-) De: "Rajesh Balamohan" À: user@hive.apache.org Envoyé: Mardi 7 Septembre 2021 00:46:43 Objet: Re: Question regarding lock mana

Re: Question regarding lock manager

2021-09-06 Thread Rajesh Balamohan
. > > -- > *De: *"Antoine DUBOIS" > *À: *user@hive.apache.org > *Envoyé: *Lundi 6 Septembre 2021 10:03:59 > *Objet: *Re: Question regarding lock manager > > Hello Alan, > Thank you for your answer, > I'm pretty sure I've respected

Re: Question regarding lock manager

2021-09-06 Thread Antoine DUBOIS
sible for the use of zookeeper by hiveserver2. However why using ZooKeeper is useful or relevant is another story I can't tell. Regards - Mail original - De: "Jan Fili" À: user@hive.apache.org Envoyé: Lundi 6 Septembre 2021 16:33:28 Objet: Re: Question regarding lock manag

Re: Question regarding lock manager

2021-09-06 Thread Antoine DUBOIS
À: user@hive.apache.org Envoyé: Lundi 6 Septembre 2021 10:03:59 Objet: Re: Question regarding lock manager Hello Alan, Thank you for your answer, I'm pretty sure I've respected the guide provided and did not set any discovery service. However I'm trying to setup ranger as well a

Re: Question regarding lock manager

2021-09-06 Thread Jan Fili
version but I cannot find a proper > compatibility matrix for all hadoop ecosystem. > Thank you very much. > > Antoine > > > De: "Alan Gates" > À: user@hive.apache.org > Envoyé: Vendredi 3 Septembre 2021 17:51:41 > Objet

Re: Question regarding lock manager

2021-09-06 Thread Antoine DUBOIS
Septembre 2021 17:51:41 Objet: Re: Question regarding lock manager You do not need ZooKeeper to use ACID in Hive. The first thing I would check is that you have configured your system as described on this page: [ https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions | https

Re: Question regarding lock manager

2021-09-03 Thread Alan Gates
You do not need ZooKeeper to use ACID in Hive. The first thing I would check is that you have configured your system as described on this page: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions. Also, make sure you have not set hive.lock.manager to zookeeper. There are other fea

Re: Question on Hive metastore thrift uri

2019-06-25 Thread Alan Gates
It depends on how you configure the system. If you are using HS2 you can configure it to talk directly to the metastoredb (by providing it with the JDBC connection information and setting the metastore thrift url to localhost) or to talk through a metastore server instance (by not providing the JD

Re: Re: Question about INSERT OVERWRITE TABLE with dynamic partition

2018-10-25 Thread anci_...@yahoo.com
ate: 2018-10-25 08:34 To: user Subject: Re: Question about INSERT OVERWRITE TABLE with dynamic partition A logical explanation could be:- In the first query, you are telling hive which partition to overwrite, so a step which actually deletes the partition data and overwrites it with the query res

Re: Question about INSERT OVERWRITE TABLE with dynamic partition

2018-10-24 Thread Tanvi Thacker
A logical explanation could be:- In the first query, you are telling hive which partition to overwrite, so a step which actually deletes the partition data and overwrites it with the query result, knows that which partition to delete and there is an empty result/file to move. but for the second qu

Re: Re: Question about OVER clause

2018-09-27 Thread anci_...@yahoo.com
'-MM-dd') Thanks! anci_...@yahoo.com From: Alan Gates Date: 2018-09-22 07:19 To: user; anci_sun Subject: Re: Question about OVER clause This article might be helpful. It's for SQL Server, but the semantics should be similar. https://www.sqlpassion.at/archive/2015/0

Re: Question about OVER clause

2018-09-21 Thread Alan Gates
This article might be helpful. It's for SQL Server, but the semantics should be similar. https://www.sqlpassion.at/archive/2015/01/22/sql-server-windowing-functions-rows-vs-range/ Alan. On Wed, Sep 19, 2018 at 6:47 AM 孙志禹 wrote: > Dears, >What is the difference between *ROW BETWEEN* and *
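
As a rough illustration of the ROWS versus RANGE distinction the linked article covers, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module rather than Hive (an assumption for self-containment; window functions need SQLite 3.25 or newer), and the `sales` table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two rows share day = 2, so they are peers in the ordering.
conn.execute("CREATE TABLE sales (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10), (2, 20), (2, 20), (3, 40)],
)

query = """
SELECT day,
       SUM(amount) OVER (ORDER BY day
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
       SUM(amount) OVER (ORDER BY day
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
FROM sales
ORDER BY day, rows_sum
"""
# ROWS counts physical rows up to the current one; RANGE also includes
# every peer row that has the same ORDER BY value as the current row.
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The two running totals differ only on the first of the tied rows: ROWS stops at that row, while RANGE already includes its peer.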

Re: Question about efficiency of SELECT DISTINCT

2018-07-02 Thread Furcy Pin
Hi, They are rigorously equivalent. You can see this with the following queries: CREATE TABLE t1 (a INT, b INT, c INT) ; EXPLAIN SELECT DISTINCT a,b,c FROM t1 ; EXPLAIN SELECT a,b,c FROM t1 GROUP BY a,b,c ; Both queries will return the exact same query plan: Stage-0 Fetch Operator
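
The equivalence claimed above can be checked at the level of results with a short sketch. It uses SQLite via Python's sqlite3 module instead of Hive (an assumption), so it compares the returned rows only; it says nothing about Hive's query plans, which the EXPLAIN statements in the message address. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INT, b INT, c INT)")
# Hypothetical rows, including duplicates.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 1), (1, 2, 3), (2, 2, 2), (1, 2, 3)],
)

# Sort both result sets so the comparison does not depend on row order.
distinct_rows = sorted(conn.execute("SELECT DISTINCT a, b, c FROM t1").fetchall())
grouped_rows = sorted(conn.execute("SELECT a, b, c FROM t1 GROUP BY a, b, c").fetchall())

print(distinct_rows == grouped_rows, distinct_rows)
```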

Re: Question on accessing LLAP as data cache from external containers

2018-02-02 Thread Gopal Vijayaraghavan
> For example, a Hive job may start Tez containers, which then retrieve data > from LLAP running concurrently. In the current implementation, this is > unrealistic That is how LLAP was built - to push work from Tez to LLAP vertex by vertex, instead of an all-or-nothing implementation. Here ar

Re: Question on accessing LLAP as data cache from external containers

2018-01-31 Thread Sungwoo Park
Thanks for the link. My question was how to access LLAP daemon from Containers to retrieve data for Hive jobs. For example, a Hive job may start Tez containers, which then retrieve data from LLAP running concurrently. In the current implementation, this is unrealistic (because every task can be ju

Re: Question on accessing LLAP as data cache from external containers

2018-01-29 Thread Jörn Franke
Are you looking for something like this: https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html To answer your original question: why not implement the whole job in Hive? Or orchestrate using Oozie some parts in MR and some in Hive. > On 30. Jan 2018, at

Re: question on setting up llap

2017-05-10 Thread Yi Cheng
I have installed Python 2.7.13. Still the same error. I have 3 nodes (a, b, c). When the SliderAppMaster is running on a, this works normally. But when the SliderAppMaster is started on b or c, this throws the error above. 2017-05-10 10:56 GMT-07:00 Gopal Vijayaraghavan : > > > for the

Re: question on setting up llap

2017-05-10 Thread Gopal Vijayaraghavan
> for the slider 0.92, the patch is already applied, right? Yes, except it has been refactored to a different place. https://github.com/apache/incubator-slider/blob/branches/branch-0.92/slider-agent/src/main/python/agent/NetUtil.py#L44 Cheers, Gopal

Re: question on setting up llap

2017-05-10 Thread Yi Cheng
I am going to upgrade Python from 2.6.6 to 2.7.13 for the slider 0.92, the patch is already applied, right? the patch file, it is following, adding some try blocks: diff --git slider-agent/src/main/python/agent/main.py slider-agent/src/main/python/agent/main.py index 1932a37..2671777 100644 --- s

Re: question on setting up llap

2017-05-09 Thread Gopal Vijayaraghavan
> NetUtil.py:60 - [Errno 8] _ssl.c:492: EOF occurred in violation of protocol The error is directly related to the SSL verification error - TLSv1.0 vs TLSv1.2. JDK8 defaults to v1.2 and Python 2.6 defaults to v1.0. Python 2.7.9 + the patch in 0.92 might be needed to get this to work. AFAIK, t

Re: question on setting up llap

2017-05-09 Thread Yi Cheng
the jira's error is [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581) it is a bit different, what i get is NetUtil.py:60 - [Errno 8] _ssl.c:492: EOF occurred in violation of protocol 2017-05-09 17:35 GMT-07:00 Gopal Vijayaraghavan : > > > ERROR 2017-05-09 22:04:56,469 N

Re: question on setting up llap

2017-05-09 Thread Gopal Vijayaraghavan
> ERROR 2017-05-09 22:04:56,469 NetUtil.py:62 - SSLError: Failed to connect. > Please check openssl library versions. … > I am using hive 2.1.0, slider 0.92.0, tez 0.8.5 AFAIK, this was reportedly fixed in 0.92. https://issues.apache.org/jira/browse/SLIDER-942 I'm not sure if the fix in that

Re: Question about partition pruning when there's a type mismatch

2016-11-29 Thread Anthony Hsu
Thanks for the tips, Gopal. I stepped through the code in a debugger and found that in the case of String = String, the predicate was pushed down to the SQL query on the metastore side, whereas in the case of String = Int, the SQL filter pushdown failed, so GenericUDFOPEqual gets evaluated and retu

Re: Question about partition pruning when there's a type mismatch

2016-11-28 Thread Gopal Vijayaraghavan
> I'm wondering why Hive tries to scan all partitions when the quotes are > omitted. Without the quotes, shouldn't 2016-11-28-00 get evaluated as an > arithmetic expression, then get cast to a string, and then partitioning > pruning still occur? The order of evaluation is different - String =

Re: Question : Is there a Automated Hive Database/Schema Designer

2016-07-22 Thread Umesh Prasad
Reposting .. Thanks & Regards Umesh Prasad On Thu, Jul 21, 2016 at 8:04 AM, Umesh Prasad wrote: > Hi All, >Does Hive have an Automated Database Designer or has anyone tried building it > ? Something which is equivalent to Vertica's DDB and Microsoft SQL > server's Automated Partitioning Design

Re: Question on Implementing CASE in Hive Join

2016-05-03 Thread Kishore A
> 18 101 CN Tax Y > 18 101 All Tax Y Smith > > 19 101 CA Tax Y > 19 101 All Tax Y Smith > > 20 101 USA

RE: Question on Implementing CASE in Hive Join

2016-04-27 Thread Kishore A
> > > > CROSS JOIN does not use ON (Hive lets you do that but it is not an SQL > standard and it’s actually an INNER JOIN). > > > > 6. > > CASE > > > > CASE is defined by ANSI/ISO and works in Hive the same way it works in HQL > the same way it work

Re: Question on Implementing CASE in Hive Join

2016-04-27 Thread Kishore A
a > > > > *where* a.*type* = b.*type* > > *and* a.code like *case* b.code *when* 'ALL' *then* '%' > *else* b.code *end* > > *and* a.indicator like *case* b.indicator *when* 'ALL' *then* '%'
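
The LIKE-with-CASE predicate in the fragment above treats 'ALL' on the b side as a wildcard. Below is a minimal sketch of that idea, run on SQLite through Python's sqlite3 module rather than Hive (an assumption). The table layouts, the `value` labels and every row are hypothetical and chosen only to show which pairs match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modelled loosely on the columns named in the thread.
conn.execute("CREATE TABLE a (type TEXT, code TEXT, indicator TEXT)")
conn.execute("CREATE TABLE b (type TEXT, code TEXT, indicator TEXT, value TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?)", [
    ("Tax", "CN", "Y"),
    ("Tax", "CA", "N"),
])
conn.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", [
    ("Tax", "CN", "Y", "exact"),            # matches only code CN, indicator Y
    ("Tax", "ALL", "Y", "any-code"),        # 'ALL' code becomes the pattern '%'
    ("Tax", "CA", "ALL", "any-indicator"),  # 'ALL' indicator becomes '%'
    ("Tax", "USA", "ALL", "no-match"),      # no row in a has code USA
])

query = """
SELECT a.code, a.indicator, b.value
FROM a, b
WHERE a.type = b.type
  AND a.code LIKE CASE b.code WHEN 'ALL' THEN '%' ELSE b.code END
  AND a.indicator LIKE CASE b.indicator WHEN 'ALL' THEN '%' ELSE b.indicator END
ORDER BY a.code, b.value
"""
matches = conn.execute(query).fetchall()
for row in matches:
    print(row)
```

One caveat with this pattern: a NULL in a.code or a.indicator does not match '%', because LIKE against NULL yields NULL rather than true.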

RE: Question on Implementing CASE in Hive Join

2016-04-20 Thread Markovitz, Dudu
b.indicator end ; Dudu From: Kishore A [mailto:kishore.atmak...@gmail.com] Sent: Wednesday, April 20, 2016 5:04 PM To: user@hive.apache.org Subject: Re: Question on Implementing CASE in Hive Join Hi Dudu, Thank you for sending queries around this. I have run these queries and below are the

Re: Question on Implementing CASE in Hive Join

2016-04-20 Thread Kishore A
= b.code > > > > *where* b.code != 'ALL' > > *and* b.indicatior = 'ALL' > > > > *union* *all* > > > > *select* b.code > >,b.*value* > > > > *from*b > >

RE: Question on Implementing CASE in Hive Join

2016-04-19 Thread Markovitz, Dudu
' union all select b.code, b.value from b left join a on a.type = b.type where b.code = 'ALL' and b.indicatior = 'ALL' ; From: Kishore A [mailto:kishore.atmak...@gmail.com] Se

Re: Question on Implementing CASE in Hive Join

2016-04-19 Thread Kishore A
Hi Dudu, Actually we use both fields from left and right tables, I mentioned right table just for my convenience to check whether ALL from right table can be pulled as per join condition match. One more reason why we use left join is we should not have extra columns after join. Kishore On Tue

RE: Question on Implementing CASE in Hive Join

2016-04-19 Thread Markovitz, Dudu
Before dealing with the technical aspect, can you please explain what is the point of using LEFT JOIN without selecting any field from table A? Thanks Dudu From: Kishore A [mailto:kishore.atmak...@gmail.com] Sent: Tuesday, April 19, 2016 2:29 PM To: user@hive.apache.org Subject: Question on Imp

Re: Question about hive-jdbc

2015-10-21 Thread Alan Gates
The way to keep track of when things are getting done in Hive is to check the JIRA, https://issues.apache.org/jira/browse/HIVE I'm not aware of anyone working on those issues at the moment, but a search of the JIRA will tell you if anyone has filed a bug on it. Alan. Hafiz Mujadid

Re: Question about PredicateTransitivePropagate

2015-08-19 Thread 孙若曦
To: "user"; "dev"; Cc: "孙若曦"; Subject: Re: Question about PredicateTransitivePropagate >select * from t1 join t2 on t1.col = t2.col where t1.col = 1; > Is rule PredicateTransitivePropagate supposed to propagate predicate >"t1.col = 1" to t2 via

Re: Question about PredicateTransitivePropagate

2015-08-18 Thread Gopal Vijayaraghavan
>select * from t1 join t2 on t1.col = t2.col where t1.col = 1; > Is rule PredicateTransitivePropagate supposed to propagate predicate >"t1.col = 1" to t2 via join condition t1.col = t2.col? > Assuming so, I found that the predicate "t1.col = 1" has not been pushed >down to table scan of t1, thus Pr

Re: Question about bushy join in hive CBO

2015-05-11 Thread Ruoxi Sun
Thank you, Ashutosh. That's very informative. I appreciate that! *Rossi* 2015-05-12 9:08 GMT+08:00 Ashutosh Chauhan : > Hi Rossi, > > Historically, we used LoptOptimizeJoinRule of Calcite to do join > reordering. This does a greedy search on join order search space to find a > join order which

Re: Question about bushy join in hive CBO

2015-05-11 Thread Ashutosh Chauhan
Hi Rossi, Historically, we used LoptOptimizeJoinRule of Calcite to do join reordering. This does a greedy search on join order search space to find a join order which is at least as good as original join order of query. Goodness being in terms of estimated cost and not globally optimal because of gr

Re: Question on MAPJOIN Vs JOIN performance

2015-04-22 Thread Harsha HN
Hi, Thanks for your reply. I will go through the link. By the way my hive version is 0.12 Thanks, Harsha On Fri, Apr 17, 2015 at 4:16 AM, Lefty Leverenz wrote: > Harsha, that document is from 2010. What version of Hive are you using? > > Here's some up-to-date information in the Hive wiki: J

Re: Question on MAPJOIN Vs JOIN performance

2015-04-16 Thread Lefty Leverenz
Harsha, that document is from 2010. What version of Hive are you using? Here's some up-to-date information in the Hive wiki: Join Optimization. -- Lefty On Thu, Apr 16, 2015 at 2:38 AM, Harsha HN <99harsha.h..

RE: question on create database

2015-04-02 Thread Mich Talebzadeh
[mailto:leftylever...@gmail.com] Sent: 02 April 2015 21:56 To: user@hive.apache.org Subject: Re: question on create database Could you use SQL standards based authorization to deny CREATE TABLE privileges to everybody except the database owner, and then make people ask the owner to create

Re: question on create database

2015-04-02 Thread Lefty Leverenz
Could you use SQL standards based authorization to deny CREATE TABLE privileges to everybody except the database owner, and then make people ask the owner to create tables for them? -- Lefty On Thu, Apr 2, 2015 at 4:44 PM, Chen Song wrote: > Got it. Thanks. > > On Thu, Apr 2, 2015 at 11:29 AM,

Re: question on create database

2015-04-02 Thread Chen Song
Got it. Thanks. On Thu, Apr 2, 2015 at 11:29 AM, Alan Gates wrote: > When someone creates a table in your 'abc' database it should by default > be in '/my/preferred/directory/_tablename_'. However, users can specify > locations for their tables which may not be in that directory. AFAIK > there

Re: question on create database

2015-04-02 Thread Alan Gates
When someone creates a table in your 'abc' database it should by default be in '/my/preferred/directory/_tablename_'. However, users can specify locations for their tables which may not be in that directory. AFAIK there's no way to prevent that. Alan. Chen Song

Re: Question on varchar in 0.12.0 version of nive

2014-12-15 Thread Jason Dere
Varchar was only added to Hive in 0.12, before that there was only string if you wanted to deal with string types. Varchar will enforce the max character length, truncating the string value if necessary. We've tried to make it as compatible with string as possible. One thing about varchars, if y

Re: Question on varchar in 0.12.0 version of nive

2014-12-15 Thread Gayathri Swaroop
> > I am a newbie to hive. I am trying to query an oracle table in hive. The > data is in hadoop and i have created similar varchar columns in my external > table. Would like to know what are the limitations of varchar over string. > And why people prefer string over varchar. > > Thanks, > Gayathri

Re: Question

2014-12-06 Thread Gabriel Eisbruch
e issues >>>> both performance and semantically . Hence, there is a whole world of NoSQL >>>> databases out there that have been developed that are not row-column >>>> structured. These databases can handle more schema-less/unstructured >>>> objects a

Re: Question

2014-12-05 Thread Moore, Douglas
3 Dec 2014 20:59:46 -0500 To: "user@hive.apache.org" <user@hive.apache.org> Subject: RE: Question MapReduce can be used for both structured and unstructured data. Hive is a storage and retrieval mechanism (e.g. database). The trouble

Re: Question

2014-12-03 Thread Mohan Krishna
oquently manipulate your information. >>> I would check out the Wikipedia page on NoSQL databases and focus on >>> Key - Value, Columnar, or Document databases. >>> >>> -- >>> Date: Thu, 4 Dec 2014 07:06:16 +0530 >>> Subject: Re: Question >>> From

Re: Question

2014-12-03 Thread Gabriel Eisbruch
objects and will allow you to more eloquently manipulate your information. >> I would check out the Wikipedia page on NoSQL databases and focus on >> Key - Value, Columnar, or Document databases. >> >> -- >> Date: Thu, 4 Dec 2

Re: Question

2014-12-03 Thread Mohan Krishna
- > Date: Thu, 4 Dec 2014 07:06:16 +0530 > Subject: Re: Question > From: mohan.25fe...@gmail.com > To: user@hive.apache.org > > > Thanks Gabriel for the prompt response > > I see in online blogs saying MapReduce for Unstructured Data , Pig for > Semi St

RE: Question

2014-12-03 Thread Bill Busch
check out the Wikipedia page on NoSQL databases and focus on Key - Value, Columnar, or Document databases. Date: Thu, 4 Dec 2014 07:06:16 +0530 Subject: Re: Question From: mohan.25fe...@gmail.com To: user@hive.apache.org Thanks Gabriel for the prompt response I see in online blogs saying

Re: Question

2014-12-03 Thread Mohan Krishna
Thanks Gabriel for the prompt response. I see online blogs saying MapReduce is for Unstructured Data, Pig for Semi Structured Data and Hive is only for Structured Data. Can you please justify this? Thanks in advance On Thu, Dec 4, 2014 at 6:56 AM, Gabriel Eisbruch wrote: > Hi Mohan, >W

Re: Question

2014-12-03 Thread Gabriel Eisbruch
Hi Mohan, We are using Hive for unstructured (or semi-structured) data using map columns; for example, we use standard columns for fixed data and map columns for dynamic data. Gabriel. 2014-12-03 22:19 GMT-03:00 Mohan Krishna : > Hive is for only structured data or it handles Unstructured d

Re: Question

2014-12-03 Thread Mohan Krishna
Hive is for only structured data or it handles Unstructured data as well ? On Thu, Dec 4, 2014 at 6:49 AM, Mohan Krishna wrote: > Hive is for only structured data or it handles Unstructured data as well ? >

Re: question on HIVE-5891

2014-11-18 Thread Chen Song
I haven't found a workaround yet. On Thu, Nov 13, 2014 at 11:25 AM, Stéphane Verlet wrote: > Chen > > Did you find a workaround ? Anybody else have a suggestion ? > > Thank you > > Stephane > > On Mon, Aug 4, 2014 at 9:00 AM, Chen Song wrote: > >> I am using cdh5 distribution and It doesn't lo

Re: question on HIVE-5891

2014-11-13 Thread Stéphane Verlet
Chen Did you find a workaround ? Anybody else have a suggestion ? Thank you Stephane On Mon, Aug 4, 2014 at 9:00 AM, Chen Song wrote: > I am using cdh5 distribution and It doesn't look like this jira > > https://issues.apache.org/jira/browse/HIVE-5891 > > is backported into cdh 5.1.0. > > Is

Re: Question re: Concurrency

2014-10-14 Thread Bing Jiang
conf/hive-default.xml.template: hive.support.concurrency conf/hive-default.xml.template- false Hive does not support this feature as default. But you can enable it thru some settings. 2014-10-14 10:38 GMT+08:00 Time Less : > I am looking at this page: > https://cwiki.apache.org/confluence/di

Re: question about hive sql

2014-04-21 Thread Shengjun Xin
You need to check the container log for the details On Tue, Apr 22, 2014 at 10:27 AM, EdwardKing wrote: > I use hive under hadoop 2.2.0, first I start hive > [hadoop@master sbin]$ hive > 14/04/21 19:06:32 INFO Configuration.deprecation: > mapred.input.dir.recursive is deprecated. Instead, use

Re: Question about running Hive on Tez

2013-12-18 Thread Lefty Leverenz
Should hive-on-tez-conf.txt be added to the wiki, or is it not soup yet? -- Lefty On Mon, Dec 16, 2013 at 10:25 AM, Cheolsoo Park wrote: > Closing the loop. We identified the issue with help from the Tez team. It > was mis-configured mapreduce.reduce.cpu.vcores that caused problems. > > If anyo

Re: Question about running Hive on Tez

2013-12-16 Thread Cheolsoo Park
Closing the loop. We identified the issue with help from the Tez team. It was mis-configured mapreduce.reduce.cpu.vcores that caused problems. If anyone who tries Hive on Tez with EMR Hadoop and sees that reducers are stuck, this

Re: Question about Hive on Tez

2013-12-14 Thread Cynthia Smith
looks good to me i saw no issues On Wednesday, December 11, 2013 6:40 PM, Zhenxiao Luo wrote: Excuse me. May I ask a question about Hive on Tez? We just started evaluating Hive on Tez.  Would like to know, is Hive on Tez development done yet? Is there any documentation we could reference to

Re: Question about running Hive on Tez

2013-12-13 Thread Gunther Hagleitner
dev on bcc Zhenxiao, Cool you got it set up. The query runs a full order by before the limit - are you sure it's not just still running? Hive on Tez prints "total tasks/completed tasks", so no update may mean none of the reduce tasks have finished yet. If not, it'd be great to see the yarn logs

Re: Question on correlation optimizer

2013-12-10 Thread Avrilia Floratou
Hi Yin, Thanks for the detailed explanation. I have one more question for the correlation optimizer. When I ran explain in query 17 I get the plan for stage 1 where the bulk of the time goes. I can understand what is happening in the map phase but the reduce phase confuses me when the optimizer ki

Re: Question on correlation optimizer

2013-12-10 Thread Yin Huai
Hi Avrilia, It is caused by distinct aggregations in TPC-H Q21. Because Hive adds those distinct columns in the key columns of ReduceSinkOperators and correlation optimizer only check exact same key columns right now, this query will not be optimized. The jira of this issue is https://issues.apach

Re: Question on LATERAL VIEW explode

2013-09-28 Thread Rajesh Nagaraju
Hi Gary I would say write an UDF Thanks and Regards Rajesh On Sun, Sep 29, 2013 at 5:16 AM, Gary Zhao wrote: > Hello > > Do you know if my scenario is supported or not, or how I may do it. > Basically, I have an array in a recode, like > > [1, 2, 3] > > What I want is to explode it twice so I

Re: question about partition table in hive

2013-09-13 Thread Jagat Singh
Adding to Sanjay's reply The only thing left after flume has added partitions is to tell hive metastore to update partition information. which you can do via add partition command Then you can read data via hive straight away. On Sat, Sep 14, 2013 at 10:00 AM, Sanjay Subramanian < sanjay.subr

Re: question about partition table in hive

2013-09-13 Thread Sanjay Subramanian
A couple of days back, Erik Sammer at the Hadoop Hands On Lab at the Cloudera Sessions demonstrated how to achieve dynamic partitioning using Flume and created those partitioned directories on HDFS which are then readily usable by Hive Understanding what I can from the two lines of your mail be

Re: question about partition table in hive

2013-09-13 Thread Dean Wampler
Flume might be able to invoke Hive to do this as the data is ingested, but I don't know anything about Flume. Brent has a nice blog post describing many of the details of partitioning. http://www.brentozar.com/archive/2013/03/introduction-to-hive-partitioning/ We also cover them in our book. The

Re: question about partition table in hive

2013-09-13 Thread Stephen Sprague
and have you done any analysis on this yet using the Hive documentation that's publicly available? if you show some initiative yourself you're more likely to get others to join your cause. :) So what have you tried before asking us for help? On Thu, Sep 12, 2013 at 6:55 PM, ch huang wrote: >

Re: question about partition table in hive

2013-09-13 Thread Nitin Pawar
You will need to define a partition column like date or hour, something like this. Then configure Flume to roll over files/directories based on your partition column. You will need some kind of cron which will check for the new data being available in a directory or file and then add it as partitio

Re: Question for ORCFileFormat

2013-09-12 Thread Thejas Nair
If you have hive metastore setup, using hcatalog is easy, you just need the jars and hive-site.xml directory in the classpath. Then you can use the hcat input/output formats in your map-reduce program - http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html On Wed, Sep 11, 2013 at 4:35 PM, Sapt

Re: Question for ORCFileFormat

2013-09-11 Thread Saptarshi Guha
Hi, Thanks, but assuming i can't use HCatalog, or integrating it is difficult, is there an example of using ORC as an outputformat in a mapreduce job? Regards Saptarshi On Wed, Sep 11, 2013 at 1:36 PM, Owen O'Malley wrote: > The easiest way to use it is to use HCatalog, which enables you to r

Re: Question for ORCFileFormat

2013-09-11 Thread Owen O'Malley
The easiest way to use it is to use HCatalog, which enables you to read or write ORC files from MapReduce or Pig. -- Owen On Mon, Sep 9, 2013 at 11:14 AM, Saptarshi Guha wrote: > Hello, > > Are there any examples of writing using ORC as aFileOutputFormat (and then > as a FileInputFormat) in Map

Re: question about get part result from hive

2013-08-19 Thread Stephen Sprague
Maybe its too obvious but there is the "limit" keyword as well. On Sun, Aug 18, 2013 at 9:43 PM, Nitin Pawar wrote: > Hive does support sampling. Please look at > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling > > We normally partition data based on few columns so we av

Re: question about hive SQL

2013-08-19 Thread Sanjay Subramanian
Here is my stab at it. I have not tested it but this should get you started. The following points are important 1. I added a WHERE clause in the sub query to limit the data set by any partition you may have 2. You have to write a collect UDF to use it. Wampler/Capriolo's book in Chapter 13.Functions - r

Re: question about get part result from hive

2013-08-18 Thread Nitin Pawar
Hive does support sampling. Please look at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling We normally partition data based on few columns so we avoid these situations On Mon, Aug 19, 2013 at 8:46 AM, ch huang wrote: > hi,all: > i have a problem ,i have a hive ta

Re: Question regarding external table and csv in NFS

2013-07-19 Thread Mainak Ghosh
Hey Everybody, This problem still did not work. Any advice? Thanks and Regards, Mainak. From: Mainak Ghosh/Almaden/IBM@IBMUS To: user@hive.apache.org, Date: 07/17/2013 02:12 PM Subject:Re: Question regarding external table and csv in NFS Hey Saurabh, I tried this command

Re: Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh
rabh M To: user@hive.apache.org, Date: 07/17/2013 02:06 PM Subject:Re: Question regarding external table and csv in NFS Hi Mainak, Can you try using this:  create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTB

Re: Question regarding external table and csv in NFS

2013-07-17 Thread Saurabh M
Hi Mainak, Can you try using this: create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE LOCATION 'file:///mnt/h/tpc-h-impala/da

Re: Question regarding nested complex data type

2013-06-21 Thread Dean Wampler
;) I actually thought it was a clever choice on Hive's part. There's no real need for the 2nd tier separators, despite the nested collections! However, it's still tricky to know what Hive expects when you're generating table data with other apps. dean On Thu, Jun 20, 2013 at 9:34 PM, Stephen Spr

Re: Question regarding nested complex data type

2013-06-20 Thread Stephen Sprague
look at it the other around if you want. knowing an array of a two element struct is topologically the same as a map - they darn well better be the same. :) On Thu, Jun 20, 2013 at 7:00 PM, Dean Wampler wrote: > It's not as "simple" as it seems, as I discovered yesterday, to my > surprise. I

Re: Question regarding nested complex data type

2013-06-20 Thread Dean Wampler
It's not as "simple" as it seems, as I discovered yesterday, to my surprise. I created a table like this: CREATE TABLE t ( name STRING, stuff ARRAY>); I then used an insert statement to see how Hive would store the records, so I could populate the real table with another process. Hive used

Re: Question regarding nested complex data type

2013-06-20 Thread Stephen Sprague
you only get three. field separator, array elements separator (aka collection delimiter), and map key/value separator (aka map key delimiter). when you nest deeper then you gotta use the default '^D', '^E' etc for each level. At least that's been my experience which i've found has worked succes

Re: Question regarding nested complex data type

2013-06-20 Thread neha
Thanks a lot for your reply, Stephen. To answer your question - I was not aware of the fact that we could use delimiter (in my example, '|') for first level of nesting. I tried now and it worked fine. My next question - Is there any way to provide delimiter in DDL for second level of nesting? Than

Re: Question regarding nested complex data type

2013-06-20 Thread Stephen Sprague
its all there in the documentation under "create table" and it seems you got everything right too except one little thing - in your second example there for 'sample data loaded' - instead of '^B' change that to '|' and you should be good. That's the delimiter that separates your two array elements

Re: Question about weekofyear(string date)

2013-06-13 Thread Robert Li
Thanks! On Thu, Jun 13, 2013 at 6:30 PM, Darren Yin wrote: > It's all right here: monday to monday and has a concept of the first > "full" week too. > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFWeekOfYear.java > > > On Thu, Jun 13, 2013 at 2:44 PM, Ro

Re: Question about weekofyear(string date)

2013-06-13 Thread Darren Yin
It's all right here: monday to monday and has a concept of the first "full" week too. https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFWeekOfYear.java On Thu, Jun 13, 2013 at 2:44 PM, Robert Li wrote: > Hi All > > For this UDF, does it consider the week to
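
To get a feel for Monday-based week numbering, here is a minimal Python sketch. It assumes the UDF follows ISO 8601 week numbering (weeks start on Monday and week 1 is the first week with at least four days in the new year); that is an interpretation of the description above, not something the snippet confirms, so check the linked source before relying on it. The helper name `week_of_year` is hypothetical.

```python
from datetime import date


def week_of_year(day: date) -> int:
    """Return the ISO 8601 week number of `day`.

    Assumption: this mirrors Hive's weekofyear(); ISO weeks start on
    Monday and week 1 is the week containing the year's first Thursday.
    """
    return day.isocalendar()[1]


# A mid-year date, plus dates near year boundaries where the week
# can belong to the previous year's numbering.
samples = [date(2013, 6, 13), date(2013, 1, 1), date(2012, 1, 1), date(2016, 1, 1)]
weeks = [week_of_year(d) for d in samples]
print(weeks)
```

Dates in early January can fall in week 52 or 53 of the previous year, which is the usual surprise with this numbering scheme.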

RE: Question about how to add the debug info into the hive core jar

2013-03-20 Thread java8964 java8964
length not same. But I have problem to make my new jar to be loaded by hadoop. 2) Enable remote debug. There is very limited example on the internet about how to enable the hive server side MR jobs remote debug, even some wiki pages claim it is doable, but without concrete examples. Thanks F

Re: Question about how to add the debug info into the hive core jar

2013-03-20 Thread Abdelrhman Shettia
Hi Yong, Have you tried running the H query in debug mode. Hive log level can be changed by passing the following conf while hive client is running. #hive -hiveconf hive.root.logger=ALL,console -e " DDL statement ;" #hive -hiveconf hive.root.logger=ALL,console -f ddl.sql ; Hope this helps

Re: question about machine learning on Hive

2013-01-17 Thread Robin Morris
.com From: Igor Tatarinov <i...@decide.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Thursday, January 17, 2013 1:29 PM To: "user@hive.apache.org" <user

Re: question about machine learning on Hive

2013-01-17 Thread Igor Tatarinov
Here is how Twitter does it with Pig: http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf We use a similar approach and I think that Pig, being somewhat lower-level with better support of nested objects, is a better tool than Hive. It should be possible to do something simila

RE: question on output hive table to file

2012-09-05 Thread Tony Burton
a zhang [mailto:zuo...@gmail.com] Sent: 07 August 2012 05:58 To: user@hive.apache.org Subject: Re: question on output hive table to file Thanks so much! that did work. I have 200+ columns so it is quite an ugly thing. No shortcut? On Mon, Aug 6, 2012 at 9:50 PM, Vinod Singh mailto:vi...

RE: Question about query result storage

2012-08-09 Thread Venkatesh Kavuluri
-0400 > From: pipeha...@gmail.com > To: user@hive.apache.org > Subject: Re: Question about query result storage > > Oh, actually is > hive -S -f some_query.q > some_query.log > > On 08/09/2012 05:41 PM, Yue Guan wrote: > > We always do something like this: > &
