Re: Which [open-source] SQL engine atop Hadoop?

2015-02-02 Thread Devopam Mittra
Hi Samuel, you may wish to evaluate Presto (https://prestodb.io/), which has the added advantage of being faster than conventional Hive because no MR jobs are fired. It has a dependency on the Hive metastore, though, through which it derives the mechanism to execute queries directly on source file
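A minimal sketch of what such a query might look like from Presto, assuming a Hive catalog named `hive` and a table `web_logs` (both names are illustrative, not from the thread):

```sql
-- Presto reads the table layout from the Hive metastore and scans the
-- underlying files directly, without launching any MapReduce jobs.
SELECT status, count(*) AS hits
FROM hive.default.web_logs     -- catalog.schema.table; names are hypothetical
GROUP BY status
ORDER BY hits DESC
LIMIT 10;
```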

Re: Which [open-source] SQL engine atop Hadoop?

2015-02-02 Thread Samuel Marks
Alexander: So would you recommend using Phoenix for all but those kinds of queries, and switching to Hive+Tez for the rest? Is that feasible? Checking their documentation, it looks like it just might be: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration There is some early work o
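For reference, the integration described on that wiki page maps an HBase table into Hive via the HBase storage handler; a hedged sketch, with all table and column names hypothetical:

```sql
-- Expose an existing HBase table to Hive queries (names are illustrative).
CREATE EXTERNAL TABLE hbase_events (rowkey string, val string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "events");
```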

Re: Which [open-source] SQL engine atop Hadoop?

2015-02-02 Thread Saurabh B
This is not open source, but we are using Vertica and it works very nicely for us. There is a 1 TB community edition, but above that it costs money. It has really advanced SQL (analytical functions, etc.), works like an RDBMS, has R/Java/C++ SDKs, and scales nicely. There is a similar option of Redshift

Re: Which [open-source] SQL engine atop Hadoop?

2015-02-02 Thread Alexander Pivovarov
Apache Phoenix is super fast for queries which filter data by table key: sub-second latency, and it has a good JDBC driver. But it has limitations: no full outer join support, and inner and left outer joins use one computer's memory, so it cannot join a huge table to a huge table. On Mon, Feb 2, 2015 at 6:59 PM,
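The kind of key-filtered query Phoenix serves at sub-second latency might look like the following sketch (table, columns, and values are all hypothetical):

```sql
-- Filtering on the leading primary-key column lets Phoenix do an HBase
-- range scan instead of a full table scan (names are illustrative).
SELECT event_time, metric
FROM metrics
WHERE host = 'web01'                       -- leading key column
  AND event_time > TO_DATE('2015-02-01');  -- key-range restriction
```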

Re: Which [open-source] SQL engine atop Hadoop?

2015-02-02 Thread Alexander Pivovarov
I like the Tez engine for Hive (aka the Stinger initiative): faster than the MR engine, especially for complex queries with lots of nested sub-queries; stable; minimum latency is 5-7 sec (0 sec for select count(*) ...); capable of processing huge datasets (not limited by RAM as Spark is). On Mon, Feb 2, 2015 at 6
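Switching a Hive session from MR to Tez is a one-line setting; a minimal sketch (the table name is illustrative):

```sql
-- Run this session's queries on Tez instead of classic MapReduce.
SET hive.execution.engine=tez;

-- Subsequent queries then use Tez, e.g.:
SELECT count(*) FROM some_table;   -- hypothetical table
```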

Re: Which [open-source] SQL engine atop Hadoop?

2015-02-02 Thread Samuel Marks
Maybe you're right, and what I should be doing is throwing in connectors so that data from regular databases is pushed into HDFS at regular intervals, where my "fancier" analytics can be run across larger data-sets. However, I don't want to decide straightaway; for example, Phoenix + Spark may b

Parquet support for Timestamp in 0.14

2015-02-02 Thread Yang
The Parquet spec about logical types, and Timestamp specifically, seems to say (https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md): "TIMESTAMP_MILLIS is used for a combined logical date and time type. It must annotate an int64 that stores the number of milliseconds from the Unix epoch
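The encoding the spec describes is plain arithmetic; a small sketch of converting between a datetime and the int64 millisecond count (function names here are my own, not part of any Parquet library):

```python
from datetime import datetime, timezone

def to_timestamp_millis(dt: datetime) -> int:
    """Encode a datetime as Parquet TIMESTAMP_MILLIS:
    int64 milliseconds since the Unix epoch (UTC)."""
    return int(dt.timestamp() * 1000)

def from_timestamp_millis(millis: int) -> datetime:
    """Decode a TIMESTAMP_MILLIS value back into a UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

# Midnight on 2015-02-02 UTC encodes as 1422835200000.
print(to_timestamp_millis(datetime(2015, 2, 2, tzinfo=timezone.utc)))
```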

Re: Hive wiki write access

2015-02-02 Thread Alexander Pivovarov
Thank you, Lefty! On Mon, Feb 2, 2015 at 3:59 PM, Lefty Leverenz wrote: > Done. Welcome to the Hive wiki team, Alexander! > > -- Lefty > > On Mon, Feb 2, 2015 at 2:14 PM, Alexander Pivovarov > wrote: > >> Hi Everyone >> >> Can I get write access to hive wiki? >> I need to put descriptions for

Re: Hive wiki write access

2015-02-02 Thread Lefty Leverenz
Done. Welcome to the Hive wiki team, Alexander! -- Lefty On Mon, Feb 2, 2015 at 2:14 PM, Alexander Pivovarov wrote: > Hi Everyone > > Can I get write access to hive wiki? > I need to put descriptions for several UDFs added recently (init_cap, > add_months, last_day, greatest, least) > > Conflu

Hive wiki write access

2015-02-02 Thread Alexander Pivovarov
Hi everyone, Can I get write access to the Hive wiki? I need to put descriptions for several UDFs added recently (init_cap, add_months, last_day, greatest, least). Confluence username: apivovarov https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
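For context, a sketch of how the UDFs named above might be called (the exact spellings and argument forms here are assumptions — the wiki's, e.g. `initcap`, may differ from the email's `init_cap`):

```sql
-- Illustrative calls to the newly added UDFs; outputs not guaranteed.
SELECT initcap('hello world'),       -- capitalize each word
       add_months('2015-02-02', 1),  -- shift a date by one month
       last_day('2015-02-02'),       -- last day of the month
       greatest(3, 1, 2),            -- largest of the arguments
       least(3, 1, 2);               -- smallest of the arguments
```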

Re: Union all with a field 'hard coded'

2015-02-02 Thread DU DU
This is a part of standard SQL syntax, isn't it? On Mon, Feb 2, 2015 at 2:22 PM, Xuefu Zhang wrote: > Yes, I think it would be great if this can be documented. > > --Xuefu > > On Sun, Feb 1, 2015 at 6:34 PM, Lefty Leverenz > wrote: > >> Xuefu, should this be documented in the Union wikidoc >> <
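The pattern under discussion in this thread — tagging each branch of a UNION ALL with a hard-coded literal — might look like this sketch (tables and columns are hypothetical):

```sql
-- Each branch contributes a constant column identifying its source.
SELECT id, amount, 'online' AS channel FROM online_orders
UNION ALL
SELECT id, amount, 'retail' AS channel FROM retail_orders;
```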

Re: Union all with a field 'hard coded'

2015-02-02 Thread Xuefu Zhang
Yes, I think it would be great if this can be documented. --Xuefu On Sun, Feb 1, 2015 at 6:34 PM, Lefty Leverenz wrote: > Xuefu, should this be documented in the Union wikidoc > ? > > Is it relevant for other query clauses?

Re: UDF using third party jar

2015-02-02 Thread Guy Doulberg
I will check that. Sent from my Samsung Galaxy smartphone. Original message From: Lukas Nalezenec Date: 02/02/2015 6:06 PM (GMT+02:00) To: user@hive.apache.org Cc: Subject: Re: UDF using third party jar Hi, it looks like jar version conflict - you might have two versions of jar conta

ALTER TABLE xyz RECOVER PARTITIONS not working ...

2015-02-02 Thread Joshua Eldridge
I'm hoping someone else has had this problem. I tried searching but couldn't find anything ... I'm running Hive version 0.13.1 on EMR. I have data files staged externally in S3 in the following folder structure: s3://root-bucket/sourcesystem/source=name/y=2015/m=01/d=XX I have created the table in
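A sketch of the setup being described, with partition columns mirroring the S3 folder names (the table definition and column names are assumptions, not from the thread):

```sql
-- External table whose partition columns match the S3 layout.
CREATE EXTERNAL TABLE sourcesystem (col1 string)   -- columns are hypothetical
PARTITIONED BY (source string, y string, m string, d string)
LOCATION 's3://root-bucket/sourcesystem/';

-- EMR's Hive registers all partitions found under LOCATION with:
ALTER TABLE sourcesystem RECOVER PARTITIONS;
-- Stock Apache Hive uses MSCK REPAIR TABLE sourcesystem; instead.
```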

Re: UDF using third party jar

2015-02-02 Thread Lukas Nalezenec
Hi, it looks like a jar version conflict - you might have two versions of a jar containing the class "net.sf.uadetector.UserAgentStringParser" on your classpath. Lukas On 2.2.2015 16:51, Guy Doulberg wrote: Hi I am migrating my hive from CDH4 to CDH5(1.2), I had a udf that extracts the useragent v

UDF using third party jar

2015-02-02 Thread Guy Doulberg
Hi, I am migrating my Hive from CDH4 to CDH5 (1.2). I had a UDF that extracts the user-agent values out of the user-agent string, using the third-party uadetector library. This UDF worked well in CDH4. After the migration it stopped working because of: Exception running child : java.lang
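One way to rule out a stale jar on the cluster classpath is to register the UDF jar explicitly per session; a hedged sketch, with the path, function, and class names all hypothetical:

```sql
-- Register the bundled jar and bind the UDF for this session only.
ADD JAR /path/to/my-udf-with-dependencies.jar;     -- hypothetical path
CREATE TEMPORARY FUNCTION parse_useragent
  AS 'com.example.udf.UserAgentUDF';               -- hypothetical class
SELECT parse_useragent(useragent) FROM weblogs LIMIT 5;
```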

Re: Simple way to export data from a Hive table in to Avro?

2015-02-02 Thread Daniel Haviv
I might be missing something here, but you could use: Create table newtable stored as avro as select * from oldtable On Mon, Feb 2, 2015 at 3:09 PM, Michael Segel wrote: > Currently using Hive 13.x > > Would like to select from a table that exists and output to an external > file(s) in avro via h
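Spelled out, that CTAS approach might look like the following sketch (table names are illustrative; note that, as far as I know, the STORED AS AVRO shorthand arrived around Hive 0.14, so on the 0.13 mentioned in this thread the AvroSerDe and input/output formats may need to be named explicitly):

```sql
-- Write oldtable's rows out as Avro files (names are hypothetical).
CREATE TABLE newtable STORED AS AVRO
AS SELECT * FROM oldtable;
-- The resulting Avro files land under the table's warehouse directory
-- and can be copied out, e.g. with hdfs dfs -get.
```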

Simple way to export data from a Hive table in to Avro?

2015-02-02 Thread Michael Segel
Currently using Hive 13.x. Would like to select from a table that exists and output to external file(s) in Avro via Hive. Is there a simple way to do this? From what I’ve seen online, the docs tend to imply you need to know the Avro schema when you specify the table. Could you copy from a