Re: Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Samuel Marks
Thanks for the advice, Koert: when everything is in the same essential data-store (HDFS), can't I just run whatever complex tools in whichever paradigm they like? E.g.: GraphX, Mahout, etc. Also, what about Tajo or Drill? Best, Samuel Marks http://linkedin.com/in/samuelmarks PS: Spark-SQL is

Re: Union all with a field 'hard coded'

2015-01-30 Thread Xuefu Zhang
Use column alias: INSERT OVERWRITE TABLE all_dictionaries_ext SELECT name, id, category FROM dictionary UNION ALL SELECT NAME, ID, "CAMPAIGN" as category FROM md_campaigns On Fri, Jan 30, 2015 at 1:41 PM, Philippe Kernévez wrote: > Hi all, > > I would like to do union all with a fiel
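The aliasing advice in the reply above can be tried out with a minimal sketch. It assumes SQLite (via Python's sqlite3) as a stand-in for Hive, uses a plain INSERT INTO because INSERT OVERWRITE is Hive-specific, and fills the tables with made-up rows; only the table and column names come from the thread.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dictionary (name TEXT, id INTEGER, category TEXT);
    CREATE TABLE md_campaigns (name TEXT, id INTEGER);
    CREATE TABLE all_dictionaries_ext (name TEXT, id INTEGER, category TEXT);
    INSERT INTO dictionary VALUES ('colour', 1, 'STYLE');
    INSERT INTO md_campaigns VALUES ('spring_sale', 7);
""")

# The hard-coded literal gets an explicit alias, so both branches of the
# UNION ALL expose the same column names (name, id, category).
conn.execute("""
    INSERT INTO all_dictionaries_ext
    SELECT name, id, category FROM dictionary
    UNION ALL
    SELECT name, id, 'CAMPAIGN' AS category FROM md_campaigns
""")

rows = conn.execute(
    "SELECT name, id, category FROM all_dictionaries_ext ORDER BY id"
).fetchall()
print(rows)
```

SQLite would accept the second branch without the alias as well, so this only illustrates the shape of the query, not the Hive behaviour that prompted the question.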

Union all with a field 'hard coded'

2015-01-30 Thread Philippe Kernévez
Hi all, I would like to do union all with a field that is hardcoded in the request. INSERT OVERWRITE TABLE all_dictionaries_ext SELECT name, id, category FROM dictionary UNION ALL SELECT NAME, ID, "CAMPAIGN" FROM md_campaigns Name type is String Id type is int Category type is strin

Re: Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Koert Kuipers
since you require high-powered analytics, and i assume you want to stay sane while doing so, you require the ability to "drop out of sql" when needed. so spark-sql and lingual would be my choices. low latency indicates phoenix or spark-sql to me. so i would say spark-sql On Fri, Jan 30, 2015 at

Re: Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Samuel Marks
HAWQ is pretty nifty due to its full SQL compliance (ANSI 92) and exposing both JDBC and ODBC interfaces. However, although Pivotal does open-source a lot of software, I don't believe they open source Pivotal HD: HAWQ. So that doesn't meet my requirements. I should note

Re: Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Siddharth Tiwari
Have you looked at HAWQ from Pivotal? Sent from my iPhone > On Jan 30, 2015, at 4:27 AM, Samuel Marks wrote: > > Since Hadoop came out, there have been various commercial and/or open-source > attempts to expose some compatibility with SQL. Obviously by posting here I > am not expecting an un

Re: Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Samuel Marks
Hi Uli, My use-case is two-fold: generic and "high-powered" analytics. There are various offerings which I can use that will push data back to HDFS at regular intervals. Even Apache Sqoop can do that. However I was thinking that it'd be better to keep everything in the

Re: Partitioned table and Bucket Map Join

2015-01-30 Thread matshyeq
Raised enhancement request: HIVE-9523 As for table join order - wasn't aware of that behaviour (always read the small print! ;). Not the same magnitude of possible improvement but still something. Thanks. ~Macek Subject:RE: COMMERCIAL:Re: Partitio

Which [open-souce] SQL engine atop Hadoop?

2015-01-30 Thread Samuel Marks
Since Hadoop came out, there have been various commercial and/or open-source attempts to expose some compatibility with SQL. Obviously by posting here I am not expecting an unbiased answer. Seeking an SQL-on-Hadoop offering which provides: low-la

RE: COMMERCIAL:Re: Partitioned table and Bucket Map Join

2015-01-30 Thread Matthew Dixon
Not sure if this is going to solve your problems, and I agree with your point about partition join optimisation, but if your query is indeed an inner join (and not A LEFT OUTER JOIN B) then you should arrange your tables in order from smallest to biggest. See this section on the Hive wiki: https: