On Tue, Jul 2, 2013 at 8:39 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> What is in a name? :)
>
> "Which SQL feature you are talking about here, that forces single reducer
> and hence should not be supported?"
>
> Joining on anything besides = comes to mind
>
> Pretty sure the query mentioned here will not work (without being
> re-written)
> http://en.wikipedia.org/wiki/SQL
>
> SELECT isbn, title, price
> FROM Book
> WHERE price < (SELECT AVG(price) FROM Book)
> ORDER BY title;

Don't you think Hive should be supporting this? Don't you think our users would want this?
You can do theta joins without using a single reducer (a cartesian product can be done in parallel). But that is beside the point. I don't expect Hive to be 100% SQL compliant. I don't see 100% SQL compliance as a goal, but I see more SQL compliance as desirable. That is why I prefer the term Hive-SQL.

> Hive-SQL looks like it is trying to convey the idea that hive supports
> extensions like T-SQL http://en.wikipedia.org/wiki/Transact-SQL or PL/SQL.
> http://www.oracle.com/technetwork/database/features/plsql/index.html.

If I refer to something as Oracle-SQL or DB2-SQL, I think people understand that it is the Oracle or DB2 dialect of SQL that I am referring to.

> Lessons from my mother.
> You can't be half a saint.
> "considering how much other databases deviate from the standard -
> http://troels.arvin.dk/db/rdbms/ . See how much deviation is there for
> example in 'limit clause' or the data types supported (and details of
> data type support) -"
> If all your friends jumped off a bridge would you do it?

My friends are very smart; if they jump off the bridge, there is probably a very good reason to do so, and I would seriously consider it.
I think Hive has many smart friends like DB2, Oracle, Teradata, Vertica, Impala, and even Phoenix (https://github.com/forcedotcom/phoenix). As you can see, there is a wide range in SQL compliance across products. I don't see anything wrong in saying that Hive is "SQL on Hadoop".

I think I have conveyed everything I wanted to say on this topic. I will stop and listen to what others think before we go from half saints and jumping over the bridge to Hitler :) (http://en.wikipedia.org/wiki/Godwin's_law) (there, I said it!!)

I am looking forward to hearing if anybody else thinks calling it "Hive-SQL" will make them confuse it for something like PL/SQL.
I also want to know if others think calling it HiveQL gives more clarity about it aiming to be "SQL on Hadoop".

Thanks,
Thejas
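
For reference, here is a minimal sketch of the kind of rewrite the quoted query would need on an engine that accepts subqueries only in the FROM clause, which is an assumption about the Hive versions under discussion. It uses Python's sqlite3 as a stand-in engine, so it says nothing about Hive's own syntax or execution plan. The Book table and its columns come from the quoted query; the sample rows and the names `avg_price`, `ROWS`, and `run_both` are hypothetical.

```python
import sqlite3

# Hypothetical sample data; only the Book(isbn, title, price) shape comes
# from the quoted query.
ROWS = [
    ("111", "Cheap Paperback", 5.0),
    ("222", "Mid-range Guide", 20.0),
    ("333", "Expensive Reference", 65.0),
    ("444", "Another Bargain", 10.0),
]

# The query as quoted: a scalar subquery in the WHERE clause.
SCALAR_SUBQUERY = """
SELECT isbn, title, price
FROM Book
WHERE price < (SELECT AVG(price) FROM Book)
ORDER BY title
"""

# One possible rewrite: compute the average once in a FROM-clause subquery,
# cross join that single row to every Book row, then filter on it.
REWRITTEN = """
SELECT b.isbn, b.title, b.price
FROM Book b
CROSS JOIN (SELECT AVG(price) AS avg_price FROM Book) a
WHERE b.price < a.avg_price
ORDER BY b.title
"""


def run_both():
    """Run both forms against an in-memory database and return their rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Book (isbn TEXT, title TEXT, price REAL)")
    conn.executemany("INSERT INTO Book VALUES (?, ?, ?)", ROWS)
    original = conn.execute(SCALAR_SUBQUERY).fetchall()
    rewritten = conn.execute(REWRITTEN).fetchall()
    conn.close()
    return original, rewritten


original, rewritten = run_both()
print(original == rewritten)
```

The join here is a cartesian product against a one-row relation, so every Book row is compared with the same average independently of the others.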