This is a great question. We are in the same position, having not invested in Hive yet and looking at various options for SQL-on-Hadoop.
On Sat, Mar 29, 2014 at 9:48 PM, Manoj Samel <[email protected]> wrote:

> Hi,
>
> In context of the recent Spark SQL announcement
> (http://databricks.com/blog/2014/03/26/Spark-SQL-manipulating-structured-data-using-Spark.html).
>
> If there is no existing investment in Hive/Shark, would it be worth
> starting a new SQL work using SparkSQL rather than Shark?
>
> * It seems Shark SQL core will use more and more of SparkSQL
> * From the blog, it seems Shark has baggage from Hive, that is not needed
>   in this case
>
> On the other hand, there seems to be two shortcomings of SparkSQL (from a
> quick scan of blog and doc)
>
> * SparkSQL will have less features than Shark/Hive QL, at least for now.
> * The standalone SharkServer feature will not be available in SparkSQL.
>
> Can someone from Databricks shed light on what is the long term roadmap?
> It will help in avoiding investing in older/two technologies for work with
> no Hive needs.
>
> Thanks,
>
> PS: Great work on SparkSQL
