Thanks Matei. Any thoughts on providing a standalone SharkServer equivalent for Spark SQL?

Manoj

On Sun, Mar 30, 2014 at 7:35 PM, Matei Zaharia <[email protected]> wrote:

> Hi Manoj,
>
> At the current time, for drop-in replacement of Hive, it will be best to
> stick with Shark. Over time, Shark will use the Spark SQL backend, but
> should remain deployable the way it is today (including launching the
> SharkServer, using the Hive CLI, etc). Spark SQL is better for accessing
> Hive data within a Spark program though, where its APIs are richer and
> easier to link to than the SharkContext.sql2rdd we had previously provided
> in Shark.
>
> So in a nutshell, if you have a Shark deployment today, or need the
> HiveServer, then going with Shark will be fine and we will switch out the
> backend in a future release (we'll probably create preview of this even
> before we're ready to fully switch). If you just want to run SQL queries or
> load SQL data within a Spark program, try out Spark SQL.
>
> Matei
>
> On Mar 30, 2014, at 4:46 PM, Mayur Rustagi <[email protected]> wrote:
>
> +1 Have done a few installations of Shark with customers using Hive, they
> love it. Would be good to maintain compatibility with Metastore & QL till
> we have substantial reason to break off (like BlinkDB).
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
> On Sun, Mar 30, 2014 at 2:46 AM, Nicholas Chammas <[email protected]> wrote:
>
>> This is a great question. We are in the same position, having not
>> invested in Hive yet and looking at various options for SQL-on-Hadoop.
>>
>> On Sat, Mar 29, 2014 at 9:48 PM, Manoj Samel <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> In context of the recent Spark SQL announcement (
>>> http://databricks.com/blog/2014/03/26/Spark-SQL-manipulating-structured-data-using-Spark.html
>>> ).
>>>
>>> If there is no existing investment in Hive/Shark, would it be worth
>>> starting a new SQL work using SparkSQL rather than Shark ?
>>>
>>> * It seems Shark SQL core will use more and more of SparkSQL
>>> * From the blog, it seems Shark has baggage from Hive, that is not
>>> needed in this case
>>>
>>> On the other hand, there seems to be two shortcomings of SparkSQL (from
>>> a quick scan of blog and doc)
>>>
>>> * SparkSQL will have less features than Shark/Hive QL, at least for now.
>>> * The standalone SharkServer feature will not be available in SparkSQL.
>>>
>>> Can someone from Databricks shed light on what is the long term roadmap?
>>> It will help in avoiding investing in older/two technologies for work with
>>> no Hive needs.
>>>
>>> Thanks,
>>>
>>> PS: Great work on SparkSQL
