How does it compare against Shark, and what is the future of Shark with this new module in place?
On Sun, Mar 23, 2014 at 11:49 PM, Evan Chan <e...@ooyala.com> wrote:
> Hi Michael,
>
> Congrats, this is really neat!
>
> What thoughts do you have regarding adding indexing support and
> predicate pushdown to this SQL framework? Right now we have custom
> bitmap indexing to speed up queries, so we're really curious as far as
> the architectural direction.
>
> -Evan
>
>
> On Fri, Mar 21, 2014 at 11:09 AM, Michael Armbrust
> <mich...@databricks.com> wrote:
> >>
> >> It will be great if there are any examples or use cases to look at?
> >>
> > There are examples in the Spark documentation. Patrick posted an updated
> > copy here so people can see them before 1.0 is released:
> >
> > http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html
> >
> >> Does this feature have different use cases than Shark, or is it cleaner
> >> as the Hive dependency is gone?
> >>
> > Depending on how you use this, there is still a dependency on Hive (by
> > default this is not the case; see the above documentation for more
> > details). However, the dependency is on a stock version of Hive instead of
> > one modified by the AMPLab. Furthermore, Spark SQL has its own optimizer,
> > instead of relying on the Hive optimizer. Long term, this is going to give
> > us a lot more flexibility to optimize queries specifically for the Spark
> > execution engine. We are actively porting over the best parts of Shark
> > (specifically the in-memory columnar representation).
> >
> > Shark still has some features that are missing in Spark SQL, including
> > SharkServer (and years of testing). Once Spark SQL graduates from Alpha
> > status, it'll likely become the new backend for Shark.
> >
>
> --
> Evan Chan
> Staff Engineer
> e...@ooyala.com