Thanks all, and Matei.

TL;DR of the conclusion for my particular case:

Qualitatively: while Catalyst [1] tries to reduce the learning curve and maintenance burden, it lacks the dynamic programming approach used by Calcite [2] and risks getting stuck in local minima.

Quantitatively: there is no reproducible benchmark that fairly compares optimizer frameworks apples to apples (excluding execution).
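To make the "local minima" concern concrete, here is a toy Python sketch. It is not Catalyst or Calcite code: the table names, row counts, selectivities and the cost model (sum of intermediate result sizes over left-deep plans) are all made up for illustration. It only contrasts a planner that commits to the locally cheapest next join with a dynamic-programming search over table subsets.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical statistics: row counts and pairwise join selectivities.
# Pairs not listed are treated as cross products (selectivity 1).
CARD = {"A": 10, "B": 10, "C": 100, "D": 100}
SEL = {frozenset("AB"): Fraction(1, 10), frozenset("CD"): Fraction(1, 500)}


def size(tables):
    """Estimated row count of joining all tables in `tables`."""
    rows = Fraction(1)
    for t in tables:
        rows *= CARD[t]
    for pair in combinations(sorted(tables), 2):
        rows *= SEL.get(frozenset(pair), 1)
    return rows


def greedy_plan_cost():
    """Always take the cheapest next step; never revisit earlier choices."""
    joined = min((frozenset(p) for p in combinations(sorted(CARD), 2)), key=size)
    cost = size(joined)
    while len(joined) < len(CARD):
        nxt = min(sorted(set(CARD) - joined), key=lambda t: size(joined | {t}))
        joined = joined | {nxt}
        cost += size(joined)
    return cost


def dp_plan_cost():
    """Best left-deep plan per subset, built bottom-up from smaller subsets."""
    tables = sorted(CARD)
    best = {frozenset([t]): Fraction(0) for t in tables}  # base scans cost 0 here
    for k in range(2, len(tables) + 1):
        for combo in combinations(tables, k):
            s = frozenset(combo)
            best[s] = size(s) + min(best[s - {t}] for t in s)
    return best[frozenset(tables)]


print("greedy:", greedy_plan_cost(), "dp:", dp_plan_cost())
```

With these made-up numbers the greedy planner starts from the cheapest pair (A, B) and ends up with a more expensive plan than the subset search, which starts from (C, D) instead. That is the kind of trap I mean, shown on a deliberately tiny example.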
References:
[1] https://amplab.cs.berkeley.edu/wp-content/uploads/2015/03/SparkSQLSigmod2015.pdf
[2] https://arxiv.org/pdf/1802.10233.pdf

On Mon, Jan 13, 2020 at 5:37 PM Matei Zaharia <matei.zaha...@gmail.com> wrote:

> I’m pretty sure that Catalyst was built before Calcite, or at least in
> parallel. Calcite 1.0 was only released in 2015. From a technical
> standpoint, building Catalyst in Scala also made it more concise and easier
> to extend than an optimizer written in Java (you can find various
> presentations about how Catalyst works).
>
> Matei
>
> > On Jan 13, 2020, at 8:41 AM, Michael Mior <mm...@apache.org> wrote:
> >
> > It's fairly common for adapters (Calcite's abstraction of a data
> > source) to push down predicates. However, the API certainly looks a
> > lot different than Catalyst's.
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > On Mon, Jan 13, 2020 at 09:45, Jason Nerothin
> > <jasonnerot...@gmail.com> wrote:
> >>
> >> The implementation they chose supports push down predicates, Datasets
> >> and other features that are not available in Calcite:
> >>
> >> https://databricks.com/glossary/catalyst-optimizer
> >>
> >> On Mon, Jan 13, 2020 at 8:24 AM newroyker <newroy...@gmail.com> wrote:
> >>>
> >>> Was there a qualitative or quantitative benchmark done before a design
> >>> decision was made not to use Calcite?
> >>>
> >>> Are there limitations (for heuristic based, cost based, * aware optimizer)
> >>> in Calcite, and frameworks built on top of Calcite? In the context of big
> >>> data / TPC-H benchmarks.
> >>>
> >>> I was unable to dig up anything concrete from user group / Jira. Appreciate
> >>> if any Catalyst veteran here can give me pointers. Trying to defend
> >>> Spark/Catalyst.
> >>>
> >>> --
> >>> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >>>
> >>
> >> --
> >> Thanks,
> >> Jason