Hi Julian, Thanks for your reply. I have a question
… where t is a table based on a JDBC data source … Do we agree on that goal? Does that mean that each of adapters (Elastic, Geode, Mongo etc.) have to implement their own version of converters (similar to RexToSqlConverter) ? If it is a flat RexCall of ORs this should be doable, I guess. On Tue, Nov 20, 2018 at 2:14 PM Julian Hyde <[email protected]> wrote: > Thanks Andrei, that’s the discussion I was thinking of. > > In Mykola’s case, I think it would be useful to solve the end-to-end > problem. Given a Calcite query “select … from t where x in (c1, c2, …, > cN)”, where t is a table based on a JDBC data source, ci are constants and > N is large, we want Calcite’s JDBC adapter to send a query similar to > “select … from t where x in (c1, c2, …, cN)” to the JDBC source. > > Do we agree on that goal? > > If we agree on the goal, the next question is how that query should be > represented in the RelNode/RexNode intermediate representation. The choice > of that representation has implications: performance (e.g. whether we hit a > stack-overflow exception or quadratic algorithm), quality (whether we are > forging a new code path that is untested), surface area (are we going to > need to write a lot of new code, for example new planner rules, in order to > achieve parity with existing cases). > > The approach I advocate in > https://issues.apache.org/jira/browse/CALCITE-2630 < > https://issues.apache.org/jira/browse/CALCITE-2630> - representing the IN > clause as a large, flat OR RexCall “x = c1 or x = c2 … or x = cN”, and > having RexToSqlConverter translate that OR into an IN SqlNode - meets those > criteria. (We may need to fix some bugs relating to quadratic performance > or stack depth, but those are worth doing anyway.) > > Are there other approaches that meet the same criteria? The original > proposal - adding IN as a Rex operator - is a significant increase in > surface area, so we would either lose functionality (e.g. not be able to > push filters into the IN list) or find ourselves having to write a lot of > new code and have to fix a lot of new bugs. > > Julian > > > > On Nov 20, 2018, at 10:06 AM, Andrei Sereda <[email protected]> wrote: > > > > Convert SqlInOperator to In-Expression : > > https://issues.apache.org/jira/browse/CALCITE-2630 > > > > Related. full table scans and subQueryThreshold. > > > https://lists.apache.org/thread.html/1a25c956262633f8ef0d224ed76400761f6797c494a21796579eb4f2@%3Cdev.calcite.apache.org%3E > > > > > > > > On Tue, Nov 20, 2018 at 12:08 PM Julian Hyde <[email protected]> wrote: > > > >> I recall contributing to some other conversations about large IN lists > >> over the past few months. Before we jump into a discussion, can you > locate > >> those threads? Also, if there is not a JIRA case, can you please create > one? > >> > >> Julian > >> > >>> On Nov 20, 2018, at 8:23 AM, Mykola Zerniuk <[email protected] > .INVALID> > >> wrote: > >>> > >>> Dear Calcite Administrators, > >>> > >>> my name is Mykola, software engineer from Ukraine. > >>> > >>> I had an issue with Calcite IN operator handling. > >>> > >>> Here is my previous email to you: > >>> > >> > https://mail-archives.apache.org/mod_mbox/calcite-dev/201810.mbox/%3CCAL4PLbiBh1HoP0w_5ScJ1Nnxq%2BNYGP2LO2usxg_17Gs1mYgttA%40mail.gmail.com%3E > >>> > >>> It is really important to us to have an option to left IN operator "as > >>> is" and do not do any conversions. I implemented it a while ago at my > >>> local, and it successfully works in our project. > >>> > >>> Our team would be happy to have your review and contribute it to > Calcite. > >>> > >>> If you have no objections may i create a work item in Jira? I am > >>> following these steps: > >>> https://calcite.apache.org/develop/#contributing > >>> > >>> Thanks a lot, > >>> Mykola > >> > >> > >
