Hello, Valentin.

> In general, I don't see a reason to exclude anything (especially joins) from 
> PR. Please finalize the change and pass it to me for review.

I made some minor improvements.
Please, review my PR [1]

[1] https://github.com/apache/ignite/pull/3397

В Вт, 13/02/2018 в 11:47 -0800, Valentin Kulichenko пишет:
> Nikolay,
> 
> Non-collocated joins should be used only if there is no way to collocate. 
> Please read here for more info: 
> https://apacheignite-sql.readme.io/docs/distributed-joins
> 
> As for limitations, I think Vladimir is talking more about syntax related 
> stuff, i.e. what we do or don't support from SQL compliance perspective. We 
> depend on H2 here and therefore don't have full knowledge, so I understand 
> that it takes time to test and document everything. But requirement to have 
> an index for a non-collocated join is introduced by Ignite and, if it's an 
> expected one, should be documented. *Vladimir*, can you please comment on 
> this?
> 
> In general, I don't see a reason to exclude anything (especially joins) from 
> PR. Please finalize the change and pass it to me for review.
> 
> -Val
> 
> On Tue, Feb 13, 2018 at 11:10 AM, Nikolay Izhikov <nizhi...@apache.org> wrote:
> > Valentin,
> > 
> > > Looks like this is because you enabled non-collocated joins
> > 
> > But non-collocated joins is only way to be sure that join returns correct 
> > results.
> > So in my case it's OK to enable them.
> > Am I right?
> > 
> > > do we have this documented somewhere?
> > 
> > I'm asked that in previuous mail.
> > Vladimir Ozerov give me an answer [1] I quoted for you:
> > 
> > > Unfortunately, at this moment we do not have complete list of all 
> > > restrictions on our joins, because a lot of work is delegated to H2.
> > > In some unsupported scenarios we throw an exception.
> > > In other cases we return incorrect results silently (e.g. if you do not 
> > > co-locate data and forgot to set "distributed joins" flag).
> > > We have a plan to perform excessive testing of joins (both co-located and 
> > > distributed) and list all known limitations.
> > > This would require writing a lot of unit tests to cover various scenarios.
> > > I think we will have this information in a matter of 1-2 months.
> > 
> > So the answer is no, we haven't documentation for a join limitations.
> > 
> > That's why I propose to exclude join optimization from my PR until:
> > 
> > 1. We create documentation for all join limitations.
> > 2. Create the way to check is certain join satisfy current limitations.
> > 
> > [1] 
> > http://apache-ignite-developers.2346864.n4.nabble.com/SparkDataFrame-Query-Optimization-Prototype-tp26249p26361.html
> > 
> > В Вт, 13/02/2018 в 09:55 -0800, Valentin Kulichenko пишет:
> > > Nikolay,
> > >
> > > Looks like this is because you enabled non-collocated joins. I was not
> > > aware of this limitation though, do we have this documented somewhere?
> > >
> > > -Val
> > >
> > > On Tue, Feb 13, 2018 at 8:21 AM, Nikolay Izhikov <nizhi...@apache.org>
> > > wrote:
> > >
> > > > Val,
> > > >
> > > > Source code check: https://github.com/apache/ignite/blob/master/modules/
> > > > indexing/src/main/java/org/apache/ignite/internal/processors/query/h2/opt/
> > > > GridH2CollocationModel.java#L382
> > > >
> > > > Stack trace:
> > > >
> > > > javax.cache.CacheException: Failed to prepare distributed join query: 
> > > > join
> > > > condition does not use index [joinedCache=SQL_PUBLIC_JT2, plan=SELECT
> > > >     __Z0.ID AS __C0_0,
> > > >     __Z0.VAL1 AS __C0_1,
> > > >     __Z1.ID AS __C0_2,
> > > >     __Z1.VAL2 AS __C0_3
> > > > FROM PUBLIC.JT1 __Z0
> > > >     /* PUBLIC.JT1.__SCAN_ */
> > > > INNER JOIN PUBLIC.JT2 __Z1
> > > >     /* batched:broadcast PUBLIC.JT2.__SCAN_ */
> > > >     ON 1=1
> > > > WHERE __Z0.VAL1 = __Z1.VAL2]
> > > >         at org.apache.ignite.internal.processors.query.h2.opt.
> > > > GridH2CollocationModel.joinedWithCollocated(GridH2CollocationModel.java:
> > > > 384)
> > > >         at org.apache.ignite.internal.processors.query.h2.opt.
> > > > GridH2CollocationModel.calculate(GridH2CollocationModel.java:308)
> > > >         at org.apache.ignite.internal.processors.query.h2.opt.
> > > > GridH2CollocationModel.type(GridH2CollocationModel.java:549)
> > > >         at org.apache.ignite.internal.processors.query.h2.opt.
> > > > GridH2CollocationModel.calculate(GridH2CollocationModel.java:257)
> > > >         at org.apache.ignite.internal.processors.query.h2.opt.
> > > > GridH2CollocationModel.type(GridH2CollocationModel.java:549)
> > > >         at org.apache.ignite.internal.processors.query.h2.opt.
> > > > GridH2CollocationModel.isCollocated(GridH2CollocationModel.java:691)
> > > >         at org.apache.ignite.internal.processors.query.h2.sql.
> > > > GridSqlQuerySplitter.split(GridSqlQuerySplitter.java:239)
> > > >         at org.apache.ignite.internal.processors.query.h2.
> > > > IgniteH2Indexing.split(IgniteH2Indexing.java:1856)
> > > >         at org.apache.ignite.internal.processors.query.h2.
> > > > IgniteH2Indexing.parseAndSplit(IgniteH2Indexing.java:1818)
> > > >         at org.apache.ignite.internal.processors.query.h2.
> > > > IgniteH2Indexing.querySqlFields(IgniteH2Indexing.java:1569)
> > > >         at org.apache.ignite.internal.processors.query.
> > > > GridQueryProcessor$4.applyx(GridQueryProcessor.java:2037)
> > > >         at org.apache.ignite.internal.processors.query.
> > > > GridQueryProcessor$4.applyx(GridQueryProcessor.java:2032)
> > > >         at org.apache.ignite.internal.util.lang.IgniteOutClosureX.
> > > > apply(IgniteOutClosureX.java:36)
> > > >         at 
> > > > org.apache.ignite.internal.processors.query.GridQueryProcessor.
> > > > executeQuery(GridQueryProcessor.java:2553)
> > > >         at 
> > > > org.apache.ignite.internal.processors.query.GridQueryProcessor.
> > > > querySqlFields(GridQueryProcessor.java:2046)
> > > >         at org.apache.ignite.internal.processors.cache.
> > > > IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:664)
> > > >         at org.apache.ignite.internal.processors.cache.
> > > > IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:615)
> > > >         at org.apache.ignite.internal.processors.cache.
> > > > GatewayProtectedCacheProxy.query(GatewayProtectedCacheProxy.java:382)
> > > >         at org.apache.ignite.spark.JoinTestSpec.execSQL(
> > > > JoinTestSpec.scala:63)
> > > >
> > > >
> > > > В Вт, 13/02/2018 в 08:12 -0800, Valentin Kulichenko пишет:
> > > > > Nikolay,
> > > > >
> > > > > This doesn't make sense to me. Not having an index should not cause
> > > >
> > > > query to fail. What is the exception?
> > > > >
> > > > > -Val
> > > > >
> > > > > On Tue, Feb 13, 2018 at 8:07 AM, Nikolay Izhikov <nizhi...@apache.org>
> > > >
> > > > wrote:
> > > > > > Hello, Valentin.
> > > > > >
> > > > > > > When you're talking about join optimization, what exactly are you
> > > >
> > > > referring to?
> > > > > >
> > > > > > I'm referring to my PR [1]
> > > > > > Currently, it contains transformation from Spark joins to Ignite 
> > > > > > joins
> > > >
> > > > [2]
> > > > > >
> > > > > > But, if I understand Vladimir answer right, for now, we don't 
> > > > > > *fully*
> > > >
> > > > support SQL join queries.
> > > > > >
> > > > > > Sometimes it will work just right, in other cases, it will throw an
> > > >
> > > > exception due Ignite internal implementation.
> > > > > >
> > > > > > Please, see my example [3].
> > > > > > Query from line 4 will throw an exception.
> > > > > > The same query from line 10 will succeed, because of index creation.
> > > > > >
> > > > > > Both of them syntactically correct.
> > > > > >
> > > > > > > Unfortunately, at this moment we do not have complete list of all
> > > >
> > > > restrictions on our joins, because a lot of work is delegated to H2.
> > > > > > > In some unsupported scenarios we throw an exception.
> > > > > > > In other cases we return incorrect results silently (e.g. if you 
> > > > > > > do
> > > >
> > > > not co-locate data and forgot to set "distributed joins" flag).
> > > > > > > We have a plan to perform excessive testing of joins (both
> > > >
> > > > co-located and distributed) and list all known limitations.
> > > > > > > This would require writing a lot of unit tests to cover various
> > > >
> > > > scenarios.
> > > > > > > I think we will have this information in a matter of 1-2 months.
> > > > > >
> > > > > > [1] https://github.com/apache/ignite/pull/3397
> > > > > > [2] https://github.com/apache/ignite/pull/3397/files#diff-
> > > >
> > > > 5a861613530bbce650efa50d553a0e92R227
> > > > > > [3] 
> > > > > > https://gist.github.com/nizhikov/a4389fd78636869dd38c13920b5baf2b
> > > > > >
> > > > > > В Пн, 12/02/2018 в 13:45 -0800, Valentin Kulichenko пишет:
> > > > > > > Nikolay,
> > > > > > >
> > > > > > > When you're talking about join optimization, what exactly are you
> > > >
> > > > referring to?
> > > > > > >
> > > > > > > Since other parts of data frames integration are already merged, I
> > > >
> > > > think it's a good time to resurrect this thread? Does it make sense to
> > > > review it right now? Or you want to make some more changes?
> > > > > > >
> > > > > > > -Val
> > > > > > >
> > > > > > > On Mon, Feb 12, 2018 at 12:20 AM, Vladimir Ozerov <
> > > >
> > > > voze...@gridgain.com> wrote:
> > > > > > > > Hi Nikolay,
> > > > > > > >
> > > > > > > > I am not sure if ticket for DECIMAL column metadata exists. If 
> > > > > > > > you
> > > >
> > > > haven't find one under "sql" component, please feel free to create it on
> > > > your own. As far as testing of joins, I think it makes sense to start
> > > > working on it when we finish ANSI compliance testing which is already in
> > > > progress.
> > > > > > > >
> > > > > > > > On Wed, Jan 24, 2018 at 12:27 PM, Nikolay Izhikov <
> > > >
> > > > nizhikov....@gmail.com> wrote:
> > > > > > > > > Hello, Vladimir.
> > > > > > > > >
> > > > > > > > > Thank you for an answer.
> > > > > > > > >
> > > > > > > > > > Do you mean whether it is possible to read it from table
> > > >
> > > > metadata?
> > > > > > > > >
> > > > > > > > > Yes, you are right.
> > > > > > > > > I want to read scale and precision of DECIMAL column from 
> > > > > > > > > table
> > > >
> > > > metadata.
> > > > > > > > >
> > > > > > > > > > This will be fixed at some point in future, but I do not 
> > > > > > > > > > have
> > > >
> > > > any dates at the moment.
> > > > > > > > >
> > > > > > > > > Is there ticket for it? I can't find it via jira search
> > > > > > > > >
> > > > > > > > > > at this moment we do not have complete list of all
> > > >
> > > > restrictions on our joins, because a lot of work is delegated to H2.
> > > > > > > > > > In some unsupported scenarios we throw an exception.
> > > > > > > > > > In other cases we return incorrect results silently (e.g. if
> > > >
> > > > you do not co-locate data and forgot to set "distributed joins" flag).
> > > > > > > > >
> > > > > > > > > Guys, Val, may be we should exclude join optimization from
> > > >
> > > > IGNITE-7077 while we haven't all limitation on the hand?
> > > > > > > > >
> > > > > > > > > > We have a plan to perform excessive testing of joins (both
> > > >
> > > > co-located and distributed) and list all known limitations.
> > > > > > > > >
> > > > > > > > > Can I help somehow with this activity?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > В Ср, 24/01/2018 в 12:08 +0300, Vladimir Ozerov пишет:
> > > > > > > > > > Hi Nikolay,
> > > > > > > > > >
> > > > > > > > > > Could you please clarify your question about scale and
> > > >
> > > > precision? Do you mean whether it is possible to read it from table
> > > > metadata? If yes, it is not possible at the moment unfortunately - we do
> > > > not store information about lengths, scales and precision, only actual 
> > > > data
> > > > types are passed to H2 (e.g. String, BigDecimal, etc.). This will be 
> > > > fixed
> > > > at some point in future, but I do not have any dates at the moment.
> > > > > > > > > >
> > > > > > > > > > Now about joins - Denis, I think you provided wrong link to
> > > >
> > > > our internal GridGain docs where we accumulate information about ANSI
> > > > compatibility and which will are going to publish on Ignite WIKI when 
> > > > it is
> > > > ready. In any case, this is not what Nikolay aksed about. The question 
> > > > was
> > > > about limitation of our joins which has nothing to do with ANSI 
> > > > standard.
> > > > Unfortunately, at this moment we do not have complete list of all
> > > > restrictions on our joins, because a lot of work is delegated to H2. In
> > > > some unsupported scenarios we throw an exception. In other cases we 
> > > > return
> > > > incorrect results silently (e.g. if you do not co-locate data and 
> > > > forgot to
> > > > set "distributed joins" flag). We have a plan to perform excessive 
> > > > testing
> > > > of joins (both co-located and distributed) and list all known 
> > > > limitations.
> > > > This would require writing a lot of unit tests to cover various 
> > > > scenarios.
> > > > I think we will have this information in a matter of 1-2 months.
> > > > > > > > > >
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > > On Tue, Jan 23, 2018 at 11:45 PM, Denis Magda <
> > > >
> > > > dma...@apache.org> wrote:
> > > > > > > > > > > Agree. The unsupported functions should be mentioned on 
> > > > > > > > > > > the
> > > >
> > > > page that will cover Ignite ANSI-99 compliance. We have first results
> > > > available for CORE features of the specification:
> > > > > > > > > > > https://ggsystems.atlassian.net/wiki/spaces/GG/pages/
> > > >
> > > > 45093646/ANSI+SQL+99 <https://ggsystems.atlassian.
> > > > net/wiki/spaces/GG/pages/45093646/ANSI+SQL+99>
> > > > > > > > > > >
> > > > > > > > > > > That’s on my radar. I’ll take care of this.
> > > > > > > > > > >
> > > > > > > > > > > —
> > > > > > > > > > > Denis
> > > > > > > > > > >
> > > > > > > > > > > > On Jan 23, 2018, at 10:31 AM, Dmitriy Setrakyan <
> > > >
> > > > dsetrak...@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > I think we need a page listing the unsupported functions
> > > >
> > > > with explanation
> > > > > > > > > > > > why, which is either it does not make sense in Ignite or
> > > >
> > > > is planned in
> > > > > > > > > > > > future release.
> > > > > > > > > > > >
> > > > > > > > > > > > Sergey, do you think you will be able to do it?
> > > > > > > > > > > >
> > > > > > > > > > > > D.
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Jan 23, 2018 at 12:05 AM, Serge Puchnin <
> > > >
> > > > sergey.puch...@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > yes, the Cust function is supporting both Ignite and 
> > > > > > > > > > > > > H2.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've updated the documentation for next system 
> > > > > > > > > > > > > functions:
> > > > > > > > > > > > > CASEWHEN Function, CAST, CONVERT, TABLE
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://apacheignite-sql.readme.io/docs/system-functions
> > > > > > > > > > > > >
> > > > > > > > > > > > > And for my mind, next functions aren't applicable for
> > > >
> > > > Ignite:
> > > > > > > > > > > > > ARRAY_GET, ARRAY_LENGTH, ARRAY_CONTAINS, CSVREAD,
> > > >
> > > > CSVWRITE, DATABASE,
> > > > > > > > > > > > > DATABASE_PATH, DISK_SPACE_USED, FILE_READ, FILE_WRITE,
> > > >
> > > > LINK_SCHEMA,
> > > > > > > > > > > > > MEMORY_FREE, MEMORY_USED, LOCK_MODE, LOCK_TIMEOUT,
> > > >
> > > > READONLY, CURRVAL,
> > > > > > > > > > > > > AUTOCOMMIT, CANCEL_SESSION, IDENTITY, NEXTVAL, ROWNUM,
> > > >
> > > > SCHEMA,
> > > > > > > > > > > > > SCOPE_IDENTITY, SESSION_ID, SET, TRANSACTION_ID,
> > > >
> > > > TRUNCATE_VALUE, USER,
> > > > > > > > > > > > > H2VERSION
> > > > > > > > > > > > >
> > > > > > > > > > > > > Also an issue was created for review current
> > > >
> > > > documentation:
> > > > > > > > > > > > > https://issues.apache.org/jira/browse/IGNITE-7496
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > BR,
> > > > > > > > > > > > > Serge
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Sent from: http://apache-ignite-
> > > >
> > > > developers.2346864.n4.nabble.com/
> > > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > > >
> > > > >
> > > > >
> 
> 

Attachment: signature.asc
Description: This is a digitally signed message part

Reply via email to