You should probably start a new discussion with this title.
I don't know if this can be done.

Mihai

-----Original Message-----
From: P.F. ZHAN 
Sent: Tuesday, August 08, 2023 4:29 PM
To: [email protected]
Subject: Re: [Discussion] Split function has some wrong and strange behaviors

Hi mbudiu, thanks for your patience. Yes, a[x] interpret as a[OFFSET(x)] for 
BigQuery. I tried this, and it is okay too.


On Wed, Aug 9, 2023 at 02:48 <[email protected]> wrote:

> This has nothing to do with the "Split" function. It is all about 
> indexing in arrays.
>
> The list of Dialect-specific operators in calcite includes OFFSET and 
> ORDINAL for BigQuery.
> https://calcite.apache.org/docs/reference.html#dialect-specific-operat
> ors
>
> So you want to be able to write a[x] and it should be interpreted as 
> a[OFFSET (x)] when using BigQuery?
>
> Mihai
>
> -----Original Message-----
> From: P.F. ZHAN
> Sent: Tuesday, August 08, 2023 11:37 AM
> To: [email protected]
> Subject: Re: [Discussion] Split function has some wrong and strange 
> behaviors
>
> Hi, mbudiu.
>
> f.checkScalar can not reproduce this problem, it happens when 
> converting SqlNode to RelNode.
> Maybe you could check this test case:
>
> > @Test void testIndexWithoutOperatorBigQuery() { String query = 
> > "SELECT SPLIT('h,e,l,l,o')[1] FROM \"employee\""; String error = 
> > "BigQuery requires an array subscript operator to index an array"; 
> > sql(query).withBigQuery().withLibrary(SqlLibrary.BIG_QUERY).throws_(
> > er
> > ror)
> > ;
> > }
> >
>
> I checked the doc reference of Calcite, the array index does start with 1.
> However, for BigQuery, if omit the array subscript operators, it is 
> equivalent to offset(0), that is to say split('aaa-bbb-ccc', '-')[0] 
> is equivalent to split('aaa-bbb-ccc', '-')[offset(0)]. The usage of 
> omitting the array subscript operators doesn't appear on the BigQuery 
> doc, but it works in this query engine.
>
> This seems by design for Calcite. However, my question is "can we 
> remove this limitation and support the usage of omitting the array 
> subscript operators to get the same result with BigQuery?" In other 
> words, split('aaa-bbb-ccc', '-')[offset(0)] could be replaced by 
> split('aaa-bbb-ccc', '-')[0] ?
>
> On Wed, Aug 9, 2023 at 2:13 AM <[email protected]> wrote:
>
> > I added the following tests to SqlOperatorTest and they passed:
> >
> > f.checkScalar("split('aaa-bbb-ccc', '-')[1]", "aaa", "VARCHAR"); 
> > f.checkNull("split('aaa-bbb-ccc', '-')[0]");
> >
> > Please note that in Calcite SQL the array indexes start from 1:
> > https://calcite.apache.org/docs/reference.html#value-constructors
> >
> > If you try to index out of bounds (with index 0) you will get a null 
> > value. In BigQuery it seems to be illegal to index with an integer 
> > https://cloud.google.com/bigquery/docs/arrays, you seem to need to 
> > use either OFFSET or ORDINAL.
> >
> > But I don't see any exception, certainly not a parse exception.
> >
> > I don't understand what you mean by "Calcite does not support split 
> > function without array subscript operators."
> >
> > Mihai
> >
> > -----Original Message-----
> > From: P.F. ZHAN
> > Sent: Tuesday, August 08, 2023 10:53 AM
> > To: [email protected]
> > Subject: Re: [Discussion] Split function has some wrong and strange 
> > behaviors
> >
> > Sorry,
> > I executed the following four queries in BigQuery, and the results 
> > displayed after the token "=>"
> >
> > query1. select split('aaa-bbb-ccc', '-')[0] => aaa query2. select 
> > split('aaa-bbb-ccc', '-')[offset(0)] => aaa query3. select 
> > split('aaa-bbb-ccc', '-')[3] => Array index 3 is out of bounds
> > (overflow) query4. select split('aaa-bbb-ccc', '-')[offset(3)] => 
> > Array index 3 is out of bounds (overflow)
> >
> > If we use Calcite to parse  query1 and query3, it will throw exceptions.
> > That's to say, Calcite does not support split function without array 
> > subscript operators.
> > I wonder if we should support this kind of usage of this function?
> >
> >
> >
> > On Wed, Aug 9, 2023 at 1:23 AM <[email protected]> wrote:
> >
> > > The mailing list strips images from emails.
> > >
> > >
> > >
> > > Mihai
> > >
> > >
> > >
> > > From: P.F. ZHAN
> > > Sent: Tuesday, August 08, 2023 10:23 AM
> > > To: [email protected]
> > > Subject: Re: [Discussion] Split function has some wrong and 
> > > strange behaviors
> > >
> > >
> > >
> > > Sorry, maybe I didn't describe it very well, which caused you to 
> > > not understand what I meant.
> > >
> > > Here, I gives some query results to describe this problem.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > Calcite does not support split function without array subscript 
> > > operator, I wonder should we support this kind of usage of this
> function?
> > >
> > >
> > >
> > >
> > >
> > > The BigQuery doc:
> > >
> > >
> > > https://cloud.google.com/bigquery/docs/reference/standard-sql/oper
> > > at
> > > or
> > > s#array_subscript_operator
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Aug 9, 2023 at 12:43 AM <[email protected] 
> > > <mailto:[email protected]>
> > > > wrote:
> > >
> > > I don' t understand from this message what the problem with split is.
> > >
> > > Mihai
> > >
> > > -----Original Message-----
> > > From: P.F. ZHAN
> > > Sent: Tuesday, August 08, 2023 8:21 AM
> > > To: [email protected] <mailto:[email protected]>
> > > Subject: Re: [Discussion] Split function has some wrong and 
> > > strange behaviors
> > >
> > > I rechecked Calcite's implementation of the split function today 
> > > again, and I found that this behavior appears to be a by design
> behavior.
> > > >
> > > >
> > > > org.apache.calcite.rel.rel2sql.RelToSqlConverterTest#testIndexOp
> > > > er
> > > > at
> > > > or
> > > > sBigQuery
> > >
> > >
> > > org.apache.calcite.rel.rel2sql.RelToSqlConverterTest#testIndexWith
> > > ou
> > > tO
> > > peratorBigQuery
> > >
> > >
> > > I also checked the BigQuery doc which only mentions these standard 
> > > usages in our test case of *testIndexOperatorsBigQuery*, but in 
> > > fact when we use SQL like
> > >
> > > > select split('aaa-bbb-ccc', '-')[0]
> > >
> > > the results are available in BigQuery and the behavior is the same 
> > > as
> > >
> > > > select split('aaa-bbb-ccc', '-')[ offset(0)]
> > >
> > >  Maybe Calcite should support this kind of usage rather than treat 
> > > it as a limitation.
> > >
> > >
> > >
> > > On Tue, Aug 8, 2023 at 10:31 AM P.F. ZHAN <[email protected] <mailto:
> > > [email protected]> > wrote:
> > >
> > > > Thank you mbudiu, Ran Tao and Lake Shen.
> > > >
> > > > I have tried your suggestions and debugged the code, this kind 
> > > > of sql seems not support
> > > >
> > > >> select split('aaa-bbb-ccc', '-')[1]
> > > >>
> > > >
> > > > I have checked it in both Bigquery and Spark, it works. This 
> > > > looks like a bug at present. Can I open a jira issue to follow 
> > > > this
> problem?
> > > >
> > > >
> > > > On Tue, Aug 8, 2023 at 1:09 AM <[email protected] <mailto:
> > > [email protected]> > wrote:
> > > >
> > > >> See also this thread:
> > > >> https://issues.apache.org/jira/browse/CALCITE-5760
> > > >>
> > > >> You can run CalciteSqlOperatorTest instead.
> > > >> If you want to quickly debug only this test in the IDE you can 
> > > >> copy-paste this test into the CalciteSqlOperatorTest class and 
> > > >> run it from there, then delete it when you are done.
> > > >>
> > > >> Mihai
> > > >>
> > > >> -----Original Message-----
> > > >> From: Ran Tao
> > > >> Sent: Monday, August 07, 2023 4:04 AM
> > > >> To: [email protected] <mailto:[email protected]>
> > > >> Subject: Re: [Discussion] Split function has some wrong and 
> > > >> strange behaviors
> > > >>
> > > >> > I found that when I use "*./gradlew  build*" to build the 
> > > >> > project happens
> > > >> with some errors, but I execute this case in IDEA alone and it 
> > > >> is successful.
> > > >>
> > > >> The default `fixture()` does not execute in unit tests (but 
> > > >> works in ./gradlew build), you should use
> > > >> Fixture.forOperators(true) to execute in unit tests. You can 
> > > >> check this usage, hope it helps you.
> > > >>
> > > >>
> > > >> Best Regards,
> > > >> Ran Tao
> > > >> https://github.com/chucheng92
> > > >>
> > > >>
> > > >> P.F. ZHAN <[email protected] <mailto:[email protected]> >
> > > 于2023年8月7日周一 15:22写道:
> > > >>
> > > >> > Sorry, I made a mistake,  used Chinese to describe my question.
> > > >> >
> > > >> > Hi, When I was testing the split function with a  case as follows:
> > > >> > org.apache.calcite.test.SqlOperatorTest#testSplitFunction
> > > >> >
> > > >> > > f.checkScalar("SPLIT('h-e-l-l-o', '-')[0]", "h", 
> > > >> > > "VARCHAR");
> > > >> > >
> > > >> >
> > > >> >  I found that when I use "*./gradlew  build*" to build the 
> > > >> > project happens with some errors, but I execute this case in 
> > > >> > IDEA alone and it is successful.
> > > >> >
> > > >> > FAILURE   0.3sec, org.apache.calcite.test.CalciteSqlOperatorTest >
> > > >> > > testSplitFunction()
> > > >> > >     java.lang.AssertionError: Query: values 
> > > >> > > (SPLIT('h-e-l-l-o',
> > > >> '-')[0])
> > > >> > >     Expected: is "h"
> > > >> > >          but: was null
> > > >> > >         at
> > > >> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.sql.test.ResultCheckers.compareResultSetWi
> > > >> > th
> > > >> > Ma
> > > >> > tc
> > > >> > her
> > > >> > (ResultCheckers.java:252)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.sql.test.ResultCheckers$MatcherResultCheck
> > > >> > er
> > > >> > .c
> > > >> > he
> > > >> > ckR
> > > >> > esult(ResultCheckers.java:321)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.test.SqlOperatorTest$TesterImpl.check(SqlO
> > > >> > pe
> > > >> > ra
> > > >> > to
> > > >> > rTe
> > > >> > st.java:12150)
> > > >> > >         at
> > > >> > org.apache.calcite.sql.test.SqlTester.check(SqlTester.java:16
> > > >> > 0)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$checkSc
> > > >> > al
> > > >> > ar
> > > >> > $2
> > > >> > (Sq
> > > >> > lOperatorFixtureImpl.java:224)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(Ab
> > > >> > st
> > > >> > ra
> > > >> > ct
> > > >> > Sql
> > > >> > Tester.java:446)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.test.SqlOperatorFixtureImpl.checkScalar(Sq
> > > >> > lO
> > > >> > pe
> > > >> > ra
> > > >> > tor
> > > >> > FixtureImpl.java:223)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(Sq
> > > >> > lO
> > > >> > pe
> > > >> > ra
> > > >> > tor
> > > >> > Fixture.java:238)
> > > >> > >         at
> > > >> > >
> > > >> > org.apache.calcite.test.SqlOperatorTest.testSplitFunction(Sql
> > > >> > Op
> > > >> > er
> > > >> > at
> > > >> > orT
> > > >> > est.java:7657)
> > > >> > >         at
> > > >> > >
> > > >> > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(
> > > >> > Re
> > > >> > fl
> > > >> > ec
> > > >> > tio
> > > >> > nUtils.java:727)
> > > >> > >
> > > >> >
> > > >> > I also add a case in the file big-query.iq 
> > > >> > <http://big-query.iq>
> > > >> >
> > > >> > SELECT SPLIT("h,e,l,l,o", ',')[1] as result;
> > > >> > > +------- +
> > > >> > > | result |
> > > >> > > +------- +
> > > >> > > | e |
> > > >> > > +------- +
> > > >> > > (1 row)
> > > >> > >
> > > >> > > !ok
> > > >> > >
> > > >> >
> > > >> > The error log is as follows:
> > > >> >
> > > >> > > FAILURE  27.6sec, org.apache.calcite.test.BabelQuidemTest > 
> > > >> > > test(String)[3], [3] sql/big-query.iq <http://big-query.iq>
> > > >> > >     org.opentest4j.AssertionFailedError: Files differ:
> > > >> > > ~/calcite/babel/build/quidem/test/sql/big-query.iq <
> > > http://big-query.iq>
> > > >> > >  ~/calcite/babel/build/resources/test/sql/big-query.iq <
> > > http://big-query.iq>
> > > >> > >     759c759
> > > >> > >     < +------- +
> > > >> > >     ---
> > > >> > >     > +--------+
> > > >> > >     761,763c761,763
> > > >> > >     < +------- +
> > > >> > >     < | e      |
> > > >> > >     < +------- +
> > > >> > >     ---
> > > >> > >     > +--------+
> > > >> > >     > | h      |
> > > >> > >     > +--------+
> > > >> > >
> > > >> > >         at
> > > >> > >
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
> > > >> > >         at
> > > org.junit.jupiter.api.Assertions.fail(Assertions.java:135)
> > > >> > >         at
> > > >> > org.apache.calcite.test.QuidemTest.checkRun(QuidemTest.java:1
> > > >> > 83
> > > >> > )
> > > >> > >         at
> > > >> > > org.apache.calcite.test.QuidemTest.test(QuidemTest.java:236
> > > >> > > )
> > > >> > >
> > > >> > >
> > > >> > On Mon, Aug 7, 2023 at 3:13 PM P.F. ZHAN <[email protected]
> > <mailto:
> > > [email protected]> > wrote:
> > > >> >
> > > >> > > 大家好。我在测试 split 函数功能的时候,添加了个 case 
> > > >> > > org.apache.calcite.test.SqlOperatorTest#testSplitFunction
> > > >> > >
> > > >> > >> f.checkScalar("SPLIT('h-e-l-l-o', '-')[0]", "h", 
> > > >> > >> "VARCHAR");
> > > >> > >>
> > > >> > >
> > > >> > > 我发现我使用 gradlew build 整个项目时会出错,但我单独在 IDEA 里面执行这个 case 是成功的。
> > > >> > >
> > > >> > >> FAILURE   0.3sec,
> org.apache.calcite.test.CalciteSqlOperatorTest
> > >
> > > >> > >> testSplitFunction()
> > > >> > >>     java.lang.AssertionError: Query: values 
> > > >> > >> (SPLIT('h-e-l-l-o',
> > > >> '-')[0])
> > > >> > >>     Expected: is "h"
> > > >> > >>          but: was null
> > > >> > >>         at
> > > >> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.sql.test.ResultCheckers.compareResultSetWi
> > > >> > th
> > > >> > Ma
> > > >> > tc
> > > >> > her
> > > >> > (ResultCheckers.java:252)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.sql.test.ResultCheckers$MatcherResultCheck
> > > >> > er
> > > >> > .c
> > > >> > he
> > > >> > ckR
> > > >> > esult(ResultCheckers.java:321)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.test.SqlOperatorTest$TesterImpl.check(SqlO
> > > >> > pe
> > > >> > ra
> > > >> > to
> > > >> > rTe
> > > >> > st.java:12150)
> > > >> > >>         at
> > > >> > org.apache.calcite.sql.test.SqlTester.check(SqlTester.java:16
> > > >> > 0)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$checkSc
> > > >> > al
> > > >> > ar
> > > >> > $2
> > > >> > (Sq
> > > >> > lOperatorFixtureImpl.java:224)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(Ab
> > > >> > st
> > > >> > ra
> > > >> > ct
> > > >> > Sql
> > > >> > Tester.java:446)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.test.SqlOperatorFixtureImpl.checkScalar(Sq
> > > >> > lO
> > > >> > pe
> > > >> > ra
> > > >> > tor
> > > >> > FixtureImpl.java:223)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(Sq
> > > >> > lO
> > > >> > pe
> > > >> > ra
> > > >> > tor
> > > >> > Fixture.java:238)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.apache.calcite.test.SqlOperatorTest.testSplitFunction(Sql
> > > >> > Op
> > > >> > er
> > > >> > at
> > > >> > orT
> > > >> > est.java:7657)
> > > >> > >>         at
> > > >> > >>
> > > >> > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(
> > > >> > Re
> > > >> > fl
> > > >> > ec
> > > >> > tio
> > > >> > nUtils.java:727)
> > > >> > >>
> > > >> > >
> > > >> > > 我在big-query.iq <http://big-query.iq>  中加入下面这段
> > > >> > >
> > > >> > >> SELECT SPLIT("h,e,l,l,o", ',')[1] as result;
> > > >> > >> +------- +
> > > >> > >> | result |
> > > >> > >> +------- +
> > > >> > >> | e |
> > > >> > >> +------- +
> > > >> > >> (1 row)
> > > >> > >>
> > > >> > >> !ok
> > > >> > >>
> > > >> > >
> > > >> > > 报错如下:
> > > >> > >
> > > >> > >> FAILURE  27.6sec, org.apache.calcite.test.BabelQuidemTest 
> > > >> > >> > test(String)[3], [3] sql/big-query.iq <http://big-query.iq>
> > > >> > >>     org.opentest4j.AssertionFailedError: Files differ:
> > > >> > >> ~/calcite/babel/build/quidem/test/sql/big-query.iq <
> > > http://big-query.iq>
> > > >> > >> ~/calcite/babel/build/resources/test/sql/big-query.iq <
> > > http://big-query.iq>
> > > >> > >>     759c759
> > > >> > >>     < +------- +
> > > >> > >>     ---
> > > >> > >>     > +--------+
> > > >> > >>     761,763c761,763
> > > >> > >>     < +------- +
> > > >> > >>     < | e      |
> > > >> > >>     < +------- +
> > > >> > >>     ---
> > > >> > >>     > +--------+
> > > >> > >>     > | h      |
> > > >> > >>     > +--------+
> > > >> > >>
> > > >> > >>         at
> > > >> > >>
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
> > > >> > >>         at
> > > org.junit.jupiter.api.Assertions.fail(Assertions.java:135)
> > > >> > >>         at
> > > >> > >>
> org.apache.calcite.test.QuidemTest.checkRun(QuidemTest.java:183)
> > > >> > >>         at
> > > >> > >> org.apache.calcite.test.QuidemTest.test(QuidemTest.java:23
> > > >> > >> 6)
> > > >> > >>
> > > >> > >
> > > >> >
> > > >>
> > > >>
> > >
> > >
> >
> >
>
>

Reply via email to