I added the following tests to SqlOperatorTest and they passed:
f.checkScalar("split('aaa-bbb-ccc', '-')[1]", "aaa", "VARCHAR");
f.checkNull("split('aaa-bbb-ccc', '-')[0]");
Please note that in Calcite SQL the array indexes start from 1:
https://calcite.apache.org/docs/reference.html#value-constructors
If you try to index out of bounds (with index 0) you will get a null value. In
BigQuery it seems to be illegal to index with an integer
https://cloud.google.com/bigquery/docs/arrays, you seem to need to use either
OFFSET or ORDINAL.
But I don't see any exception, certainly not a parse exception.
I don't understand what you mean by "Calcite does not support split function
without array subscript operators."
Mihai
-----Original Message-----
From: P.F. ZHAN
Sent: Tuesday, August 08, 2023 10:53 AM
To: [email protected]
Subject: Re: [Discussion] Split function has some wrong and strange behaviors
Sorry,
I executed the following four queries in BigQuery, and the results displayed
after the token "=>"
query1. select split('aaa-bbb-ccc', '-')[0] => aaa query2. select
split('aaa-bbb-ccc', '-')[offset(0)] => aaa query3. select split('aaa-bbb-ccc',
'-')[3] => Array index 3 is out of bounds (overflow) query4. select
split('aaa-bbb-ccc', '-')[offset(3)] => Array index 3 is out of bounds
(overflow)
If we use Calcite to parse query1 and query3, it will throw exceptions.
That's to say, Calcite does not support split function without array subscript
operators.
I wonder if we should support this kind of usage of this function?
On Wed, Aug 9, 2023 at 1:23 AM <[email protected]> wrote:
> The mailing list strips images from emails.
>
>
>
> Mihai
>
>
>
> From: P.F. ZHAN
> Sent: Tuesday, August 08, 2023 10:23 AM
> To: [email protected]
> Subject: Re: [Discussion] Split function has some wrong and strange
> behaviors
>
>
>
> Sorry, maybe I didn't describe it very well, which caused you to not
> understand what I meant.
>
> Here, I gives some query results to describe this problem.
>
>
>
>
>
>
>
>
>
>
> Calcite does not support split function without array subscript
> operator, I wonder should we support this kind of usage of this function?
>
>
>
>
>
> The BigQuery doc:
>
>
> https://cloud.google.com/bigquery/docs/reference/standard-sql/operator
> s#array_subscript_operator
>
>
>
>
>
> On Wed, Aug 9, 2023 at 12:43 AM <[email protected]
> <mailto:[email protected]>
> > wrote:
>
> I don' t understand from this message what the problem with split is.
>
> Mihai
>
> -----Original Message-----
> From: P.F. ZHAN
> Sent: Tuesday, August 08, 2023 8:21 AM
> To: [email protected] <mailto:[email protected]>
> Subject: Re: [Discussion] Split function has some wrong and strange
> behaviors
>
> I rechecked Calcite's implementation of the split function today
> again, and I found that this behavior appears to be a by design behavior.
> >
> >
> > org.apache.calcite.rel.rel2sql.RelToSqlConverterTest#testIndexOperat
> > or
> > sBigQuery
>
>
> org.apache.calcite.rel.rel2sql.RelToSqlConverterTest#testIndexWithoutO
> peratorBigQuery
>
>
> I also checked the BigQuery doc which only mentions these standard
> usages in our test case of *testIndexOperatorsBigQuery*, but in fact
> when we use SQL like
>
> > select split('aaa-bbb-ccc', '-')[0]
>
> the results are available in BigQuery and the behavior is the same as
>
> > select split('aaa-bbb-ccc', '-')[ offset(0)]
>
> Maybe Calcite should support this kind of usage rather than treat it
> as a limitation.
>
>
>
> On Tue, Aug 8, 2023 at 10:31 AM P.F. ZHAN <[email protected] <mailto:
> [email protected]> > wrote:
>
> > Thank you mbudiu, Ran Tao and Lake Shen.
> >
> > I have tried your suggestions and debugged the code, this kind of
> > sql seems not support
> >
> >> select split('aaa-bbb-ccc', '-')[1]
> >>
> >
> > I have checked it in both Bigquery and Spark, it works. This looks
> > like a bug at present. Can I open a jira issue to follow this problem?
> >
> >
> > On Tue, Aug 8, 2023 at 1:09 AM <[email protected] <mailto:
> [email protected]> > wrote:
> >
> >> See also this thread:
> >> https://issues.apache.org/jira/browse/CALCITE-5760
> >>
> >> You can run CalciteSqlOperatorTest instead.
> >> If you want to quickly debug only this test in the IDE you can
> >> copy-paste this test into the CalciteSqlOperatorTest class and run
> >> it from there, then delete it when you are done.
> >>
> >> Mihai
> >>
> >> -----Original Message-----
> >> From: Ran Tao
> >> Sent: Monday, August 07, 2023 4:04 AM
> >> To: [email protected] <mailto:[email protected]>
> >> Subject: Re: [Discussion] Split function has some wrong and strange
> >> behaviors
> >>
> >> > I found that when I use "*./gradlew build*" to build the project
> >> > happens
> >> with some errors, but I execute this case in IDEA alone and it is
> >> successful.
> >>
> >> The default `fixture()` does not execute in unit tests (but works
> >> in ./gradlew build), you should use
> >> Fixture.forOperators(true) to execute in unit tests. You can check
> >> this usage, hope it helps you.
> >>
> >>
> >> Best Regards,
> >> Ran Tao
> >> https://github.com/chucheng92
> >>
> >>
> >> P.F. ZHAN <[email protected] <mailto:[email protected]> >
> 于2023年8月7日周一 15:22写道:
> >>
> >> > Sorry, I made a mistake, used Chinese to describe my question.
> >> >
> >> > Hi, When I was testing the split function with a case as follows:
> >> > org.apache.calcite.test.SqlOperatorTest#testSplitFunction
> >> >
> >> > > f.checkScalar("SPLIT('h-e-l-l-o', '-')[0]", "h", "VARCHAR");
> >> > >
> >> >
> >> > I found that when I use "*./gradlew build*" to build the
> >> > project happens with some errors, but I execute this case in IDEA
> >> > alone and it is successful.
> >> >
> >> > FAILURE 0.3sec, org.apache.calcite.test.CalciteSqlOperatorTest >
> >> > > testSplitFunction()
> >> > > java.lang.AssertionError: Query: values (SPLIT('h-e-l-l-o',
> >> '-')[0])
> >> > > Expected: is "h"
> >> > > but: was null
> >> > > at
> >> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
> >> > > at
> >> > >
> >> > org.apache.calcite.sql.test.ResultCheckers.compareResultSetWithMa
> >> > tc
> >> > her
> >> > (ResultCheckers.java:252)
> >> > > at
> >> > >
> >> > org.apache.calcite.sql.test.ResultCheckers$MatcherResultChecker.c
> >> > he
> >> > ckR
> >> > esult(ResultCheckers.java:321)
> >> > > at
> >> > >
> >> > org.apache.calcite.test.SqlOperatorTest$TesterImpl.check(SqlOpera
> >> > to
> >> > rTe
> >> > st.java:12150)
> >> > > at
> >> > org.apache.calcite.sql.test.SqlTester.check(SqlTester.java:160)
> >> > > at
> >> > >
> >> > org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$checkScalar
> >> > $2
> >> > (Sq
> >> > lOperatorFixtureImpl.java:224)
> >> > > at
> >> > >
> >> > org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(Abstra
> >> > ct
> >> > Sql
> >> > Tester.java:446)
> >> > > at
> >> > >
> >> > org.apache.calcite.test.SqlOperatorFixtureImpl.checkScalar(SqlOpe
> >> > ra
> >> > tor
> >> > FixtureImpl.java:223)
> >> > > at
> >> > >
> >> > org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(SqlOpe
> >> > ra
> >> > tor
> >> > Fixture.java:238)
> >> > > at
> >> > >
> >> > org.apache.calcite.test.SqlOperatorTest.testSplitFunction(SqlOper
> >> > at
> >> > orT
> >> > est.java:7657)
> >> > > at
> >> > >
> >> > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(Refl
> >> > ec
> >> > tio
> >> > nUtils.java:727)
> >> > >
> >> >
> >> > I also add a case in the file big-query.iq <http://big-query.iq>
> >> >
> >> > SELECT SPLIT("h,e,l,l,o", ',')[1] as result;
> >> > > +------- +
> >> > > | result |
> >> > > +------- +
> >> > > | e |
> >> > > +------- +
> >> > > (1 row)
> >> > >
> >> > > !ok
> >> > >
> >> >
> >> > The error log is as follows:
> >> >
> >> > > FAILURE 27.6sec, org.apache.calcite.test.BabelQuidemTest >
> >> > > test(String)[3], [3] sql/big-query.iq <http://big-query.iq>
> >> > > org.opentest4j.AssertionFailedError: Files differ:
> >> > > ~/calcite/babel/build/quidem/test/sql/big-query.iq <
> http://big-query.iq>
> >> > > ~/calcite/babel/build/resources/test/sql/big-query.iq <
> http://big-query.iq>
> >> > > 759c759
> >> > > < +------- +
> >> > > ---
> >> > > > +--------+
> >> > > 761,763c761,763
> >> > > < +------- +
> >> > > < | e |
> >> > > < +------- +
> >> > > ---
> >> > > > +--------+
> >> > > > | h |
> >> > > > +--------+
> >> > >
> >> > > at
> >> > > org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
> >> > > at
> org.junit.jupiter.api.Assertions.fail(Assertions.java:135)
> >> > > at
> >> > org.apache.calcite.test.QuidemTest.checkRun(QuidemTest.java:183)
> >> > > at
> >> > > org.apache.calcite.test.QuidemTest.test(QuidemTest.java:236)
> >> > >
> >> > >
> >> > On Mon, Aug 7, 2023 at 3:13 PM P.F. ZHAN <[email protected] <mailto:
> [email protected]> > wrote:
> >> >
> >> > > 大家好。我在测试 split 函数功能的时候,添加了个 case
> >> > > org.apache.calcite.test.SqlOperatorTest#testSplitFunction
> >> > >
> >> > >> f.checkScalar("SPLIT('h-e-l-l-o', '-')[0]", "h", "VARCHAR");
> >> > >>
> >> > >
> >> > > 我发现我使用 gradlew build 整个项目时会出错,但我单独在 IDEA 里面执行这个 case 是成功的。
> >> > >
> >> > >> FAILURE 0.3sec, org.apache.calcite.test.CalciteSqlOperatorTest >
> >> > >> testSplitFunction()
> >> > >> java.lang.AssertionError: Query: values
> >> > >> (SPLIT('h-e-l-l-o',
> >> '-')[0])
> >> > >> Expected: is "h"
> >> > >> but: was null
> >> > >> at
> >> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.sql.test.ResultCheckers.compareResultSetWithMa
> >> > tc
> >> > her
> >> > (ResultCheckers.java:252)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.sql.test.ResultCheckers$MatcherResultChecker.c
> >> > he
> >> > ckR
> >> > esult(ResultCheckers.java:321)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.test.SqlOperatorTest$TesterImpl.check(SqlOpera
> >> > to
> >> > rTe
> >> > st.java:12150)
> >> > >> at
> >> > org.apache.calcite.sql.test.SqlTester.check(SqlTester.java:160)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$checkScalar
> >> > $2
> >> > (Sq
> >> > lOperatorFixtureImpl.java:224)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(Abstra
> >> > ct
> >> > Sql
> >> > Tester.java:446)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.test.SqlOperatorFixtureImpl.checkScalar(SqlOpe
> >> > ra
> >> > tor
> >> > FixtureImpl.java:223)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(SqlOpe
> >> > ra
> >> > tor
> >> > Fixture.java:238)
> >> > >> at
> >> > >>
> >> > org.apache.calcite.test.SqlOperatorTest.testSplitFunction(SqlOper
> >> > at
> >> > orT
> >> > est.java:7657)
> >> > >> at
> >> > >>
> >> > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(Refl
> >> > ec
> >> > tio
> >> > nUtils.java:727)
> >> > >>
> >> > >
> >> > > 我在big-query.iq <http://big-query.iq> 中加入下面这段
> >> > >
> >> > >> SELECT SPLIT("h,e,l,l,o", ',')[1] as result;
> >> > >> +------- +
> >> > >> | result |
> >> > >> +------- +
> >> > >> | e |
> >> > >> +------- +
> >> > >> (1 row)
> >> > >>
> >> > >> !ok
> >> > >>
> >> > >
> >> > > 报错如下:
> >> > >
> >> > >> FAILURE 27.6sec, org.apache.calcite.test.BabelQuidemTest >
> >> > >> test(String)[3], [3] sql/big-query.iq <http://big-query.iq>
> >> > >> org.opentest4j.AssertionFailedError: Files differ:
> >> > >> ~/calcite/babel/build/quidem/test/sql/big-query.iq <
> http://big-query.iq>
> >> > >> ~/calcite/babel/build/resources/test/sql/big-query.iq <
> http://big-query.iq>
> >> > >> 759c759
> >> > >> < +------- +
> >> > >> ---
> >> > >> > +--------+
> >> > >> 761,763c761,763
> >> > >> < +------- +
> >> > >> < | e |
> >> > >> < +------- +
> >> > >> ---
> >> > >> > +--------+
> >> > >> > | h |
> >> > >> > +--------+
> >> > >>
> >> > >> at
> >> > >> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
> >> > >> at
> org.junit.jupiter.api.Assertions.fail(Assertions.java:135)
> >> > >> at
> >> > >> org.apache.calcite.test.QuidemTest.checkRun(QuidemTest.java:183)
> >> > >> at
> >> > >> org.apache.calcite.test.QuidemTest.test(QuidemTest.java:236)
> >> > >>
> >> > >
> >> >
> >>
> >>
>
>