Hi Jark,

Thank you for the helpful suggestion. The E'foo\n' form does look like the more general and widely accepted option. To assess its feasibility, I reviewed the existing Unicode escape support, and it looks like we may need to modify Parser.jj to accept the new syntax.
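
To make the intended semantics concrete, here is a minimal Python sketch of how an E'...' literal could be decoded while regular literals keep their backslashes. It is only an illustration under my own assumptions: decode_literal and decode_escape_body are hypothetical names, not Flink or Calcite APIs, and only a subset of the PostgreSQL escapes (\b, \f, \n, \r, \t, plus literal pass-through of any other escaped character) is handled. Octal, hex, and Unicode escapes are left out.

```python
# Hypothetical helpers, not Flink or Calcite code: they decode a SQL string
# literal using a subset of the PostgreSQL backslash escapes.
SIMPLE_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t"}


def decode_escape_body(body: str) -> str:
    """Decode the text between the quotes of an E'...' literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        has_next = i + 1 < len(body)
        if ch == "\\" and has_next:
            nxt = body[i + 1]
            # Known escapes map to control characters; any other escaped
            # character (including \\ and \') is taken literally.
            out.append(SIMPLE_ESCAPES.get(nxt, nxt))
            i += 2
        elif ch == "'" and has_next and body[i + 1] == "'":
            # A doubled single quote stands for one quote character.
            out.append("'")
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def decode_literal(literal: str) -> str:
    """Return the value of a regular '...' or an escape E'...' literal."""
    if len(literal) >= 3 and literal[:2] in ("E'", "e'") and literal.endswith("'"):
        return decode_escape_body(literal[2:-1])
    if len(literal) >= 2 and literal.startswith("'") and literal.endswith("'"):
        # Regular literal: backslashes are kept; only '' is unescaped.
        return literal[1:-1].replace("''", "'")
    raise ValueError(f"not a string literal: {literal!r}")
```

With this sketch, decode_literal(r"E'foo\n'") yields foo followed by a real newline, while decode_literal(r"'foo\n'") still yields the backslash and the letter n, so existing queries would keep their current behavior.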

I am not sure whether this change should go into Calcite first, or whether we can override the StringLiteral behavior directly in the Flink project. Either way, I believe supporting it is achievable.

Thanks,
Aitozi

Jark Wu <imj...@gmail.com> wrote on Mon, Mar 6, 2023 at 10:16:

> Hi Aitozi,
>
> I think this is a good idea to improve the backslash escape strings.
> However, I lean a bit more toward the Postgres approach[1],
> which is more standard-compliant. PG allows backslash escape
> strings by writing the letter E (upper or lower case) just before the
> opening single quote, e.g., E'foo\n'.
>
> Recognizing backslash escapes in both regular and escape string constants
> is not backward compatible in Flink, and is also deprecated in PG.
>
> In addition, Flink also supports Unicode escape string constants by
> writing U& before the quote[2], which works in the same way as
> backslash escape strings.
>
> Best,
> Jark
>
> [1]:
> https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
> [2]:
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/overview/
>
> On Sat, 4 Mar 2023 at 23:31, Aitozi <gjying1...@gmail.com> wrote:
>
> > Hi,
> > I encountered a problem when using string literals in Flink. Currently,
> > Flink will escape the string literal during codegen, so for the query
> > below:
> >
> > SELECT 'a\nb'; it will print => a\nb
> >
> > then for the query
> >
> > SELECT SPLIT_INDEX(col, '\n', 0);
> >
> > col cannot be split by the newline. If we want to split by the newline,
> > we should use
> >
> > SELECT SPLIT_INDEX(col, '
> > ', 0)
> >
> > or
> >
> > SELECT SPLIT_INDEX(col, CHR(10), 0)
> >
> > The above approaches could be more intuitive. Some other databases
> > support these "Special Character Escape Sequences"[1].
> >
> > In this way, we can directly use
> > SELECT SPLIT_INDEX(col, '\n', 0); for the query.
> >
> > I know this is not standard behavior in ANSI SQL. I'm opening this thread
> > for some opinions from the community.
> >
> > [1]:
> > https://dev.mysql.com/doc/refman/8.0/en/string-literals.html#character-escape-sequences
> >
> > Thanks,
> > Aitozi