dtenedor commented on PR #50284: URL: https://github.com/apache/spark/pull/50284#issuecomment-2730207770
Hi @dongjoon-hyun, this is a good question. We talked with Jeff Shute, the author of the SQL pipe syntax paper from Google. He mentioned that they went with `|>` because every implementing engine can support it without running into problems in its parser. Specifically, Google uses an LALR parser [1], which makes it impossible to use `|` as the pipe token because of the ambiguity with the bitwise OR operator, so they will not be able to add support for `|`.

There seems to be a growing consensus in the industry that we should all support `|>` as the primary token, while some engines may also choose to support alternative tokens in addition. This blog post [2] describes the situation and mentions how another engine decided to go in this direction.

Please let us know your thoughts on this. We can certainly hold off on merging this PR until we figure out together what we want the plan to be.

[1] https://github.com/google/zetasql/blob/master/bazel/bison.bzl
[2] https://superdb.org/docs/language/pipe-ambiguity/
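To make the ambiguity point concrete, here is a minimal sketch of why a dedicated `|>` token is easy on the parser. It is a toy regex tokenizer, not Spark's or ZetaSQL's actual lexer, and the token names (`PIPE`, `BITOR`, `WORD`, `SYM`) are made up for illustration. Because `|>` is tried before a bare `|`, the pipe operator and bitwise OR come out as distinct token kinds before the grammar ever sees them. With a bare `|` as the pipe token, both uses would produce the same token and the grammar would have to tell them apart.

```python
import re

# Hypothetical token kinds, for illustration only.
# PIPE is listed before BITOR so that "|>" is matched as one token
# instead of "|" followed by ">".
TOKEN_RE = re.compile(
    r"\s*(?:"
    r"(?P<PIPE>\|>)"
    r"|(?P<BITOR>\|)"
    r"|(?P<WORD>[A-Za-z_][A-Za-z_0-9]*|\d+)"
    r"|(?P<SYM>[^\sA-Za-z_0-9])"
    r")"
)


def tokenize(sql):
    """Split a query string into (kind, text) pairs."""
    sql = sql.strip()
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    # The pipe operator and the bitwise OR inside the WHERE expression
    # are reported as different token kinds.
    for kind, text in tokenize("FROM t |> WHERE a | 1 = 3"):
        print(kind, text)
```

In this sketch the lexer alone separates the two meanings, so the grammar never needs extra lookahead to decide whether a `|` continues an expression or starts the next pipe operator.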