Looks fine, except that processing all Unicode whitespace characters might
add overhead to parsing and could affect performance.
That said, I think this is a moot point.
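
To make the difference concrete, here is a minimal Python sketch of what treating Unicode whitespace as a token separator changes. It is not Spark's lexer and not the code in the PR. The ASCII set used here (space, tab, CR, LF) and the helper name split_tokens are assumptions for illustration only.

```python
import re

# Assumption: the "small set of ASCII characters" is space, tab, CR and LF.
ASCII_WS = re.compile(r"[ \t\r\n]+")

# For str patterns in Python 3, \s also matches Unicode whitespace such as
# U+00A0 (no-break space), U+2003 (em space) and U+3000 (ideographic space).
UNICODE_WS = re.compile(r"\s+")


def split_tokens(text, separator):
    """Split text on the given separator pattern, dropping empty pieces."""
    return [tok for tok in separator.split(text) if tok]


# A query whose separators are all non-ASCII whitespace characters.
query = "SELECT\u00a01\u2003AS\u3000x"

ascii_tokens = split_tokens(query, ASCII_WS)      # stays one single token
unicode_tokens = split_tokens(query, UNICODE_WS)  # four separate tokens
```

With the ASCII-only separator the whole string stays a single token, while the Unicode-aware separator yields SELECT, 1, AS and x. Either way the check is a per-character class test, which is consistent with the overhead being small.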
+1
Mich Talebzadeh,
Technologist | Solutions Architect | Data Engineer | Generative AI
London
United Kingdom
+1, this is a reasonable change.
Gengliang
On Wed, Mar 27, 2024 at 9:54 AM serge rielau.com wrote:
> Going once, going twice, …. last call for objections
> On Mar 23, 2024 at 5:29 PM -0700, serge rielau.com ,
> wrote:
>
> Hello,
>
> I have a PR https://github.com/apache/spark/pull/45620 ready
Going once, going twice, …. last call for objections
On Mar 23, 2024 at 5:29 PM -0700, serge rielau.com , wrote:
Hello,
I have a PR https://github.com/apache/spark/pull/45620 ready to go that will
extend the definition of whitespace (what separates tokens) from the small set
of ASCII characters
Yeah, I heard about that. IMHO this is a bit more worrying, and we do not have
the "excuse" that it is transparent.
Also, which of these would be STRING and which IDENTIFIER?
On Mar 25, 2024 at 1:06 PM -0700, Alex Cruise , wrote:
While we're at it, maybe consider allowing "smart quotes" too :)
-0
While we're at it, maybe consider allowing "smart quotes" too :)
-0xe1a
On Sat, Mar 23, 2024 at 5:29 PM serge rielau.com wrote:
> Hello,
>
> I have a PR https://github.com/apache/spark/pull/45620 ready to go that
> will extend the definition of whitespace (what separates tokens) from the
> smal