alexander-beedie opened a new pull request, #1655:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1655

   Supports the `IS [NOT] [form] NORMALIZED -> bool` syntax:
   
   Details from the PostgreSQL string function docs:
   https://www.postgresql.org/docs/current/functions-string.html
   ```
   Checks whether the string is in the specified Unicode normalization
   form. The optional form key word specifies the form: NFC (the default),
   NFD, NFKC, or NFKD. This expression can only be used when the server
   encoding is UTF8. Note that checking for normalization using this
   expression is often faster than normalizing possibly already
   normalized strings.
   ```
   
   * NFC: Canonical Decomposition, followed by Canonical Composition.
   * NFD: Canonical Decomposition.
   * NFKC: Compatibility Decomposition, followed by Canonical Composition.
   * NFKD: Compatibility Decomposition.
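
To make the difference between the forms concrete, here is a small standard-library-only sketch. It does not normalize anything; it only shows that the composed (NFC) and decomposed (NFD) spellings of "é" are different code point sequences. The helper name `code_points` is made up for this illustration.

```rust
/// Illustrative helper (hypothetical name): list the Unicode scalar values of a string.
pub fn code_points(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32).collect()
}

/// "é" as a single precomposed code point, U+00E9 (the NFC spelling).
pub const COMPOSED: &str = "\u{00E9}";

/// "e" followed by U+0301 COMBINING ACUTE ACCENT (the NFD spelling).
pub const DECOMPOSED: &str = "e\u{0301}";

/// The two spellings render identically but do not compare equal as strings,
/// which is why a check such as `IS NFC NORMALIZED` can be useful.
pub fn same_bytes(a: &str, b: &str) -> bool {
    a == b
}
```

With these definitions, `code_points(COMPOSED)` is a single value while `code_points(DECOMPOSED)` has two, and `same_bytes(COMPOSED, DECOMPOSED)` is false.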
   
   As there are only these four normalization forms, it seemed reasonable to return the specified one as a new `NormalizationForm` enum. This helps callers: they are not responsible for normalizing the case of the form keyword and can go straight to a constrained match block.
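
A minimal sketch of what that could look like from the caller's side. The enum below is a stand-alone mirror written for this example, so the PR's actual definition (derives, variant names, module path) may differ. The helpers `from_keyword` and `describe` are hypothetical and not part of the PR.

```rust
/// Stand-alone mirror of the enum described above (assumed shape, not the PR's code).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NormalizationForm {
    NFC,
    NFD,
    NFKC,
    NFKD,
}

impl NormalizationForm {
    /// Hypothetical helper: case-insensitive keyword lookup, the step a parser
    /// would do once so that callers never handle raw strings.
    pub fn from_keyword(word: &str) -> Option<Self> {
        match word.to_ascii_uppercase().as_str() {
            "NFC" => Some(Self::NFC),
            "NFD" => Some(Self::NFD),
            "NFKC" => Some(Self::NFKC),
            "NFKD" => Some(Self::NFKD),
            _ => None,
        }
    }
}

/// Caller side: an exhaustive match with no string handling and no catch-all arm.
pub fn describe(form: NormalizationForm) -> &'static str {
    match form {
        NormalizationForm::NFC => "canonical decomposition, then canonical composition",
        NormalizationForm::NFD => "canonical decomposition",
        NormalizationForm::NFKC => "compatibility decomposition, then canonical composition",
        NormalizationForm::NFKD => "compatibility decomposition",
    }
}
```

Because the match is exhaustive, the compiler would flag any caller that forgets one of the four forms.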
   
   We would use this in the Polars SQL interface to add support for this syntax.
   
   ## Examples
   ```sql
   strcol IS NORMALIZED
   strcol IS NOT NORMALIZED
   ```
   ```sql
   strcol IS NFKC NORMALIZED
   strcol IS NOT NFKD NORMALIZED
   ```
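
The shape of the syntax in these examples can be sketched with a toy recognizer over whitespace-separated words. This is not how sqlparser-rs tokenizes or parses, and the names `IsNormalized` and `parse_is_normalized` are invented for the illustration. It only handles a single-word operand followed by `IS [NOT] [form] NORMALIZED`.

```rust
/// Hypothetical result type for this sketch only.
#[derive(Debug, PartialEq, Eq)]
pub struct IsNormalized {
    pub operand: String,
    pub negated: bool,
    /// Upper-cased form keyword; `None` means the form was omitted
    /// (PostgreSQL then treats it as NFC).
    pub form: Option<String>,
}

/// Recognize `<operand> IS [NOT] [NFC|NFD|NFKC|NFKD] NORMALIZED`.
pub fn parse_is_normalized(input: &str) -> Option<IsNormalized> {
    let mut words = input.split_whitespace();
    // Keep the operand's original case; keywords are compared case-insensitively.
    let operand = words.next()?.to_string();
    let upper: Vec<String> = words.map(|w| w.to_ascii_uppercase()).collect();
    let mut rest = upper.iter().map(String::as_str).peekable();

    if rest.next()? != "IS" {
        return None;
    }
    // Optional NOT.
    let negated = rest.next_if_eq(&"NOT").is_some();
    // Optional normalization form keyword.
    let form = rest
        .next_if(|w| matches!(*w, "NFC" | "NFD" | "NFKC" | "NFKD"))
        .map(String::from);
    // Mandatory NORMALIZED, and nothing may follow it.
    if rest.next()? != "NORMALIZED" || rest.next().is_some() {
        return None;
    }
    Some(IsNormalized { operand, negated, form })
}
```

An unknown form keyword is rejected because the word after `IS [NOT]` must be either one of the four forms or `NORMALIZED` itself.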
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

