Wes McKinney created ARROW-3536:
-----------------------------------
Summary: [C++] Fast UTF8 validation functions
Key: ARROW-3536
URL: https://issues.apache.org/jira/browse/ARROW-3536
Project: Apache Arrow
Issue Type: New Feature
Components: C++
Reporter: Wes McKinney
Fix For: 0.13.0
[~lemire] discusses this topic in
https://lemire.me/blog/2018/05/16/validating-utf-8-strings-using-as-little-as-0-7-cycles-per-byte/
There is also a Java edition:
https://lemire.me/blog/2018/10/16/validating-utf-8-bytes-java-edition/
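For reference, a scalar baseline for what "validation" means here might look like the sketch below. It is not the vectorized approach from the posts above and not an existing Arrow API; the name ValidateUtf8Scalar is hypothetical. It only illustrates the rules a fast version would also have to enforce: well-formed lead and continuation bytes, no overlong encodings, no surrogates, and nothing above U+10FFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not an Arrow API): byte-at-a-time UTF-8 validation.
// Returns true if [data, data + size) is well-formed UTF-8.
bool ValidateUtf8Scalar(const std::uint8_t* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  std::size_t i = 0;
  while (i < size) {
    const std::uint8_t b0 = data[i];
    if (b0 < 0x80) {  // ASCII fast path
      ++i;
      continue;
    }
    std::size_t len = 0;       // total bytes in this sequence
    std::uint32_t cp = 0;      // decoded code point
    std::uint32_t min_cp = 0;  // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0) {
      len = 2; cp = b0 & 0x1F; min_cp = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
      len = 3; cp = b0 & 0x0F; min_cp = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
      len = 4; cp = b0 & 0x07; min_cp = 0x10000;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (size - i < len) return false;  // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
      const std::uint8_t b = data[i + k];
      if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min_cp) return false;                   // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
    i += len;
  }
  return true;
}

// Convenience overload for std::string input.
bool ValidateUtf8Scalar(const std::string& s) {
  return ValidateUtf8Scalar(reinterpret_cast<const std::uint8_t*>(s.data()),
                            s.size());
}
```

A faster implementation could be checked against a baseline like this one on both valid and deliberately malformed inputs.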