Re: [DISCUSS] UTF-8 validation

2023-07-02 Thread Raphael Taylor-Davies
For better or for worse, the Rust implementation requires that the underlying buffer be valid UTF-8, including null slots, as this allows returning the buffer as a native string type, which in turn allows kernels to use Rust's native string functionality. Whilst I agree the specification is ambiguous on this

Re: [DISCUSS] UTF-8 validation

2023-07-02 Thread Antoine Pitrou
On 02/07/2023 at 14:00, Raphael Taylor-Davies wrote: More an observation than an issue, but UTF-8 validation for StringArray can be done very efficiently by first verifying the entire buffer, and then verifying that the offsets correspond to the start of a UTF-8 codepoint. Caveat: null slots
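
The two-step check described in the quoted message might look roughly like the sketch below. It is only an illustration using the Rust standard library: `validate_utf8_offsets` is a hypothetical helper, not a function from any Arrow implementation, and the `i32` offsets are an assumption modelled on a StringArray with 32-bit offsets. It checks every offset regardless of validity, so it assumes null slots also hold valid UTF-8.

```rust
/// Hypothetical helper (not an Arrow API): validates a string array's
/// values buffer and offsets in two passes.
fn validate_utf8_offsets(values: &[u8], offsets: &[i32]) -> Result<(), String> {
    // Step 1: verify the entire values buffer in one pass.
    let s = std::str::from_utf8(values)
        .map_err(|e| format!("values buffer is not valid UTF-8: {}", e))?;

    // Step 2: verify each offset is non-negative, non-decreasing, and
    // falls on the start of a codepoint (or at the end of the buffer).
    let mut prev = 0usize;
    for (i, &raw) in offsets.iter().enumerate() {
        let off = usize::try_from(raw)
            .map_err(|_| format!("offset {} is negative", i))?;
        if off < prev {
            return Err(format!("offset {} is smaller than the previous offset", i));
        }
        // `is_char_boundary` returns false for indices past the end of `s`.
        if !s.is_char_boundary(off) {
            return Err(format!("offset {} is not on a codepoint boundary", i));
        }
        prev = off;
    }
    Ok(())
}
```

For example, with the values buffer `"héllo"` (six bytes, since `é` takes two), the offsets `[0, 3, 6]` pass, while `[0, 2, 6]` fail because byte 2 lies inside `é`.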