> UTF-8 can’t encode all UNICODE characters.

Nikolay, could you please elaborate? My understanding is that the
encoding we are talking about matters for conversion between byte arrays
and strings. Does Java String support all Unicode characters, and in
particular does it support more characters than UTF-8 can encode (I am
not claiming here that Java String uses UTF-8)?
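
To make the question concrete, here is a minimal sketch of the conversions in question. The class and method names are hypothetical and are not taken from the Ignite code base; the only assumption is the standard `String#getBytes(Charset)` and `new String(byte[], Charset)` behaviour. It round-trips a string through bytes with matching charsets, with mismatched charsets, and finally with a lone surrogate, which a Java String can hold even though it is not a valid Unicode character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    /**
     * Encodes a string with one charset and decodes the bytes with another,
     * roughly what happens when two nodes have different default encodings.
     */
    public static String roundTrip(String s, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = s.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Cyrillic "Privet" followed by U+1F600, a supplementary code point
        // that a Java String stores as a surrogate pair.
        String text = "\u041F\u0440\u0438\u0432\u0435\u0442 \uD83D\uDE00";

        // Same explicit charset on both sides: the string survives.
        System.out.println(text.equals(
            roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));      // true

        // Different charsets on the two sides: the string is corrupted.
        System.out.println(text.equals(
            roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1))); // false

        // A lone high surrogate is a legal Java char sequence but cannot be
        // encoded in UTF-8; getBytes substitutes a replacement byte instead.
        String lone = "\uD83D";
        System.out.println(lone.equals(
            roundTrip(lone, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));      // false
    }
}
```

If this sketch is right, the only Java String content that UTF-8 loses is malformed UTF-16 such as unpaired surrogates, not any real Unicode character.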

2021-12-13 12:56 GMT+03:00, Ivan Daschinsky <ivanda...@gmail.com>:
> UTF-8 is already the default encoding in our BinaryObject format, so I am
> for unification.
>
> Mon, 13 Dec 2021 at 12:50, Nikolay Izhikov <nizhi...@apache.org>:
>
>> Hello, Ivan.
>>
>> UTF-8 can’t encode all UNICODE characters.
>>
>> > On 13 Dec 2021, at 12:49, Ivan Daschinsky <ivanda...@gmail.com>
>> wrote:
>> >
>> > Hm, maybe a better option is to enforce that all strings are encoded in
>> > UTF-8?
>> > AFAIK a multi-OS cluster is quite a common case.
>> >
>> >
>> > Mon, 13 Dec 2021 at 11:36, Mikhail Petrov <pmgheap....@gmail.com>:
>> >
>> >> Igniters,
>> >>
>> >> Recently we faced the problem that many issues arise if a cluster
>> >> consists of nodes running in JVMs with different default encodings.
>> >> The root cause of these issues is components that use
>> >> `String#getBytes()` and `new String(<byte array>)`, which rely on
>> >> the system default encoding. Thus, if a string is deserialized on a
>> >> node with a different encoding from the one that serialized it, the
>> >> deserialized string can differ from the original one.
>> >>
>> >> For example:
>> >>
>> >> Serialization/deserialization of strings in communication messages
>> >> may be broken for some strings on nodes running in JVMs with
>> >> different encodings, because DirectByteBufferStreamImplV2 uses
>> >> String#getBytes() to serialize strings - [1]
>> >>
>> >> Or the IgniteAuthenticationProcessor can compute different security
>> >> IDs for the user on different nodes in this case - [2]
>> >>
>> >> What do you think about solving this problem globally by refusing
>> >> to let nodes join if they run on JVMs with different encodings?
>> >>
>> >> As a result, we will be sure that all cluster nodes have the same
>> >> encoding and all related problems will be solved.
>> >>
>> >> [1] - https://issues.apache.org/jira/browse/IGNITE-16106
>> >> [2] - https://issues.apache.org/jira/browse/IGNITE-16068
>> >>
>> >> --
>> >> Mikhail
>> >>
>> >>
>> >
>> > --
>> > Sincerely yours, Ivan Daschinskiy
>>
>>
>
> --
> Sincerely yours, Ivan Daschinskiy
>


-- 

Best regards,
Ivan Pavlukhin
