Hi again,

On 30.09.2011 00:27, Xueming Shen wrote:
On 09/29/2011 02:16 PM, Ulf Zibis wrote:

 280                     if (Character.isSurrogate(c))
 281                         return malformedForLength(src, sp, dst, dp, 3);
Shouldn't we return cr.length() = 1 to allow the remaining 2 bytes to be
interpreted again?

Forget it! If c is a surrogate, b2 is in the range A0..BF and b3 is in the range 80..BF. Neither can be well-formed as a first byte.
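
To double-check the byte ranges above, here is a minimal sketch that splits every surrogate code unit into the 3-byte pattern 1110xxxx 10xxxxxx 10xxxxxx. The class and method names are made up for illustration and are not taken from the webrev.

```java
// Hypothetical helper, not JDK code: shows which bytes a surrogate code unit
// would occupy if it were pushed through the 3-byte UTF-8 pattern.
class SurrogateBytes {
    // 1110xxxx: top 4 bits of the 16-bit value
    static int firstByte(char c) {
        return 0xE0 | ((c >> 12) & 0x0F);
    }

    // 10xxxxxx: middle 6 bits
    static int secondByte(char c) {
        return 0x80 | ((c >> 6) & 0x3F);
    }

    // 10xxxxxx: low 6 bits
    static int thirdByte(char c) {
        return 0x80 | (c & 0x3F);
    }

    // A byte of the form 10xxxxxx (80..BF) can only continue a sequence,
    // it can never start one.
    static boolean isContinuation(int b) {
        return (b & 0xC0) == 0x80;
    }

    public static void main(String[] args) {
        for (int i = Character.MIN_SURROGATE; i <= Character.MAX_SURROGATE; i++) {
            char c = (char) i;
            int b1 = firstByte(c);
            int b2 = secondByte(c);
            int b3 = thirdByte(c);
            if (b1 != 0xED || b2 < 0xA0 || b2 > 0xBF || !isContinuation(b3)) {
                throw new AssertionError("unexpected bytes for U+" + Integer.toHexString(i));
            }
        }
        System.out.println("checked all surrogates without finding a counterexample");
    }
}
```

Since b2 and b3 both have the shape 10xxxxxx, re-reading them as lead bytes after a length-1 result could only report them as malformed again.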


Actually I don't know the answer. My reading of D93a/D93b suggests that we
might interpret it as a whole, given that the bytes are actually in the
well-formed byte pattern range listed in Table 3.7, but are "ill-formed"
simply because they encode a surrogate value, not a scalar value, so I would
interpret the whole 3 bytes as a maximal subpart. Given that D93a/b is "best
practices for using U+FFFD", either way should be fine. We do have Unicode
experts on the list, so maybe they can share their opinion on what the
"desired"/recommended behavior is in this case, from the Standard's point of
view?

At line 102 you could insert:
        //  [E0]     [A0..BF]
        //  [E1..EF] [80..BF]
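
Spelled out as code, the two comment lines describe the following second-byte check for a 3-byte sequence. This is only a sketch of what the comment documents, with a hypothetical method name, and not the actual code at that line.

```java
// Hypothetical illustration of the ranges in the proposed comment, not JDK code.
class ThreeByteSecondByteCheck {
    // b1 and b2 are unsigned byte values (0..255).
    // Returns true if b2 is an accepted second byte after the 3-byte lead b1.
    static boolean isAcceptedSecondByte(int b1, int b2) {
        if (b1 == 0xE0) {
            // [E0] [A0..BF]: 80..9F would be an overlong encoding
            return b2 >= 0xA0 && b2 <= 0xBF;
        }
        if (b1 >= 0xE1 && b1 <= 0xEF) {
            // [E1..EF] [80..BF]
            return b2 >= 0x80 && b2 <= 0xBF;
        }
        return false; // b1 is not a 3-byte lead
    }

    public static void main(String[] args) {
        System.out.println(isAcceptedSecondByte(0xE0, 0x9F)); // overlong form
        System.out.println(isAcceptedSecondByte(0xE0, 0xA0));
        System.out.println(isAcceptedSecondByte(0xED, 0xA0)); // surrogate range
    }
}
```

Note that these ranges still let [ED] [A0..BF] through, so the surrogate case is only caught by the later Character.isSurrogate(c) test.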

-Ulf
