> Strings, after construction, are immutable but may be constructed from mutable arrays of bytes, characters, or integers.
> The string constructors should guard against the effects of mutating the arrays during construction that might invalidate internal invariants for the correct behavior of operations on the resulting strings. In particular, a number of operations have optimizations for operations on pairs of latin1 strings and pairs of non-latin1 strings, while operations between latin1 and non-latin1 strings use a more general implementation.
>
> The changes include:
>
> - Adding a warning to each constructor with an array as an argument to indicate that the results are indeterminate
>   if the input array is modified before the constructor returns.
>   The resulting string may contain any combination of characters sampled from the input array.
>
> - Ensure that strings that are represented as non-latin1 contain at least one non-latin1 character.
>   For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF-8, or another encoding decoded to latin1, the scanning and compression is unchanged.
>   If a non-latin1 character is found, the string is represented as non-latin1 with the added verification that a non-latin1 character is present at the same index.
>   If that character is found to be latin1, then the input array has been modified and the result of the scan may be incorrect.
>   Though a ConcurrentModificationException could be thrown, the risk to an existing application of an unexpected exception should be avoided.
>   Instead, the non-latin1 copy of the input is re-scanned and compressed; that scan determines whether the latin1 or the non-latin1 representation is returned.
>
> - The methods that scan for non-latin1 characters and their intrinsic implementations are updated to return the index of the non-latin1 character.
>
> - String construction from StringBuilder and CharSequence must also be guarded as their contents may be modified during construction.
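
The scan, verify, and re-scan steps in the quoted description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the pull request: the class `RacyConstructionSketch`, the `Repr` holder, and the `raceHook` parameter are hypothetical names, and the hook exists solely to simulate another thread modifying the input array at the critical moment.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these names come from the JDK sources.
final class RacyConstructionSketch {

    // Stand-in for a string's internal state: either latin1 bytes or UTF-16 chars.
    static final class Repr {
        final boolean latin1;
        final byte[] latin1Bytes; // non-null only when latin1 is true
        final char[] utf16Chars;  // non-null only when latin1 is false

        Repr(boolean latin1, byte[] latin1Bytes, char[] utf16Chars) {
            this.latin1 = latin1;
            this.latin1Bytes = latin1Bytes;
            this.utf16Chars = utf16Chars;
        }
    }

    // Copies chars into dst until a non-latin1 char is seen.
    // Returns the index of the first non-latin1 char, or len if there is none.
    // Each element of src is read exactly once.
    static int compress(char[] src, byte[] dst, int len) {
        for (int i = 0; i < len; i++) {
            char c = src[i];
            if (c > 0xFF) {
                return i;
            }
            dst[i] = (byte) c;
        }
        return len;
    }

    // raceHook runs between the first scan and the private copy; it only
    // simulates a concurrent writer so the behavior can be shown deterministically.
    static Repr construct(char[] input, Runnable raceHook) {
        int len = input.length;
        byte[] latin1 = new byte[len];
        int ndx = compress(input, latin1, len);
        if (ndx == len) {
            return new Repr(true, latin1, null); // every char fit in latin1
        }
        raceHook.run();
        char[] copy = Arrays.copyOf(input, len); // private snapshot, no longer shared
        if (copy[ndx] > 0xFF) {
            return new Repr(false, null, copy);  // non-latin1 char confirmed at ndx
        }
        // The input changed after the scan: re-scan the private copy instead of throwing.
        if (compress(copy, latin1, len) == len) {
            return new Repr(true, latin1, null);
        }
        return new Repr(false, null, copy);
    }

    public static void main(String[] args) {
        char[] input = {'a', '\u0100', 'b'};
        Repr r = construct(input, () -> input[1] = 'c');
        System.out.println("latin1 after simulated race: " + r.latin1);
    }
}
```

In this sketch the non-latin1 result is returned only after a non-latin1 char has been seen in the private copy, which is the invariant the description asks for. A simulated race yields a latin1 result rather than an exception.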

Roger Riggs has updated the pull request incrementally with two additional commits since the last revision:

 - code and doc cleanup in StringRacyConstructor test
 - Update of string_compress for the s390 port to return the index of the non-latin1 char.
   Contributed by Amit Kumar.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16425/files
  - new: https://git.openjdk.org/jdk/pull/16425/files/ad73a2a6..f6080595

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16425&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16425&range=01-02

  Stats: 11 lines in 2 files changed: 0 ins; 1 del; 10 mod
  Patch: https://git.openjdk.org/jdk/pull/16425.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16425/head:pull/16425

PR: https://git.openjdk.org/jdk/pull/16425