Unicode, of course. Good point. Still, however the text might be amended, "Each character in the expanded value of parameter is tested against pattern" reads to my ear as referring to alphabetic characters, as per sentence one, however they happen to be encoded. It also makes some sense that a parameter expansion (PE) for case modification might limit itself to byte sequences that resolve to alphabetic characters, including Unicode ones.
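To make the question concrete, here is a minimal sketch of what bash does with a leading non-alphabetic character. The variable names are only for illustration, the sample strings are the ones from the quoted message further down, and it assumes bash 4.0 or later, where the ^ and ^^ expansions exist.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; assumes bash 4.0+ for the ^ and ^^ expansions.

with_prefix='./scripts_install-ups'
no_prefix='scripts_install-ups'

# ^ tests only the first character. Here that is ".", which does not
# match the pattern "s", so the value comes back unchanged.
first_dot=${with_prefix^s}

# Here the first character is "s", so it is uppercased.
first_s=${no_prefix^s}

# ^^ tests every character against the pattern, so each "s" is uppercased.
all_s=${no_prefix^^s}

printf '%s\n' "$first_dot" "$first_s" "$all_s"
# ./scripts_install-ups
# Scripts_install-ups
# ScriptS_inStall-upS
```

So with ^, bash does not go looking for the first alphabetic character. It tests whatever sits at position zero and stops there.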
But sentence six is a real concern, if you ask me, both for how "character" reads as "alphabetic character" and for how sentences five and six can be read as contradictory.

So ${foo^x} is a PE dedicated to case modification that tests the first character of the string that $foo expands into, whatever type of character that might be. But why would it not alter the first alphabetic character if the first character in the string is not alphabetic? That the first character in a string will be alphabetic is an assumption, one that seems to be in use. Neither reading - that the first character is assumed to be alphabetic, or that the algorithm looks for the first alphabetic character - is clearly stated. That ambiguity could be resolved. "First character in the expanded value" means something like "the group of bytes that prints at place zero, counting from the beginning of the entire byte string" that results from the expansion of 'parameter.'

Wiley

On Thu, Jan 16, 2025 at 7:24 PM Lawrence Velázquez <v...@larryv.me> wrote:

> On Thu, Jan 16, 2025, at 8:05 PM, Wiley Young wrote:
> > In Parameter Expansions / Case Modification:
> >
> > [S1]
> > The first sentence reads, "alphabetic characters," which can imply that
> > wherever else in the paragraph the word "characters" is used, that the
> > intended meaning is "alphabetic characters," which is not the case. It can
> > be read as something like a declaration of a specific conversation topic.
> > This multiplicity of meaning in a single word, "characters," can be thought
> > of as similar to a namespace collision.
>
> It's never once occurred to me to read it this way. I think you're
> overstating the potential confusion here.
>
>
> > [S3]
> > The third sentence makes no distinction between alphabetic characters and
> > ASCII characters, although in practice such a distinction exists.
> >
> >     $ yy='./scripts_install-ups'
> >     $ echo ${yy^s}
> >     ./scripts_install-ups
> >     $ yy='scripts_install-ups'
> >     $ echo ${yy^s}
> >     Scripts_install-ups
> >
> > The third sentence could read more clearly with the inclusion of one
> > word: "ASCII" (changes are underlined).
> >
> > Before:
> >   3) Each character in the expanded value of parameter is tested against
> >   pattern, and, if it matches the pattern, its case is converted.
> > After:
> >   3) Each _ASCII_ character in the expanded value of parameter is tested
> >   against pattern, and, if it matches the pattern, its case is converted.
> >
> > Without the inclusion of the word "ASCII," sentence three (as well as
> > sentence six, below) can easily be read incorrectly as applying only to
> > "alphabetic characters" as a result of the implied topic declaration in
> > sentence one.
>
> This suggestion is far more misleading than the existing text. It
> implies that non-ASCII characters are skipped, but they are not.
>
>     $ LC_ALL=en_US.UTF-8
>     $ str=$'err\u00F3neo'
>     $ echo "$str" "${str^^}"
>     erróneo ERRÓNEO
>
>
> > [S5]
> > Sentence five speaks of "the ^ operator" and "the , operator" - however
> > this use of English language is again unclear. There are four meaningful
> > syntactical forms involved in this Parameter Expansion:
> >
> >     , ^ ,, ^^
> >
> > They are composed of varying combinations of two raw constituent ASCII
> > characters:
> >
> >     , ^ (\x2c and \x5e)
> >
> > There's an overlap, when the text reads, "the ^ operator" it can be read
> > as referring to either a specific and meaningful syntactical form, (^), or
> > a raw constituent ASCII character, (^). It appears that the latter meaning
> > is intended. Otherwise, the descriptions would contradict each other of the
> > single-character PE operators in both sentence five and the second half of
> > sentence six. Elsewhere in the manual the word "operator" is also used.
> > From a partial search, it appears that the word tends to be used much more
> > often to refer to specific meaningful syntactical forms rather than raw
> > ASCII characters.
> >
> >     declare -a control_operators=( [0]="||" [1]="&" [2]="&&" )
> >     declare -a parameter_transformation_operators=( [0]="U" [1]="u" [2]="L" )
> >
> > Sentence five could read more clearly as (changes underlined):
> >
> > Before:
> >   5) The ^ operator converts lowercase letters matching pattern to
> >   uppercase; the , operator converts matching uppercase letters to lowercase.
> >
> > After:
> >   5) The _operators_ _containing_ _ASCII_ _^_ _characters_ _convert_
> >   lowercase letters matching pattern to uppercase, and _operators_
> >   _containing_ _ASCII_ _,_ _characters_ convert matching uppercase letters to
> >   lowercase.
>
> I think this suggestion is pretty clumsy, but I do agree that it's
> a little confusing to read about the "^ operator" and the ", operator"
> but also the "^^ and ,, expansions" and the "^ and , expansions".
>
>
> > [S6]
> > Sentence six could benefit from a change similar to that of sentence
> > three.
> >
> > Before:
> >   6) The ^^ and ,, expansions convert each matched character in the
> >   expanded value; the ^ and , expansions match and convert only the first
> >   character in the expanded value.
> > After:
> >   6) The ^^ and ,, expansions convert each matched character in the
> >   expanded value; the ^ and , expansions match and convert only the first
> >   _ASCII_ character in the expanded value.
>
> This suggestion is invalid for the same reason as the first one.
>
>
> --
> vq