On Jan 28 18:07:55, tomas.ri...@tuta.com wrote:
> Hi and thank to all of you responding.
>
> My explanation and understanding:
>
> myfile.txt example:
> Hello guys
>  <= one space here
> Anybody  from Europe? <= two spaces between Anybody and from

No, that's not what's in the input. This is actually in your input:

    48    U+000048  H  LATIN CAPITAL LETTER H
    65    U+000065  e  LATIN SMALL LETTER E
    6c    U+00006c  l  LATIN SMALL LETTER L
    6c    U+00006c  l  LATIN SMALL LETTER L
    6f    U+00006f  o  LATIN SMALL LETTER O
    20    U+000020     SPACE
    67    U+000067  g  LATIN SMALL LETTER G
    75    U+000075  u  LATIN SMALL LETTER U
    79    U+000079  y  LATIN SMALL LETTER Y
    73    U+000073  s  LATIN SMALL LETTER S
    0a    U+00000a     LINE FEED (LF)
    c2a0  U+0000a0     NON-BREAKING SPACE
    0a    U+00000a     LINE FEED (LF)
    41    U+000041  A  LATIN CAPITAL LETTER A
    6e    U+00006e  n  LATIN SMALL LETTER N
    79    U+000079  y  LATIN SMALL LETTER Y
    62    U+000062  b  LATIN SMALL LETTER B
    6f    U+00006f  o  LATIN SMALL LETTER O
    64    U+000064  d  LATIN SMALL LETTER D
    79    U+000079  y  LATIN SMALL LETTER Y
    c2a0  U+0000a0     NON-BREAKING SPACE
    20    U+000020     SPACE
    66    U+000066  f  LATIN SMALL LETTER F
    72    U+000072  r  LATIN SMALL LETTER R
    6f    U+00006f  o  LATIN SMALL LETTER O
    6d    U+00006d  m  LATIN SMALL LETTER M
    20    U+000020     SPACE
    45    U+000045  E  LATIN CAPITAL LETTER E
    75    U+000075  u  LATIN SMALL LETTER U
    72    U+000072  r  LATIN SMALL LETTER R
    6f    U+00006f  o  LATIN SMALL LETTER O
    70    U+000070  p  LATIN SMALL LETTER P
    65    U+000065  e  LATIN SMALL LETTER E
    3f    U+00003f  ?  QUESTION MARK
    c2a0  U+0000a0     NON-BREAKING SPACE
    0a    U+00000a     LINE FEED (LF)

> tr -c "[:alpha:]" "\n" < myfile.txt
>
> myfile.txt is INPUT
> tr finds the complement to :alpha: and replaces them by \n,
> resulting in:
> START-OF-FILEmyfile
> txt

Nothing like that is in the input you show.
On the other hand, the "Hello guys" just disappeared?
You are _not_ showing the actual run and the actual result.

If you are new to unix, get familiar with script(1).
Run your example (cat myfile.txt, then the tr commands)
inside script(1) and post the resulting typescript.

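
For anyone following along without the original file: a minimal sketch
that recreates the input byte for byte from the dump above and checks it.
The file name nbsp-demo.txt is made up for this example, and od(1) is just
one of several tools that will show the bytes.

```shell
# Recreate the input exactly as dumped above. \302\240 is the octal form
# of c2 a0, the UTF-8 encoding of U+00A0 NON-BREAKING SPACE.
# nbsp-demo.txt is a hypothetical name, chosen so as not to clobber myfile.txt.
f=nbsp-demo.txt
printf 'Hello guys\n\302\240\nAnybody\302\240 from Europe?\302\240\n' > "$f"

# Count the bytes (tr strips the padding some wc implementations add).
bytes=$(wc -c < "$f" | tr -d ' ')
echo "$bytes bytes"

# Dump every byte in hex; "c2 a0" shows up where a plain space was expected.
od -An -tx1 "$f"
```

The byte count comes out as 39: three of the "spaces" are two-byte
non-breaking spaces, and only the remaining three are plain 0x20.
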
>
> Anybody
>
> from
> Europe
>
> EOF

This is what tr(1) does for me:

----------------
Hello
guys



Anybody


from
Europe



----------------

which is

    48    U+000048  H  LATIN CAPITAL LETTER H
    65    U+000065  e  LATIN SMALL LETTER E
    6c    U+00006c  l  LATIN SMALL LETTER L
    6c    U+00006c  l  LATIN SMALL LETTER L
    6f    U+00006f  o  LATIN SMALL LETTER O
    0a    U+00000a     LINE FEED (LF)
    67    U+000067  g  LATIN SMALL LETTER G
    75    U+000075  u  LATIN SMALL LETTER U
    79    U+000079  y  LATIN SMALL LETTER Y
    73    U+000073  s  LATIN SMALL LETTER S
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    41    U+000041  A  LATIN CAPITAL LETTER A
    6e    U+00006e  n  LATIN SMALL LETTER N
    79    U+000079  y  LATIN SMALL LETTER Y
    62    U+000062  b  LATIN SMALL LETTER B
    6f    U+00006f  o  LATIN SMALL LETTER O
    64    U+000064  d  LATIN SMALL LETTER D
    79    U+000079  y  LATIN SMALL LETTER Y
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    66    U+000066  f  LATIN SMALL LETTER F
    72    U+000072  r  LATIN SMALL LETTER R
    6f    U+00006f  o  LATIN SMALL LETTER O
    6d    U+00006d  m  LATIN SMALL LETTER M
    0a    U+00000a     LINE FEED (LF)
    45    U+000045  E  LATIN CAPITAL LETTER E
    75    U+000075  u  LATIN SMALL LETTER U
    72    U+000072  r  LATIN SMALL LETTER R
    6f    U+00006f  o  LATIN SMALL LETTER O
    70    U+000070  p  LATIN SMALL LETTER P
    65    U+000065  e  LATIN SMALL LETTER E
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)
    0a    U+00000a     LINE FEED (LF)

You are not showing the actual result.

> Should I have included -s, it would have removed multiple occurences
> of \n from the OUTPUT and the result would have been:
>
> myfile
> txt
> Anybody
> from
> Europe
> EOF

The result of tr -cs "[:alpha:]" "\n" < myfile.txt
on your actual input is

------------
Hello
guys
Anybody
from
Europe
------------

> As the tr(1) states: This (the squeeze) occurs AFTER all deletion and
> translation is completed.

Yes: tr(1) replaces all the non-[:alpha:]s with a newline,
and then squeezes the multiple consecutive occurrences of newlines
(such as the four newlines after "guys") into one newline.

Here it is once again, with pipes instead of newlines for readability:

$ tr -c "[:alpha:]" "|" < myfile.txt
Hello|guys||||Anybody|||from|Europe||||
$ tr -cs "[:alpha:]" "|" < myfile.txt
Hello|guys|Anybody|from|Europe|

> So I still believe there should be OUTPUT in the -s description.

Being a non-native speaker myself,
I can sympathize with fighting the ambiguity.
Is the following wording better?

It is still not _absolutely_ clear, as it talks about
"_the_ character" - but which one, if string2 is longer than one?

$ tr -c  "[:alpha:]" "XY" < myfile.txt
$ tr -cs "[:alpha:]" "XY" < myfile.txt

	Jan


--- tr.1.orig	Tue Jan 28 19:08:31 2025
+++ tr.1	Tue Jan 28 19:12:08 2025
@@ -80,15 +80,15 @@ The
 .Fl d
 option causes characters to be deleted from the input.
 .It Fl s
-The
+After all deletion and translation is completed,
+the
 .Fl s
-option squeezes multiple occurrences of the characters listed in the last
-operand (either
+option squeezes multiple consecutive occurrences of the characters
+listed in the last operand (either
 .Ar string1
 or
 .Ar string2 )
-in the input into a single instance of the character.
-This occurs after all deletion and translation is completed.
+into a single instance of the character.
 .El
 .Pp
 In the first synopsis form, the characters in
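
One way to convince yourself that the squeeze acts on what the translation
produced, and not on the raw input: split the work into two tr(1) runs and
compare with the single -cs run. This is only a sketch. The input is rebuilt
with printf from the bytes dumped earlier, and LC_ALL=C is an assumption added
here so that tr works byte by byte whatever the local locale is.

```shell
# Same bytes as the input discussed in this thread (\302\240 = c2 a0 = NBSP).
# The string contains no '%', so it is safe to use as a printf format.
input='Hello guys\n\302\240\nAnybody\302\240 from Europe?\302\240\n'

# One run: complement [:alpha:], translate to '|', then squeeze.
one=$(printf "$input" | LC_ALL=C tr -cs '[:alpha:]' '|')

# Two runs: translate only, then squeeze the *output* of the translation.
two=$(printf "$input" | LC_ALL=C tr -c '[:alpha:]' '|' | LC_ALL=C tr -s '|')

echo "$one"
echo "$two"
```

Both lines should read Hello|guys|Anybody|from|Europe| like the -cs run
quoted above, which is what "after all deletion and translation is completed"
amounts to in practice.
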