Date:        Tue, 24 Sep 2024 14:27:48 +0200
From:        tlaro...@kergis.com
Message-ID:  <zvkwrpohoo5ez...@kergis.com>

  | This can be solved by resetting nread to 0 when an actual end-of-line
  | is reached and escaped.

I think it better to just not have a limit in the normal case; it serves
no purpose, except for this one rather exotic use of the read builtin
(which really is meant for reading text files - the "variation" was to
allow it to read \0 delimited "records" as output from find -print0 and
similar).   It cannot really read binary blobs, no matter what is done,
as sh variables cannot contain \0 characters (ever).   That doesn't matter
for the present purpose, but nor does much else here.

  | I gave a quick look to the bash(1) man page, that has two differing
  | options: -n (max number read) and -N (read exactly this number).
  |
  | I have not looked at the '-N' case in details (it seems to me overly
  | too complex to get right for whatever a user might want regarding bytes
  | read vs "chars" actually ending in the variables).
  |
  | For '-n', if I understand correctly, this is the number of bytes read,
  | without consideration of "char"s and, in this sense, escaping sequences.

The bash manual says "characters" in both cases, but I'm not sure that
it really means that, and certainly for us the difference is moot, as sh
really wants 1 byte == 1 character, almost always (it can process UTF-8
and similar, because it mostly doesn't need to interpret the strings
as characters, just as byte strings).

  | This is why I use "bytes" for the count, to treat it differently from
  | "char" that may be an interpretation of a sequence of bytes.

Yes, that part isn't the issue - the issue is that if "read" reads N bytes
(characters) [0..N-1] (and after processing assigns them to variables) then
another following read must start at the very next byte [N].   read isn't
allowed to simply discard anything not explicitly specified -- that is, it
can remove \ chars if -r isn't given, and always removes the delimiter char,
if found, but it cannot actually read 128 bytes, and then just process 100
of them, as there's no way to put back the other 28 (particularly when
reading from a pipe).   That's why it reads 1 byte at a time, and never
reads the next unless it is needed.

The other versions (ignoring zsh where -n means something totally unrelated)
all put the terminal into raw mode (or the equivalent) when -n is specified,
so as soon as n characters have been read the read can stop - otherwise
the terminal driver won't return anything until the user enters a \n.
While the 1 byte at a time read scheme avoids reading more than N of the
bytes entered, leaving the rest for later, if one does "read -n 1 var" and
the read doesn't return after 1 byte is typed (which it does in the other
shells) people will be unhappy.

I am looking at how to make something reasonable work.   It won't happen
within a day or two however.

kre
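
ps: a minimal sketch of the "following read must start at the very next
byte" point above, using only the plain (no -n) form of read, so it should
behave the same in any POSIX sh.   The function name demo_split is made up
for the illustration.   Input comes from a pipe, so nothing read too far
ahead could ever be put back.

```sh
#!/bin/sh
# Illustration only: demo_split is a hypothetical name, not a shell builtin.
demo_split() {
    printf 'one\ntwo\nthree\n' | {
        # Each read consumes one line, including its \n delimiter, and
        # nothing beyond it, even though a pipe cannot be rewound.
        IFS= read -r first
        IFS= read -r second
        # Whatever the two reads left behind is still available to cat.
        rest=$(cat)
        printf '%s|%s|%s\n' "$first" "$second" "$rest"
    }
}

demo_split
```

If read were to fetch a large block and use only part of it, cat would see
nothing here, and "three" would be lost.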