On Thu, Jan 5, 2023 at 10:33 PM Jacob Bachmeyer <jcb62...@gmail.com> wrote:
>
> NightStrike wrote:
> > On Fri, Dec 23, 2022 at 11:00 PM Jacob Bachmeyer <jcb62...@gmail.com> wrote:
> >
> >> NightStrike wrote:
> >>
> >>> On Wed, Dec 21, 2022 at 11:37 PM Jacob Bachmeyer <jcb62...@gmail.com>
> >>> wrote:
> >>>
> >>>> [...]
> >>> So at least we know for sure that this particular instance of extra
> >>> characters is coming from Wine. Maybe Wine can be smart enough to
> >>> only translate \n into \r\n instead of translating \r\n into \r\r\n.
> >>> Jacek / Eric, comments here? I'm happy to try another patch, the
> >>> first one was great.
> >>>
> >>>
> >> I doubt that Wine is doing that translation. MinGW libc produces output
> >> conformant to Windows conventions, so printf("\n") on a text handle
> >> emits "\r\n", which Wine passes along. POSIX convention is that "\n" is
> >> translated to "\r\n" in the kernel terminal driver upon output, so the
> >> kernel translates the "\n" in the "\r\n" into /another/ "\r\n", yielding
> >> "\r\r\n" at the pty master end. This is why DejaGnu testsuites must be
> >> prepared to discard excess carriage returns. The first CR came from
> >> MinGW libc; the second CR came from the kernel terminal driver; the LF
> >> was ultimately passed through.
> >>
> >
> > Jacek and I have been digging into this on IRC, and he's been very
> > helpful in trying to get further, but we're still stuck. We tried to
> > be more introspective, inserting strace both as "strace script wine"
> > and as "script strace wine". We tried running just "wine a.exe"
> > without any extra glue, and logging the raw SSH packets from putty.
> > After many iterations on these and other tests, Jacek finally had the
> > idea to try removing Windows entirely from the equation, and we ran
> > with a purely unix program / compiler combination:
> >
> > #include <unistd.h>
> >
> > int main()
> > {
> >     write(1, "test\r\n", 6);
> >     return 0;
> > }
> >
> > (and also as "test\n", 5)
> >
> > In both versions, the following was observed:
> >
> > case 1) ./a.out | xxd
> > case 2) script -c ./a.out out; xxd out
> > case 3) enable putting logging, ./a.out
> >
> > In case 1, xxd showed no extra \r's. In cases 2 and 3, there was an
> > extra \r (either 0d 0d 0a for test\r\n, or 0d 0a for test\n).
> >
> > So, is it possible after all of this back and forth regarding mingw,
> > wine, and others, that it's down to the write() system call that's
> > inserting extra \r's? Is this expected?
> >
>
> "This is why DejaGnu testsuites must be prepared to discard excess
> carriage returns."
>
> The write(2) system call inserts nothing and simply hands off the buffer
> to the relevant part of the kernel I/O subsystem. (The kernel in POSIX
> is *not* a monolithic black box.) When stdout for your test program is
> a pty slave, that relevant part is the kernel terminal driver. The
> kernel terminal driver is converting "\n" to "\r\n" upon output to the
> associated port, since hardware terminals typically *do* require CRLF.
> The associated port in this case is virtual and part of the kernel pty
> subsystem, which presents octets written to that port to its associated
> pty master device. The user-visible pty slave device acts just like a
> serial terminal, including all translations normally done for handling
> serial terminals.
>
> A pty is conceptually a null-modem cable connected between two
> infinitely-fast serial ports on the same machine, although the slave
> will still report an actual baud rate if queried. (Run "stty" with no
> arguments under script(1), an ssh session, or an X11 terminal emulator
> to see what a pty slave looks like on your machine.)
>
> In your case 1, the pty subsystem is not used and output is collected
> over a pipe. Using "./a.out > out; xxd out" would produce the same
> results. In cases 2 and 3, there is a pty involved, either set up by
> script(1) or by sshd (assuming you meant "enable putty logging" in case
> 3) that performs the standard terminal translations. In all cases,
> strace(1) will show the exact string written to the pty slave device,
> which will not include any extra CRs because *those* *are* *inserted*
> *by* *the* *kernel* *terminal* *driver* as the data is transferred to
> the pty master device's read queue.
>
> This insertion of carriage returns is expected and standardized behavior
> in POSIX and is the reason Unix could use bare LF as end-of-line even
> though hardware terminals always needed CRLF. CP/M (and therefore
> MS-DOS which began its existence as a cheap CP/M knockoff) did not have
> this translation layer and instead dumped the complexity of a two-octet
> end-of-line sequence on user programs, leading to much confusion even
> today. This is not a Wine issue, although the terminal escape sequences
> in your original issue *were* from Wine. Note that the number of excess
> carriage returns that a DejaGnu testsuite must be prepared to discard is
> unspecified because running tests on remote targets may result in *any*
> *number* of CRs preceding each LF by the time the results reach the test
> driver machine in more complex testing lab environments.

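As a rough illustration of the translation described above, here is a toy model in C of the LF-to-CRLF output step. It is only a sketch: the name onlcr_model is made up for this example, it is not the kernel's code, and it assumes every hop applies the same LF-to-CRLF rule and nothing else.

```c
#include <stddef.h>

/* Toy model (hypothetical helper, not kernel code) of the terminal
 * driver's output translation: every LF in the input is emitted as
 * CR LF, and all other bytes pass through unchanged.
 * `out` must have room for at least 2 * len bytes.
 * Returns the number of bytes written to `out`. */
size_t onlcr_model(const char *in, size_t len, char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\n')
            out[n++] = '\r';   /* insert the extra CR before each LF */
        out[n++] = in[i];      /* then pass the original byte through */
    }
    return n;
}
```

Feeding "test\n" through the model once gives "test\r\n", and feeding "test\r\n" through it gives "test\r\r\n", which matches the 0d 0a and 0d 0d 0a sequences seen in cases 2 and 3. Applying it once per hop is also why the number of CRs in front of each LF can keep growing on remote targets.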
First, I just want to thank you for your patience. You are putting a
lot of effort into these replies, and it is appreciated.

I did another little test to try to better understand your point. I
ran a Linux native testsuite under a simulator that just sets SIM to
" ". This also resulted in extra ^M's, although many tests pass
because they already look for \r\n to accommodate Windows. So I think
I've come around to grasp what you've been heroically re-explaining...

So if we have to modify every test in the entire testsuite to check
for zero or more \r's followed by an optional \n, would it be better
to add a dg-output-line proc that does this automatically everywhere?
I feel like changing every output pattern test won't be very
maintainable. You had previously mentioned modifying ${tool}_load to
filter out extra \r's, but I couldn't see where or how to do that.

For completeness, setting a random selection of tests to look for
\r*\n? worked (this would cover even deprecated systems that use only
CR, as well as flagging the weird Rust case of \r\r\n\n as bad).
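
To make the "filter out extra \r's" idea concrete, here is a minimal sketch in C of what such a filter would have to do to captured output. This is only an illustration of the logic: strip_cr_before_lf is a made-up name, not a DejaGnu proc, and a real filter would presumably be written in Tcl inside ${tool}_load.

```c
#include <stddef.h>

/* Hypothetical helper: drop every run of CRs that is immediately
 * followed by LF, in place, so that "\r\n" and "\r\r\n" both become
 * "\n".  CRs that are not followed by LF are kept as they are.
 * Returns the new length of the buffer contents. */
size_t strip_cr_before_lf(char *buf, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            size_t j = i;

            /* find the end of this run of CRs */
            while (j < len && buf[j] == '\r')
                j++;
            if (j < len && buf[j] == '\n') {
                i = j - 1;     /* skip the run; the loop moves on to the LF */
                continue;
            }
        }
        buf[out++] = buf[i];   /* keep everything else unchanged */
    }
    return out;
}
```

With this, "test\r\r\n" comes out as "test\n", while a lone CR in the middle of a line is left alone. Input ending in \r\r\n\n comes out ending in \n\n, so the doubled newline is still there for a test pattern to flag.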