On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.ben...@linaro.org> wrote:
> Peter Maydell <peter.mayd...@linaro.org> writes:
> > The semihosting API, at least for Arm, has a modeflags string so the
> > guest can say whether it wants to open O_BINARY or not:
> > https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst#sys-open-0x01
> >
> > So we need to plumb that down through the common semihosting code
> > into this function and set O_BINARY accordingly. Otherwise guest
> > code that asks for a text-mode file won't get one.
>
> We used to, in fact we still have a remnant of the code where we do:
>
>   #ifndef O_BINARY
>   #define O_BINARY 0
>   #endif
>
> I presume because the only places it exists in libc is wrapped in stuff
> like:
>
>   #if defined (__CYGWIN__)
>   #define O_BINARY _FBINARY
>
> So the mapping got removed in a1a2a3e609 (semihosting: Remove
> GDB_O_BINARY) because GDB knows nothing of this and as far as I can tell
> neither does Linux whatever ISO C might say about it.
>
> Is this a host detail leakage to the guest? Should a semihosting app be
> caring about what fopen() modes the underlying host supports? At least a
> default O_BINARY for windows is most likely to DTRT.

I think the theory when the semihosting API was originally designed
decades ago was basically "when the guest does fopen(...) this should
act like it does on the host". So as a bit of portable guest code you
would say whether you wanted a binary or a text file, and the effect
would be that if you were running on Windows and you output a text
file then you'd get \r\n like the user probably expected, and if on
Linux you get \n.

The gdb remote protocol, on the other hand, assumes "all files are
binary", and the gdb source that implements the gdb remote file I/O
operations does "always set O_BINARY if it's defined":
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/remote-fileio.c;h=3ff2a65b0ec6c7695f8659690a8f1dce9b5cdf5f;hb=HEAD#l141

So this is kind of an impedance mismatch problem -- the semihosting
API wants functionality that the gdb protocol can't give us. But we
don't have that mismatch issue if we're directly making host
filesystem calls, because there we can set O_BINARY or not as we
choose.

Alternatively, we could decide that our implementation of semihosting
consistently uses \n for the newline character on all hosts, such that
guests which try to write text files on Windows hosts get the "wrong"
newline type, but OTOH get consistently the same file regardless of
host and regardless of whether semihosting is going via gdb or not.
But if we want to do that we should at least note in a comment
somewhere that that's a behaviour we've chosen, not something that's
happened by accident. Given Windows is less unfriendly about dealing
with \n-terminated files these days that might not be an unreasonable
choice.

-- PMM
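
For the direct host filesystem case, here is a minimal sketch of what
plumbing the guest's choice through could look like. It assumes the
SYS_OPEN mode encoding from the Arm semihosting spec (values 0 to 11,
with bit 0 set for the 'b' variants). The helper name
semihost_mode_to_host_flags and the table are made up for illustration;
they are not existing QEMU code:

```c
#include <assert.h>
#include <fcntl.h>

/* Same fallback as the remnant quoted above: hosts with no text/binary
 * distinction get a flag that is a no-op. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* SYS_OPEN mode values 0..11 correspond to the ISO C fopen() strings
 * "r", "rb", "r+", "r+b", "w", "wb", "w+", "w+b", "a", "ab", "a+", "a+b".
 * Bit 0 is the 'b' flag, so (mode >> 1) selects the access type. */
static const int semihost_base_flags[] = {
    O_RDONLY,                       /* r  */
    O_RDWR,                         /* r+ */
    O_WRONLY | O_CREAT | O_TRUNC,   /* w  */
    O_RDWR | O_CREAT | O_TRUNC,     /* w+ */
    O_WRONLY | O_CREAT | O_APPEND,  /* a  */
    O_RDWR | O_CREAT | O_APPEND,    /* a+ */
};
static_assert(sizeof(semihost_base_flags) / sizeof(semihost_base_flags[0]) == 6,
              "one entry per access type");

/* Hypothetical helper (assumption, not a real QEMU function): returns
 * host open() flags for a guest SYS_OPEN mode, or -1 if the mode is
 * outside the range the spec defines. */
static int semihost_mode_to_host_flags(int mode)
{
    if (mode < 0 || mode > 11) {
        return -1;
    }
    int flags = semihost_base_flags[mode >> 1];
    if (mode & 1) {
        /* Guest asked for a binary file; honour that on hosts that care. */
        flags |= O_BINARY;
    }
    return flags;
}
```

On a Linux host O_BINARY falls back to 0, so mode 0 ("r") and mode 1
("rb") give the same flags; on a Windows host only the 'b' modes would
pick up O_BINARY. The gdb path can't express this distinction, which is
the mismatch described above.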