On Thu, Dec 10, 2015 at 04:37:22PM +0800, Stefan Hajnoczi wrote: > On Mon, Dec 07, 2015 at 02:22:18PM +1100, Sam Bobroff wrote: > > On Fri, Dec 04, 2015 at 02:33:01PM +0800, Stefan Hajnoczi wrote: > > > On Mon, Nov 30, 2015 at 04:34:47PM +1100, Sam Bobroff wrote: > > > > Hi QEMU programmers, > > > > > > > > While doing some experimental work on QEMU that has involved adding a > > > > lot of > > > > new log messages (using qemu_log_mask()), I've discovered that under > > > > some > > > > conditions a lot of my log messages go missing. > > > > > > > > I've tracked the issue down to qemu_logfile being left at the default > > > > (stderr) > > > > (so when not using -D) and according to strace what is happening is > > > > that the > > > > debug messages are being passed to write() but write() is returning > > > > EWOULDBLOCK > > > > and the messags are dropped. > > > > > > > > This seems to be happening because stderr is being set non-blocking > > > > (which is a > > > > bit odd to me), and it seems like NONBLOCK is being set as > > > > qmp_chardev_add() adds a > > > > device for stdout (yes stdout, not stderr; perhaps file descriptors > > > > have been > > > > dup'd by that point?). > > > > > > > > Is this by design to prevent a slow console from blocking QEMU? If not, > > > > should > > > > I delve further and try to prevent non-blocking being set on stderr? > > > > > > > > (Unfortunately I don't have a replication for this using an unmodified > > > > QEMU but > > > > I suspect I could find one if necessary.) > > > > > > This sounds like a bug. stderr should be blocking. > > > > > > But it's specific to your QEMU build or your command-line. Here are my > > > results on an x86 host: > > > > > > $ qemu-system-x86_64 -enable-kvm -m 1024 -cpu host \ > > > -drive if=virtio,cache=none,file=test.img,format=raw > > > $ cat /proc/$(pgrep qemu)/fdinfo/2 > > > pos: 0 > > > flags: 0100002 > > > mnt_id: 22 > > > > > > The flags are O_RDWR | O_LARGEFILE. > > > > > > O_NONBLOCK would have been 04000 (octal). > > > > > > Please retry with qemu.git/master and post your QEMU command-line. > > > > > > Stefan > > > > Hi Stefan, > > > > Thanks for checking this! I think I've worked out what's happening and it's > > a > > bit complicated ;-) > > > > It appears that the behaviour is not directly related to the command line or > > build but rather the way the shell sets up the environment. > > > > Here's a test very similar to yours using origin/master QEMU on my Debian > > stable machine, with bash as my shell: > > > > $ qemu-system-x86_64 -S -vga none -nographic > > ( In another virtual console: ) > > $ cat /proc/$(pgrep qemu)/fdinfo/2 > > pos: 0 > > flags: 0104002 > > mnt_id: 19 > > > > So I seem to have O_NONBLOCK set on stderr. > > > > However, if I perform that test by backgrounding QEMU and using the same > > console: > > $ qemu-system-x86_64 -S -vga none -nographic & > > [1] 4970 > > [1]+ Stopped qemu-system-x86_64 -S -nographic -vga none > > $ cat /proc/$(pgrep qemu)/fdinfo/2 > > pos: 0 > > flags: 0100002 > > mnt_id: 19 > > > > Hmm! > > > > After a bit of reading, it seems like this is caused by the fact that bash, > > when there is no redirection, sets up fds 0, 1 and 2 by dup'ing a single > > file > > descriptor. I found this surprising! > > > > A small test program that sets O_NONBLOCK on fd 1 and checks for it on fd 2 > > seems to corroborate it: > > > > #include <stdio.h> > > #include <fcntl.h> > > #include <assert.h> > > > > int main() { > > int f1, f2; > > > > f1 = fcntl(1, F_GETFL, 0); > > f2 = fcntl(2, F_GETFL, 0); > > assert(!(f1 & O_NONBLOCK)); > > assert(!(f2 & O_NONBLOCK)); > > assert(!fcntl(1, F_SETFL, f1 | O_NONBLOCK)); > > if (fcntl(2, F_GETFL, 0) & O_NONBLOCK) > > fprintf(stderr, "File descriptors are duplicates.\n"); > > else > > fprintf(stderr, "File descriptors are separate.\n"); > > return 0; > > } > > > > gcc -Wall foo.c > > $ ./a.out > > File descriptors are duplicates. > > $ ./a.out > /dev/null > > File descriptors are separate. > > > > The nonblocking flag is always (?) being set on stdin and stdout during > > qemu_chr_open_stdio(), but it doesn't affect stderr unless they are dup'd. > > > > So this is probably not going to occur often, and certainly not when running > > under libvirt or other tools. But if it is actually what is happening, I'm > > not > > sure what to do about it. Any ideas? > > This is unfortunate. I wonder if opening /proc/fd/<n> is just a dup or > if it actually creates separate file status flags.
Calling open() always creates a new "file description" (which contains file status flags), so it's certainly a way to approach this. I've testing doing this if we're going to use stderr as the qemu_logfile: dup2(open("/proc/self/fd/2", O_WRONLY), 2) ... in do_qemu_set_log() and it fixes the issue, however... * Doing the dup2() causes it to fix any other fprintf()s to stderr, but I feel bad for changing the fd that's inside the stderr "FILE *" "underneath it". Would it be better to do the following... qemu_logfile = fopen("/proc/self/fd/2", "a") ... even though that will leave other fprintf()s unfixed. (I think that is probably OK.) * We could also unconditionally do the dup2(), maybe early in main(). * I'm not sure which platforms other than modern Linux support /proc/self/fd/<n>. (Just fixing it for Linux seems a lot better than nothing.) * We could print a warning if stderr is nonblocking and we can't work around it. This seems like a good idea even if don't fix it for Linux. * Alternatively, we could take this approach to stdin and stdout as they are added to the char driver and set nonblocking. This somehow feels more invasive, although I'm not sure why. * We could also handle it by converting the qemu_log_mask calls from fprintf() to sprintf() + write(), as we'd then be able to handle EAGAIN correctly. This doesn't feel any cleaner to me. Cheers, Sam.