> From: Mark H Weaver <m...@netris.org> > Cc: l...@gnu.org, guile-devel@gnu.org > Date: Tue, 12 Aug 2014 14:08:48 -0400 > > > +/* The funny conditional avoids a compiler warning in status:stop_sig. */ > > +# define WIFSTOPPED(stat_val) ((stat_val) == (stat_val) ? 0 : 0) > > What is the warning?
That stat_val is unused (because the value is always zero). > Would ((stat_val), 0) also fix it? I'm not sure, but I will use it if it will. > > +# define getuid() (500) /* Local Administrator */ > > +# define getgid() (513) /* None */ > > +# define setuid(u) (0) > > +# define setgid(g) (0) > > As I've said before, I'm not okay with making 'setuid', 'seteuid', > 'setgid', or 'setegid' into no-ops. They should at least raise an error > when called on Windows, and maybe they should be undefined. I'll respond to this in a separate thread. > > + /* If they ask for the Unix system shell, try to find it on PATH. */ > > + if (c_strcasecmp (program, "/bin/sh") == 0) > > + { > > + bin_sh_requested = 1; > > + program = "sh.exe"; > > + } > > Hmm. I'm not so comfortable with such cleverness at this low level. You mean, that we look for 'sh' not in '/bin', but along PATH? What else is reasonable? there's no standard '/bin' directory on Windows systems, certainly not on every drive. Without something like that, we will never be able to support portably shell commands that require a Unixy shell, because on Unix you _must_ use "/bin/sh" verbatim. This is very important for running the test suite, since there are quite a few commands there that explicitly invoke /bin/sh and require Bourne shell functionality. > I'd prefer to have it higher in the stack, perhaps in the Scheme code in > (ice-9 popen) or maybe even in the application code. I don't think this is practical: too many places to change. And you will have to keep an eye on it from now to eternity. It's a losing battle. > > + for (i = 0; extensions[i]; i++) > > + { > > + abs_namelen = SearchPath (path, program, extensions[i], > > + MAX_PATH, abs_name, NULL); > > + if (0 < abs_namelen && abs_namelen <= MAX_PATH) /* found! */ > > + break; > > + } > > This search is not done correctly. If I ask to run program "foo", and > "foo.cmd" comes early in the path and "foo.exe" comes late in the path, > it should be "foo.cmd" that gets run, not "foo.exe". No, the search order is exactly the one used by the Windows shell. A .exe program is invoked in preference to a .cmd batch file. The only deviation is that I've put .com last, whereas it is actually the first. But I believe a .com file nowadays is much more likely to be a VMS command file than a Windows executable. > > + /* If they asked for /bin/sh and we didn't find it, fall back on the > > + default Windows shell. */ > > + if (abs_namelen <= 0 && bin_sh_requested) > > + { > > + const char *shell = getenv ("ComSpec"); > > + > > + if (!shell) > > + shell = "C:\\Windows\\system32\\cmd.exe"; > > + > > + *bin_sh_replaced = 1; > > + strcpy (abs_name, shell); > > + abs_namelen = strlen (abs_name); > > + } > > Again, I'd prefer to put this at a higher level in the stack. And I again think that this is impractical to place that burden on higher levels. > If the user specifically asks to run "/bin/sh", we should not > silently substitute a different incompatible shell. If the command is incompatible with cmd, it will fail anyway. But if it is a simple command, it is likely to work with cmd as well, so this is better than always failing. People who work on Unix will use /bin/sh without much thought (what else can they do when they need a shell?), so if we see it, it doesn't necessarily mean they must have a Bourne shell. > > +/* Concatenate command-line arguments in argv[] into a single > > + command-line string, while quoting arguments as needed. The result > > + is malloc'ed. */ > > +static char * > > +prepare_cmdline (const char *cmd, const char * const *argv, int > > bin_sh_replaced) > > +{ > > + /* These characters should include anything that is special to _any_ > > + program, including both Windows and Unixy shells, and the > > + widlcard expansion in startup code of a typical Windows app. */ > > + const char need_quotes[] = " \t#;\"\'*?[]&|<>(){}$`^"; > > Hmm. This seems unlikely to be correct in the general case. If the > program being run is not a command shell, then I wouldn't expect it to > process backslash escapes at all. That's what happens on Unix. But not on Windows, where the quotes are handled by the startup code that processes the command-line string into the argv[] array. With a few exceptions (cmd.exe being one of them), every Windows program processes quotes and wildcards the same. > Is there any way to launch a standard C program on Windows where the > individual 'argv' arguments are passed as separate strings, without > having to combine them together with quoting and escapes? No. The low-level API to launch a program accepts a single string as the command line. It is processed into argv[] by the C library startup code. > If a standard C program is compiled using Mingw, precisely how is this > combined string converted into that program's 'argv' argument? By the startup code that is part of the C runtime. > How can we ensure that those strings are passed reliably and > losslessly, without relying on heuristics that only work some of the > time? This is not heuristics, this is the official documented way to quote and escape quotes on the command line. Working by those rules will not cause any losses. > > + /* Are we constructing a command line for cmd.exe? */ > > + if (bin_sh_replaced) > > + cmd_exe_quoting = 1; > > + else > > + { > > + for (p = cmd + strlen (cmd); > > + p > cmd && p[-1] != '/' && p[-1] != '\\' && p[-1] != ':'; > > + p--) > > + ; > > + if (c_strcasecmp (p, "cmd.exe") == 0 > > + || c_strcasecmp (p, "cmd") == 0) > > + cmd_exe_quoting = 1; > > + } > > If there's really no way to avoid quoting, then probably these > heuristics should be moved into 'lookup_cmd', and 'lookup_cmd' should > return the 'cmd_exe_quoting' flag instead of the 'bin_sh_replaced' flag, > so that it will do the right thing if the higher levels ask for > 'cmd.exe' directly. The design is that lookup_cmd searches for the command, and the quoting is done elsewhere. I don't understand why it matters how these tasks are subdivided between functions, but if it does, I submit that the way I subdivided them is more modular, as each function does a single well-defined job. > > + /* cmd.exe needs a different style of quoting: all the arguments > > + beyond the /c switch are enclosed in an extra pair of quotes, > > + and not otherwise quoted/escaped. */ > > Again, I think this should be handled at a higher level in the stack. > It would be good to have primitives at the C level that don't include > such unreliable heuristics. Sorry, I don't understand: what heuristics? The way cmd.exe handles quotes is well documented, and the code implements those rules. There's no heuristics here, anywhere. > > + /* Gnulib's 'pipe' opens the pipe in binary mode, but we don't > > + want to read text-mode input of subprocesses in binary more, > > + because then we will get the ^M (a.k.a. "CR") characters we > > + don't expect. */ > > + _setmode (c2p[0], _O_TEXT); > > + } > > We should do the newline conversion elsewhere in Guile's I/O stack, so > that it is possible to do binary I/O with subprocesses. That's fine by me, but the default should still be text I/O. Once there is a way to propagate the text/binary mode to this level, a condition can be added not to do the above. IOW, this is something for future changes; for now, the vast majority of pipes need text-mode I/O, so leaving this at binary will break much more code than using text I/O. > > + /* Posix requires to call the shell if execvp fails to invoke EXEC_FILE. > > */ > > + if (errno_save == ENOEXEC || errno_save == ENOENT) > > + { > > + const char *shell = getenv ("ComSpec"); > > + > > + if (!shell) > > + shell = "cmd.exe"; > > + > > + if (c_strcasecmp (exec_file, shell) != 0) > > + { > > + argv[0] = (char *)exec_file; > > + return start_child (shell, argv, reading, c2p, writing, p2c, errfd); > > + } > > + } > > I think this fallback shouldn't be done. We don't do it in the POSIX > version of scm_open_process, and we shouldn't do it here either. AFAIK, on Posix platforms this is done by the library or the kernel implementation of execvp. > What exactly does POSIX require? Can you provide a reference? See http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html. Quote: In the cases where the other members of the exec family of functions would fail and set errno to [ENOEXEC], the execlp() and execvp() functions shall execute a command interpreter and the environment of the executed command shall be as if the process invoked the sh utility using execl() as follows: execl(<shell path>, arg0, file, arg1, ..., (char *)0); where <shell path> is an unspecified pathname for the sh utility, file is the process image file, and for execvp(), where arg0, arg1, and so on correspond to the values passed to execvp() in argv[0], argv[1], and so on.