Larry Hastings <la...@hastings.org> added the comment:

Here's my first pass at a patch. For this patch, I took the proposal to its logical extreme: I removed every function in os that was both mildly redundant with an existing function *and* added since 3.2, and moved that functionality to the equivalent existing function, making it accessible through keyword-only parameters.
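A minimal sketch of what that calling convention looks like from Python. It assumes the spelling that shipped in Python 3.3 (os.stat accepting dir_fd and follow_symlinks as keyword-only parameters) and a POSIX platform where both are supported; the patch under review may differ in detail. The helper names lstat_in_dir and demo are made up for illustration and are not part of the patch.

```python
import os
import stat
import tempfile


def lstat_in_dir(dir_fd, name):
    """Stat `name` relative to an open directory, without following symlinks.

    Hypothetical helper: roughly what a separate fstatat-style call with
    AT_SYMLINK_NOFOLLOW would do, expressed through keyword-only
    parameters on os.stat.
    """
    return os.stat(name, dir_fd=dir_fd, follow_symlinks=False)


def demo():
    """Create a file and a symlink to it, then stat the link both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "target.txt"), "w") as f:
            f.write("hello")
        os.symlink("target.txt", os.path.join(tmp, "link"))

        dir_fd = os.open(tmp, os.O_RDONLY)
        try:
            # Describes the symlink itself.
            link_info = lstat_in_dir(dir_fd, "link")
            # Default follow_symlinks=True: describes the link's target.
            file_info = os.stat("link", dir_fd=dir_fd)
        finally:
            os.close(dir_fd)

        return stat.S_ISLNK(link_info.st_mode), file_info.st_size


if __name__ == "__main__":
    print(demo())
```

The point of the design is that one function, os.stat, covers what would otherwise need stat, lstat, and fstatat, with the variations selected by name at the call site.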
Specifically:

This function has been removed, and instead
|             this parameter has been added to
|             |                 this function
|             |                 |
v             v                 v
----------------------------------------------
faccessat     dir_fd            access
faccessat     effective_ids     access
faccessat     follow_symlinks   access
fchmodat      dir_fd            chmod
fchmodat      follow_symlinks   chmod
fchownat      dir_fd            chown
fchownat      follow_symlinks   chown
fexecve       fd                execve
fgetxattr     fd                getxattr
flistdir      fd                listdir
flistxattr    fd                listxattr
fremovexattr  fd                removexattr
fsetxattr     fd                setxattr
fstatat       dir_fd            stat
futimens      fd                utime
futimes       fd                utime
futimesat     dir_fd            utime
lgetxattr     follow_symlinks   getxattr
linkat        dst_dir_fd        link
linkat        src_dir_fd        link
linkat        follow_symlinks   link
llistxattr    follow_symlinks   listxattr
lremovexattr  follow_symlinks   removexattr
lsetxattr     follow_symlinks   setxattr
lutimes       follow_symlinks   utime
mkdirat       dir_fd            mkdir
mkfifoat      dir_fd            mkfifo
mknodat       dir_fd            mknod
openat        dir_fd            open
readlinkat    dir_fd            readlink
renameat      dst_dir_fd        rename
renameat      src_dir_fd        rename
symlinkat     dir_fd            symlink
unlinkat      dir_fd            unlink
unlinkat      remove_directory  unlink
utimensat     dir_fd            utime
utimensat     follow_symlinks   utime

Additionally, we *could* deprecate this function,
|             as I have added this parameter
|             |                 to this function:
|             |                 |
v             v                 v
----------------------------------------------
fchdir        fd                chdir
fchmod        fd                chmod
fstat         fd                stat
fstatvfs      fd                statvfs
lchflags      follow_symlinks   chflags
lchmod        follow_symlinks   chmod
fchown        fd                chown
lchown        follow_symlinks   chown
lstat         follow_symlinks   stat

I doubt we'll ever deprecate those functions. This patch does not deprecate those functions. I don't propose deprecating those functions.

Notes:

* What do you do on platforms where the functionality isn't available? I believe it's better to always accept parameters, but raise a NotImplementedError if the functionality they represent is unavailable on the current platform. Adding and removing parameters based on the current platform... that way lies madness.
  (It's like scrollbars that pop in and out of existence depending on whether or not you need them. Better that the scrollbars are always there, and simply disabled when the content fits in the current window. Users like a stable foundation under their feet.)

* The patch is... pretty big. But you can divide-and-conquer it into a series of self-contained parts. First I add path_converter, then I modify existing functions / remove new functions. Each of those can be reviewed in isolation. Also, the new implementations generally follow the same pattern:

    initialize
    call PyArg_ParseTupleAndKeywords
    error out early if user asks for functionality unavailable on the current platform
    ensure that combinations of parameters (dir_fd, fd, follow_symlinks) are permitted
    do actual work, turning on/off advanced functionality based on configure ifdefs (HAVE_FCHOWNAT etc)
    if error, raise exception
    compute return value
    exit:
    cleanup all path variables
    return return_value

  Here's a list of all the functions I added arguments to:

    access chdir chflags chmod chown execve getxattr link
    listdir listxattr mkdir mkfifo mknod open readlink
    removexattr rename setxattr stat statvfs symlink unlink utime

* The one new bit of technology: a PyArg_ParseTuple "converter" function called path_converter. Here's its documentation:

    /*
     * A PyArg_ParseTuple "converter" function
     * that handles filesystem paths in the manner
     * preferred by the os module.
     *
     * path_converter accepts (Unicode) strings and their
     * subclasses, and bytes and their subclasses.  What
     * it does with the argument depends on the platform:
     *
     *   * On Windows, if we get a (Unicode) string we
     *     extract the wchar_t * and return it; if we get
     *     bytes we extract the char * and return that.
     *
     *   * On all other platforms, strings are encoded
     *     to bytes using PyUnicode_FSConverter, then we
     *     extract the char * from the bytes object and
     *     return that.
     *
     * Input fields:
     *   path.nullable
     *     If nonzero, the path is permitted to be None.
     *   path.function_name
     *     If non-NULL, path_converter will use that as the name
     *     of the function in error messages.
     *     (If path.function_name is NULL it omits the function name.)
     *   path.argument_name
     *     If non-NULL, path_converter will use that as the name
     *     of the parameter in error messages.
     *     (If path.argument_name is NULL it uses "path".)
     *
     * Output fields:
     *   path.wide
     *     Points to the path if it was expressed as Unicode
     *     and was not encoded.  (Only used on Windows.)
     *   path.narrow
     *     Points to the path if it was expressed as bytes,
     *     or it was Unicode and was encoded to bytes.
     *   path.length
     *     The length of the path in characters.
     *   path.object
     *     The original object passed in.
     *   path.cleanup
     *     For internal use only.  May point to a temporary object.
     *     (Pay no attention to the man behind the curtain.)
     *
     *   At most one of path.wide or path.narrow will be non-NULL.
     *   If path was None and path.nullable was set,
     *     both path.wide and path.narrow will be NULL,
     *     and path.length will be 0.
     *
     *   path_converter takes care to not write to the path_t
     *   unless it's successful.  However it must reset the
     *   "cleanup" field each time it's called.
     *
     * Use as follows:
     *      path_t path;
     *      memset(&path, 0, sizeof(path));
     *      PyArg_ParseTuple(args, "O&", path_converter, &path);
     *      // ... use values from path ...
     *      path_cleanup(&path);
     *
     * (Note that if PyArg_Parse fails you don't need to call
     * path_cleanup().  However it is safe to do so.)
     */
    typedef struct {
        char *function_name;
        char *argument_name;
        int nullable;
        wchar_t *wide;
        char *narrow;
        Py_ssize_t length;
        PyObject *object;
        PyObject *cleanup;
    } path_t;

  I assert path_converter is Very Useful. It nearly always reduced the argument processing from three code paths (two for Windows, one for everyone else) to one. Even if the rest of the patch isn't accepted I'm sure we'll keep path_converter.

* In a lot of places I combined together several functions, or several large blobs of #ifdef'd code, into one large slightly-shaggy function.
  Like, Windows code is now often streamlined in, instead of being a big separate #ifdef. And instead of three passes at decoding arguments (one for Windows wide, one for Windows narrow, one for everyone else) there is generally just one. (Thanks, path_converter!) Again, even if we don't keep the extra keyword arguments / functionality, I'm hoping at least some of that cleanup will survive.

* utime is the whale. It obviates *five* other functions (futimens, futimes, futimesat, lutimes, and utimensat). I spent a fair amount of time refactoring it. It's so complex, it was hard to make it anything like readable. I like to think I reached a local maximum of readability given what I had to work with, and I like to think that the resulting dog's breakfast is less unreadable than the previous iteration. Anyway I'm definitely open to suggestions on how to restructure it further to enhance readability.

* The tests used to use e.g. hasattr(os, 'futimes') and the like to determine what functionality was available. In order to LBYL, I had to resort to e.g. sysconfig.get_config_var('HAVE_FUTIMES'). If users need to LBYL here too, well, it's kind of a problem. (I hope function signature objects solve this--we're talking about it.)

* On a very related topic: os.openat() is one of the functions I removed. And then we have this in Lib/os.py:

    if _exists('openat'):
        def fwalk(...)

  Since I removed os.openat completely, we could no longer tell whether or not the functionality is available. And I couldn't use my sysconfig.get_config_var trick because it's os.py and we're bootstrapping the interpreter. So I went with a dirty hack--uh, I mean, practicality beats purity. posixmodule.c conditionally exposes a symbol called _HAVE_OPENAT if openat() is available, and fwalk() is now gated on that.

* A minor semantic change: symlink() now accepts the same arguments everywhere. Previously it had an extra optional argument only on Windows ("target_is_directory").
  Non-Windows now accepts that too and ignores it, in much the same way that os.mkdir ignores the mode on Windows. Also, os.symlink now accepts byte strings for the paths on Windows. (Even though we'll remove byte string paths in 3.4, yes?)

* The patch is still sloppy:
    * The docstrings are incomplete.
    * I haven't touched the docs at all.
    * There are > 80 col lines.
  Clearly I'll clean up this stuff if the patch has a shot at going in.

I do think it's an improvement. But I won't check any part of it in without some consensus (or BDFL ruling). I look forward to your feedback!

----------
assignee:  -> larry
keywords: +patch
stage:  -> patch review
Added file: http://bugs.python.org/file25716/larry.os.keyword.arguments.collapse.1.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14626>
_______________________________________