> So, I set utf8only=on and try to create a file with a filename that is
> a byte array that can't be decoded to text using UTF-8. What's supposed
> to happen? Should fopen(), or whatever syscall 'touch' uses, fail?
> Should the syscall somehow escape utf8-incompatible bytes, or maybe
> replace them with ?s or somesuch? Or should it automatically convert the
> filename from the active locale's fs-encoding (LC_CTYPE?) to UTF-8?
First, AFAIK utf8only can only be set when a filesystem is created.

Second, "use the source, Luke":

http://src.opensolaris.org/source/search?q=&defs=&refs=z_utf8&path=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Futs%2Fcommon%2Ffs%2Fzfs%2Fzfs_vnops.c&hist=&project=%2Fonnv

It looks to me like lookups, file creation, directory creation, symlink creation, and hard link creation will all fail with error EILSEQ ("Illegal byte sequence") if utf8only is enabled and they are presented with a name that is not valid UTF-8. Thus, on a filesystem where it has been enabled (since creation), no such names can be created, and none would ever be there to be found anyway. So in that case the system simply refuses byte strings that are not valid UTF-8, and there's no need to escape anything.

Further, your last sentence suggests that you might hold the incorrect idea that the kernel knows or cares what locale an application is running in: it does not. Nor indeed does the kernel know about environment variables at all, except as the third argument passed to execve(2). It doesn't interpret them, or even validate that they are of the usual name=value form; they're typically handled pretty much the same as the command line args. The only illusion of magic is that the more widely used variants of exec, which don't explicitly pass the environment, internally call execve(2) with the external variable environ as the last arg, thus passing the environment automatically.

There have been Unix-like OSs that make the environment available to additional system calls (give or take what counts as a true system call in the example I'm thinking of, namely variant links, i.e. symlinks with embedded environment variable references, in the now defunct Apollo Domain/OS), but AFAIK that's not the case in those that are part of the historical Unix source lineage. (I have no idea off the top of my head whether or not Linux, or oddballs like OSF/1, might make environment variables implicitly available to syscalls other than execve(2).)
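If you'd rather check the EILSEQ behaviour described above than take my reading of the source on faith, something like the following sketch should do. It is untested against an actual utf8only=on dataset; the helper names is_valid_utf8 and try_create are made up for illustration, the target directory is a placeholder you would point at your own dataset, and Python's strict UTF-8 decoder is only a user-space approximation of whatever validation the kernel really performs.

```python
import errno
import os
import tempfile


def is_valid_utf8(name: bytes) -> bool:
    """Return True if name decodes as strict UTF-8.

    User-space approximation only; the kernel's own check may differ in detail.
    """
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def try_create(directory: bytes, name: bytes) -> int:
    """Try to create directory/name; return 0 on success, else the errno."""
    path = os.path.join(directory, name)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    os.unlink(path)  # clean up the file we just created
    return 0


if __name__ == "__main__":
    # b"caf\xe9" is "cafe" with an e-acute in ISO 8859-1; the lone 0xE9 byte
    # is not a valid UTF-8 sequence.
    bad_name = b"caf\xe9"
    # Placeholder: replace with a directory on a utf8only=on dataset.
    target = os.fsencode(tempfile.gettempdir())
    result = try_create(target, bad_name)
    if result == errno.EILSEQ:
        print("rejected with EILSEQ")
    elif result == 0:
        print("created: this filesystem accepts arbitrary bytes")
    else:
        print("failed with", errno.errorcode.get(result, result))
```

Note that the name is passed as bytes all the way down, so no locale conversion happens on the way to the syscall; what comes back is purely the filesystem's verdict on those bytes.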
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss