> 3.243 Pathname
> [...] <slash> [...]

> <slash> is posix speak for '/'

But is that "Unicode codepoint 47" or "ASCII codepoint 0x2f" or
"whatever the character set in use provides that is a line between
upper right and lower left" or what?  Does POSIX mandate an ASCII
superset, for example?

C99 demands that certain characters be present, but I don't think it
mandates anything about their representations except that they must all
be strictly positive and the digits 0..9 are consecutive and in order.
Hence asking if POSIX mandates an ASCII superset.

> And:

> 3.141 Filename
> A sequence of bytes consisting of 1 to {NAME_MAX} bytes used to
> name a file. The bytes composing the name shall not contain the <NUL>
> or <slash> characters. [...]

I think for some character sets that may be ill-defined, and it
definitely contradicts existing practice (which is that the octet
string shall not contain 0x00 or 0x2f octets, regardless of what
characters they may or may not be part of).  Perhaps that's just sloppy
language in the spec, but perhaps not, too.

>> For example, what happens if you find that you have both, say, ls.0
>> and %6Cs.0 in a cat1/ directory somewhere?

> Obviously, whenever one picks a character to have special meaning,
> there needs to be a way to encode that character, even though it
> looks like it could just be stored literally, so if there was an
> encoding scheme like that, a filename like "%6Cs.0" would be encoded
> as %256Cs.0 (or something).

No, that's not what I mean.  I mean, `you find you have a directory
entry whose d_name contains "ls.0" and another which contains
"%6Cs.0"', not `you want to have both a manpage called "ls" and another
manpage called "%6Cs"'.  (Or, perhaps harder to handle, one containing
%6Cs.0 and one containing l%73.0.)

The point is, man(1) has to find the underlying file.  But, when you
have encodings, you have multiple possible names.
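
To make the ls.0 / %6Cs.0 / l%73.0 case concrete, here is a rough
sketch.  It assumes a URL-style %XX scheme with uppercase hex digits and
ASCII-only names; nothing in this thread pins the scheme down, so both
are assumptions, and the helper names (spellings, find_matches) and the
directory listing are made up for illustration.  It is not how any
existing man(1) works.

```python
from itertools import product
from urllib.parse import unquote


def spellings(name):
    """Every way to write name when each character is either kept
    literally or replaced by its %XX escape (assumed ASCII-only)."""
    choices = [(ch, "%{:02X}".format(ord(ch))) for ch in name]
    return ["".join(parts) for parts in product(*choices)]


def find_matches(entries, wanted):
    """Return the directory entries that decode to the wanted name."""
    return [e for e in entries if unquote(e) == wanted]


# Hypothetical contents of a cat1/ directory.
entries = ["ls.0", "%6Cs.0", "l%73.0", "cat.0"]

matches = find_matches(entries, "ls.0")
print(matches)                  # three distinct entries decode to "ls.0"
print(len(spellings("ls.0")))   # number of spellings of a 4-character name
```

Scanning the directory and decoding each entry is cheap, but it returns
three candidates here, and nothing in the scheme says which one wins.
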
In most cases, there will be 2^N possible names, where N is the number
of characters (or possibly octets) in the name, fewer if any of the
characters/octets _must_ be encoded, such as / or %.  So, either it has
to try all possible encodings (which will be impractically large; for
example, XtDisplayStringConversionWarning would generate 16 (binary)
billion different names) or it has to read the directory and look for a
name that, after decoding, matches.  I was assuming the latter case and
pointing out that there's the question of what to do if you find
multiple different names that decode to matches.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B