On 2010-10-15, Martin Gregorie <mar...@address-in-sig.invalid> wrote:
>> On 2010-10-15, Martin Gregorie <mar...@address-in-sig.invalid> wrote:
>>> On Fri, 15 Oct 2010 17:02:07 +0000, Grant Edwards wrote:
>>>> On 2010-10-15, Steven D'Aprano <st...@remove-this-cybersource.com.au>:
>>>>
>>>>> In the Unix world, which includes OS X, text tools tend to have
>>>>> difficulty with tabs. Or try naming a file with a newline or carriage
>>>>> return in the file name, or a NULL byte.
>>>>
>>>> How do you create a file with a name that contains a NULL byte?
>>>
>>> Use a language or program that doesn't use null-terminated strings.
>>>
>>> Its quite easy in many BASICs, [...]
>>
>> I don't see what the in-program string representation has to do with
>> it. The Unix system calls that create files only accept NULL
>> terminated strings for the path parameter.
>
> Well, obviously you can't have null in a filename if the program is
> using null-terminated strings.

Obviously. Just as obviously, you can't have a null in a filename if
the OS filesystem API uses null-terminated strings -- which the Linux
filesystem API does. I just verified that by looking at the kernel
sources -- I can post the relevant code if you like. I'm pretty sure
all the other Unices are the same. I've got BSD sources lying around
somewhere...

>> Are you saying that there are BASIC implementations for Unix that
>> create Unix files by directly accessing the disk rather than using
>> the Unix system calls?
>
> I'm saying that the only BASIC implementations I've looked at the
> guts of have used count-delimited strings. None were on *nixen but
> its a safe bet that if they were ported to a UNIX they'd retain their
> count-delimited nature.

And I'm saying _that_doesn't_matter_. The _OS_ uses NULL-terminated
strings. You can use a language that represents strings as braille
images encoded as in-memory PNG files if you want. That still doesn't
let you create a Unix file whose name contains a NULL byte.

> Another language that will certainly do this is COBOL, which only
> uses fixed length, and therefore undelimited, strings.

Again, what difference does it make? If the OS uses null-terminated
strings for filenames, what difference does it make how the user-space
program represents filenames internally?

> The point I'm making is that in both fixed length and counted string
> representations you can put any character value at all into the
> string unless whatever mechanism you're using to read in the values
> recognises something, i.e. TAB, CR, LF, CRLF as a delimiter, and even
> then the program can generate a string containing arbitrary
> gibberish.

I don't care how the program represents strings. The OS doesn't care.
The filesystem doesn't care.

Please explain how to pass a filename containing a NULL byte to a Unix
syscall like creat() or open().
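
To make the challenge concrete, here is a minimal sketch in Python 3,
assuming a Linux-like system where ctypes.CDLL(None) exposes libc's
creat(). The helper names try_python_open() and creat_via_libc() are
made up for illustration. The first shows Python itself refusing a path
with an embedded null byte; the second hands a counted bytes object
containing a null straight to creat(), which should stop reading the
name at the null.

```python
import ctypes
import os
import tempfile


def try_python_open(path):
    """Ask Python to create `path`; return the exception if it refuses."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    except ValueError as exc:
        # Python 3 rejects the path before any syscall is made.
        return exc
    os.close(fd)
    return None


def creat_via_libc(directory, name):
    """Call libc's creat() with `name` (bytes, length-counted on the
    Python side) and return the resulting directory listing."""
    # Assumption: CDLL(None) resolves libc symbols, as it does on Linux.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.creat.argtypes = [ctypes.c_char_p, ctypes.c_uint]
    libc.creat.restype = ctypes.c_int

    path = os.fsencode(directory) + b"/" + name
    # ctypes passes a pointer to the buffer; creat() only ever sees the
    # bytes up to the first null.
    fd = libc.creat(path, 0o600)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "creat() failed")
    os.close(fd)
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(repr(try_python_open(os.path.join(scratch, "foo\0bar"))))
        print(creat_via_libc(scratch, b"foo\0bar"))
```

The name held in user space is a 7-byte counted string, but the only
file that should show up in the scratch directory is "foo". The counted
representation never reaches the filesystem.
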
You don't even have to use the C library API -- feel free to use the
real syscall API for whatever Unix on whatever architecture you want.

> If you then use the string as a file name you can end up with a file
> that can't be accessed or deleted if the name flouts the OS's file
> naming conventions. I've done it in the past with BASIC programs and
> finger trouble under FLEX09 and CP/M. In both cases I had to use a
> disk editor to fix the file name before the file could be deleted or
> accessed.

We're talking about Unix. We're not talking about CP/M, DOS, RSX-11m,
Apple-SOS, etc.

-- 
Grant Edwards               grant.b.edwards        Yow! I put aside my copy
                                  at               of "BOWLING WORLD" and
                              gmail.com            think about GUN CONTROL
                                                   legislation...
-- 
http://mail.python.org/mailman/listinfo/python-list