On 2006-08-02 17:36:06, Sybren Stuvel wrote:

> IMO it's too bad that "they" chose \r\n as the standard. Having two
> bytes as the end of line marker makes sense on typewriters and
> similarly operating printing equipment.

I may well be mistaken, but I think at the time they set that standard, such equipment was still in use. So it may have been a consideration.

> Nowadays, I think having a single byte as the EOL marker is quite a bit
> clearer.

Rather than thinking in bytes and the like when inserting an EOL marker, inserting a real EOL marker (one that then gets translated by low-level code to the appropriate byte sequence as needed) is probably the less archaic way to do that :)

> On the other hand, with the use of UTF-8 encodings and the like, the
> byte-to-character mapping is gone anyway, so perhaps I should just get
> used to it ;-)

Yes :) "Bytes" is definitely getting too low level. Especially with higher-level languages like Python... there are not many byte manipulation facilities anyway. The language is at a much higher level, and in that sense the classic strings are a bit out of line, it seems.

>> Just as for MS there are good reasons not to "fix" the backslash now
>
> Which are those reasons, except for backward compatibility?

I don't know how many reasons you need besides backward compatibility, but all the DOS (still around!) and Windows apps that would break... ?!? I think breaking that compatibility would be more expensive than the whole Y2K bug story. And don't be fooled... you may run a Linux system, but you'd pay your share of that bill anyway.

> Less FAQs in this group about people putting tabs, newlines and other
> characters in their filenames because they forget to escape their
> backslashes?

Or forget to use raw strings. (If you don't want it to be escaped, please say so :) But similar to what I wrote above about the EOL thing, I think that the whole backslash escape character story is not quite well chosen. In a way, this is a mere C compatibility pain in the neck... (Of course there are implementation and efficiency reasons, mainly because Python is based on C APIs, but all that is as arbitrary as the selection of the backslash as path separator.)
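
To make the raw string point concrete, here is a minimal sketch. The file name is made up for illustration, and the three variable names are just labels for the three spellings:

```python
# Illustrative only: the path below is hypothetical.
escaped = "C:\\temp\\new.txt"   # backslashes doubled by hand
raw = r"C:\temp\new.txt"        # raw string: backslashes are taken literally
forgotten = "C:\temp\new.txt"   # \t and \n silently turn into TAB and newline

print(escaped == raw)                        # True
print("\t" in forgotten, "\n" in forgotten)  # True True
print(len(raw), len(forgotten))              # 15 13
```

The first two spellings give the same 15-character string; the third quietly ends up with a TAB and a newline in it, which is exactly the FAQ mentioned above.
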
There could be other solutions (in Python, I mean). Only accept raw strings in APIs that deal with paths? Force coders to create paths as objects, in a portable way, maybe by removing the possibility to create paths from strings that span more than one level in the path? Or introduce a Unicode character that means "portable path separator"? Or whatever... :)

> Strings and filenames are usually tightly coupled in any program
> handling files, though.

Yes, and that's IMO something from way down in the implementation depths. While file names and paths are strings, not every string is a valid and useful file name or path. This shows that using strings for file names and paths has tradition (coming from low-level languages like C), but IMO is not quite appropriate for a higher abstraction level.

> Almost every programming language I know of uses it as the escape
> character, except for perhaps VB Script and the likes. Not sure about
> the different assembly languages, though.

There are so many languages... and I know so few of them...

http://en.wikipedia.org/wiki/Category:Programming_languages

Now it may be predominant (I still think it's mostly present in languages that are in some way influenced by C), but in the '70s? IIRC, Pascal uses '^' for a similar purpose (not quite the same, but similar). This form is still in ample use in documentation to mean "Ctrl-<char>"; probably much more common than the backslash notation.

> Sure. I've talked more about this specific subject in this thread than
> in the rest of my life ;-)

There's a first for everything :)

> I think cooperation and uniformity can be a very good thing. On the other
> hand, Microsoft want the software written for their platform to stay on
> their platform. That's probably one of the major reasons to remain
> incompatible with other systems.

Probably.
But even if I'd had a say there (and I hate switching between separator characters just as much as the next guy, and possibly do so more than you, given that I work on a Windows system, with slashes in repository paths and URIs), I'm not sure I'd make the jump away from the backslash as path separator. That's just breaking too much code. You don't want to have all these curses directed at you...

Gerhard

-- 
http://mail.python.org/mailman/listinfo/python-list