On Saturday, June 8, 2002, at 08:13 , David T-G wrote:

> drieux, et al --
>
> ...and then drieux said...
> % On Saturday, June 8, 2002, at 04:47 , David T-G wrote:
[..]
> % > ...
> % >   chop ; chop ; # strip \n\r (no chomp here)
> ...
> %
> % that is way too weird - since that means that what
> % ever dos app you were using was not complying with
> % the standard to begin with.
>
> Tell me about the standard... Should perl happily chomp either a UNIX or
> a DOS (or even a MAC) line? Or do I turn around and explain it below,
> answering myself?

the canon is: EOL - end of line - is denoted as

    mac: <CR>     : chr(13)
    dos: <CR><NL> : chr(13)chr(10)
    nix: <NL>     : chr(10)

note what happens:

    vladimir: 64:] echo line> file
    vladimir: 65:] unix2dos file file.dox
    could not open /dev/kbd to get keyboard type US keyboard assumed
    could not get keyboard type US keyboard assumed
    vladimir: 66:] od -c !$
    od -c file.dox
    0000000   l   i   n   e  \r  \n
    0000006
    vladimir: 67:]

if you check the stty man pages you will find our friend

    onlcr

that does the mapping of NL to CR-NL on output - we still have the old
cross over problem here that what unix folks use as \n is the "new line"
token - but which by way of stty goes out to their 'terminal type' as
CR followed by NL - the "\r" returns the carriage head to the beginning
of the line and the NL then shifts the roller up one.

otherwise if you have merely the new line you start typing here.
If you have merely the CR - you would start writing over the line.

Hence to have "\n\r" would mean having implemented the standard
for the EOL token to the file 'underappropriately' - both characters
are there, 'technically literally', but in the wrong order - and it
would 'still work' in the case of those systems that know how to parse
them correctly, since it really does not matter to a teletype which
order the commands are generated - they will read them off the wire as
commands and execute them...

{ note you should send three BEL tokens for the start and stop
of any message - but that has fallen out of habit.... and no one
seems to worry about taking them out of the data stream, or remembering
to put them in either... }

[..]
> (you know, it can be a real challlenge to write a one-liner!) and found
> that I have either RL or L for all files, and no \n\r as I had thought,
[..]

the problem here is that chomp is defined on the host you are on,
not on the host where you once were..... it strips whatever $/ holds,
and that defaults to the local "\n" - so on a unix box a dos line comes
out of chomp still carrying its \r. it's a reasonable compromise
in that case...

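a minimal sketch of that chomp point, assuming a host where "\n" is chr(10) and $/ is left at its default. the octal escapes \015 and \012 are used so the byte values do not depend on the host, and strip_eol is a made-up helper name for illustration, not anything out of core perl:

```perl
use strict;
use warnings;

# The three EOL conventions from the table above, spelled with octal
# escapes so the bytes are the same on every host.
my %eol = (
    mac => "\015",        # <CR>     : chr(13)
    dos => "\015\012",    # <CR><NL> : chr(13)chr(10)
    nix => "\012",        # <NL>     : chr(10)
);

# Hypothetical helper: strip any trailing run of CR/NL characters,
# whichever convention produced them.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\015\012]+\z//;
    return $line;
}

# chomp only removes the current value of $/ (assumed here to be the
# default "\n"), so the CR of a dos line survives it.
my $dos_line = "line" . $eol{dos};
chomp(my $chomped = $dos_line);

printf "after chomp:     %d chars\n", length $chomped;
printf "after strip_eol: %d chars\n", length strip_eol($dos_line);
```

under those assumptions the chomped copy is still five characters long ("line" plus the CR), while strip_eol gets it down to four.
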
where you have to get your poop in a group on this point is as
you move into 'network layer plays' - such as HTTP - unless you are
using the appropriate modules to do this stuff for you - and you
find that the RFC for http defines the end of line as <CR><LF>, with
an empty line - a <CR><LF> with nothing in front of it - separating
the head from the body - cf:

    http://www.w3.org/Protocols/rfc2068/rfc2068

section 2.2 to be specific - where they call out the decimal
values for them in the ASCII table....

{ may I recommend that you use the CPAN modules - hand cranking
this stuff from the IO::Socket layer - while it is what some of us
did, it is not what I would recommend now.... but yes, the original
code I ripped had the sort of 'oh look, we have that <CR><LF> hence
we are out of header and the rest is body....' sort of coding...}

[..]
> this would have really screwed me as I got way down into my lists :-)

yes... not that I would wish to impose some 'puritanical morality'
on how you relate to yourself..... but in the coding space, I would
wish to impose a sense of THAT WILL HURT YOU!

> So now I should be able to put
>
>   ...
>   while(<>)
>   {
>     s/($cr|$lf)+//;
>   ...
>
> into my code and basically make my own chomp, right? Time to go off and
> test...

test that - it will strip the first run of CRs and LFs it finds,
which on a line fresh off <> is the one at the end - but I think
the tradition is

    s/[$cr$lf]+$//

where the [ ] block off a set of characters, any one of which may
match - so no "|" in there: inside the [ ] a "|" is just a literal
pipe, the expected 'or me' only works inside the round braces - the
"+" denoting one or more of these, and the "$" pinning the match to
the end of the line.

{ in this case you want that, helps the compiler not worry about
looking for the cases of 'not me' - and you nest all of that in the
round braces only when you want to denote the 'yes, this pattern, do
something with it!' }

[..]

ciao
drieux
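
p.s. a rough sketch of the regex and header points, for whoever wants to test rather than take my word for it. all four sub names are made up for illustration, $cr and $lf are assumed to hold chr(13) and chr(10), and split_http is only a toy - it is not a replacement for the CPAN modules:

```perl
use strict;
use warnings;

my ($cr, $lf) = ("\015", "\012");    # assumed values: chr(13), chr(10)

# Alternation in round braces, unanchored: strips the FIRST run found.
sub strip_alternation { my ($s) = @_; $s =~ s/($cr|$lf)+//;   return $s }

# Character class anchored at the end of the string.
sub strip_class       { my ($s) = @_; $s =~ s/[$cr$lf]+\z//;  return $s }

# Same class with a "|" inside: the pipe is a literal member of the set.
sub strip_class_pipe  { my ($s) = @_; $s =~ s/[$cr|$lf]+\z//; return $s }

# Toy header/body split on the empty line, i.e. <CR><LF><CR><LF>.
sub split_http {
    my ($msg) = @_;
    return split /\015\012\015\012/, $msg, 2;
}

# Make the control characters visible when printing.
sub show {
    my ($s) = @_;
    $s =~ s/\015/\\r/g;
    $s =~ s/\012/\\n/g;
    return $s;
}

print show(strip_alternation("line$cr$lf")),     "\n";
print show(strip_alternation("a${cr}b$cr$lf")),  "\n";   # stray CR mid-line
print show(strip_class("a|b|$cr$lf")),           "\n";
print show(strip_class_pipe("a|b|$cr$lf")),      "\n";

my ($head, $body) = split_http("Host: example$cr$lf$cr${lf}the body");
print show($head), " / ", show($body), "\n";
```

the second call shows why the anchor matters: the unanchored version eats the stray CR in the middle and leaves the real line ending in place. the fourth shows the literal pipe: a trailing "|" in the data goes out with the line ending.
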