And here's why I was investigating cygutils. I found that d2u wasn't working on a file of mine. Let me demonstrate:
-------snip-------
[EMAIL PROTECTED] /davek/d2utest> ls -la
total 3
drwxr-xr-x+  2 dk Domain U    0 Apr  2 16:41 .
drwx------+ 29 dk Domain U    0 Apr  1 12:38 ..
-rw-r--r--   1 dk Domain U 2902 Mar 31 17:01 stdprint.c
[EMAIL PROTECTED] /davek/d2utest> cp stdprint.c stdprint1.c
[EMAIL PROTECTED] /davek/d2utest> d2u stdprint1.c
stdprint1.c: done.
[EMAIL PROTECTED] /davek/d2utest> ls -la
total 6
drwxr-xr-x+  2 dk Domain U    0 Apr  2 16:41 .
drwx------+ 29 dk Domain U    0 Apr  1 12:38 ..
-rw-r--r--   1 dk Domain U 2902 Mar 31 17:01 stdprint.c
-rw-r--r--   1 dk Domain U 2902 Apr  2 16:41 stdprint1.c
[EMAIL PROTECTED] /davek/d2utest> cat stdprint.c | tr -d '\015' |cat >stdprint2.c
[EMAIL PROTECTED] /davek/d2utest> ls -la
total 9
drwxr-xr-x+  2 dk Domain U    0 Apr  2 16:41 .
drwx------+ 29 dk Domain U    0 Apr  1 12:38 ..
-rw-r--r--   1 dk Domain U 2902 Mar 31 17:01 stdprint.c
-rw-r--r--   1 dk Domain U 2902 Apr  2 16:41 stdprint1.c
-rw-r--r--   1 dk Domain U 2897 Apr  2 16:41 stdprint2.c
-------snip-------

I was pretty stunned to find d2u didn't have the same effect as tr -d: stdprint1.c is still 2902 bytes after d2u, while the tr -d output in stdprint2.c has shrunk to 2897. A few seconds' work in the debugger, however, made the cause clear.

Right inside conv.c, in the main convert (...) function, there's an attempted optimisation. After opening the file for conversion, it reads a char at a time until it finds the first '\n' or '\r' in the whole file. If a '\n' comes first, it assumes the file is in Unix format; if a '\r' comes first, it assumes the file must be in DOS format.

Now, these assumptions are a reasonable enough way of guessing the file format if it hasn't been specified by the command name or a command-line switch, and therefore of deducing which kind of translation is required. But then it checks to see if the guessed format matches the format you've asked it to convert into. If so, it attempts to 'optimise' the conversion by simply not performing it: it closes the file and leaves it untouched.
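To make that logic concrete, here is a minimal sketch of the check as I've described it. This is my own illustration, not the code from conv.c: the names eol_format, guess_format and conversion_skipped are hypothetical, and it works on an in-memory buffer rather than reading the file a char at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; conv.c has its own. */
enum eol_format { EOL_UNKNOWN, EOL_UNIX, EOL_DOS };

/* Guess the file format from the first '\n' or '\r' in the buffer. */
enum eol_format guess_format(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n')
            return EOL_UNIX;    /* LF seen first: assume Unix format */
        if (buf[i] == '\r')
            return EOL_DOS;     /* CR seen first: assume DOS format */
    }
    return EOL_UNKNOWN;         /* no line ending found at all */
}

/* The shortcut: nonzero means "guess matches the target, so do nothing". */
int conversion_skipped(const char *buf, size_t len, enum eol_format target)
{
    return guess_format(buf, len) == target;
}
```

With this sketch, conversion_skipped("a\nb\r\n", 5, EOL_UNIX) is nonzero: the first EOL is a bare '\n', so a d2u-style run would leave the buffer alone even though a "\r\n" follows.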
Unfortunately, there is an extra unstated assumption in between deducing the file type from the first EOL in the file and deducing that you don't need to perform a conversion: that every other line in the file has the same EOL as the first line. That assumption is bogus, and it means that d2u/u2d and friends are no use on files which have mixed EOL types, unless by good chance the very first line has the EOL type that you wish to convert away from.

My attached patch simply removes the attempted optimisation. Like I say, I think it's an invalid shortcut to assume that every line in a file has the same EOL type. I could imagine a case being made for keeping the 'optimisation' and perhaps providing a command-line switch "-f" or "--force" to force full processing of files even if they seem to already be in the right mode. OTOH, I'd say that even if you wanted to keep the optimisation in some cases, it's a dangerous one that can lead to incorrect output, and therefore it should only be switched on when the user deliberately adds a command-line option, rather than being on by default with a switch to disable it.

    cheers,
      DaveK
--
Can't think of a witty .sigline today....
conv-patch.diff
Description: Binary data