On Fri, 19 Jan 2007, Steven D'Aprano wrote:

> On Fri, 19 Jan 2007 12:22:04 +1100, Ben Finney wrote:
>
>> tubby <[EMAIL PROTECTED]> writes:
>>
>>> Silly question, but here goes... what's a good way to determine
>>> when a file is an Open Office document? I could look at the file
>>> extension, but it seems there would be a better way.
>> <snip>
>> The Unix 'file' command determines the type of a file by its
>> contents, not its name. This functionality is essentially a
>> database of "magic" byte patterns mapping to file types,
>
> Ah, another lousy, unreliable way to make a definite statement about
> the actual contents of a file. Looking at magic bytes inside a file
> is hardly bullet-proof (although file seems to be moderately
> reliable in practice, at least under Linux).
>
> Simple example: is the file consisting of two bytes "\x09\x0A" meant
> to be a text file with a tab and a newline, or a binary file
> consisting of a single two-byte int? There's no way to tell just
> from the contents.
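
To make the two points above concrete, here is a rough sketch, not a definitive recipe. The first function shows the same two bytes read both as text and as a big-endian 16-bit integer. The second goes a little beyond magic bytes for the original question: as far as I know, OpenDocument files are zip archives containing an entry named "mimetype" that declares the document type. The function names are made up for illustration, and the prefix check is my assumption about what should count as an Open Office document; it only covers the OpenDocument formats.

```python
import io
import struct
import zipfile

# Assumed prefix: covers OpenDocument types only, nothing older.
ODF_PREFIX = "application/vnd.oasis.opendocument."


def read_two_ways(data):
    """Interpret two bytes as ASCII text and as a big-endian unsigned int."""
    return data.decode("ascii"), struct.unpack(">H", data)[0]


def odf_mimetype(source):
    """Return the mimetype declared inside a zip package, or None.

    `source` may be a filename or a seekable binary file object.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            return archive.read("mimetype").decode("ascii").strip()
    except (zipfile.BadZipFile, KeyError, UnicodeDecodeError):
        # Not a zip archive, no "mimetype" entry, or entry is not ASCII.
        return None


def looks_like_opendocument(source):
    """Guess whether `source` is an OpenDocument file from its contents."""
    mimetype = odf_mimetype(source)
    return mimetype is not None and mimetype.startswith(ODF_PREFIX)


if __name__ == "__main__":
    # Both readings of the same two bytes are equally valid.
    print(read_two_ways(b"\x09\x0a"))

    # Build a tiny stand-in package in memory rather than assuming a file.
    package = io.BytesIO()
    with zipfile.ZipFile(package, "w") as archive:
        archive.writestr("mimetype", ODF_PREFIX + "text")
    print(looks_like_opendocument(package))
```

This is still only a guess from the contents: anyone can build a zip archive with such an entry, so it narrows the question without settling it.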
And see, for example, the problem that development versions of Emacs
are (were?) having with C files that started with #define and were
then treated as graphics files!

http://thread.gmane.org/gmane.emacs.devel/64823/focus=65228

Robert

--
La grenouille songe..dans son château d'eau
Links and things http://rmstar.blogspot.com/