On Fri, Oct 21, 2011 at 01:20:49PM +0200, Bert Huijben wrote:
>
>
> > -----Original Message-----
> > From: Daniel Shahaf [mailto:[email protected]]
> > Sent: Friday, October 21, 2011 13:13
> > To: Tomáš Bihary
> > Cc: [email protected]
> > Subject: Re: problems with mimetype of and empty utf8 files in svn 1.7
> >
> > What do you expect to happen?
> >
> > As to special-casing svn_io_is_binary_data() to handle 0xEFBBBF
> > correctly... we could do that, I suppose.
>
> +1
> AnkhSVN currently has its own code to remove the binary marking from these
> specific files.
>
> Some 3rd-party Visual Studio features like to add empty files and then later
> fill them with the real data.
>
>
>
> This leaves the case where you have just a few characters in a file where you
> have a BOM at the start, but for our users that case is far less common than
> this empty file case.
>
>
> Bert

Fine, here is my patch again, with a log message.

Can someone run this through the Windows test suite, please? Thanks.
I don't expect this to cause any test failures on *nix.
Manual testing on BSD with files that contain just the UTF-8 BOM
suggests that the patch works fine.

[[[
Special-case empty UTF-8 files which have a UTF-8 BOM. Prevents such
files from being considered binary by default.

* subversion/libsvn_subr/io.c
  (svn_io_detect_mimetype2): If the block read from disk contains only
   a UTF-8 BOM, don't return a binary mimetype but indicate to the caller
   that it should be treated as text.

Reported by: Tomáš Bihary
]]]

Index: subversion/libsvn_subr/io.c
===================================================================
--- subversion/libsvn_subr/io.c (revision 1186983)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -2968,6 +2968,13 @@ svn_io_detect_mimetype2(const char **mimetype,
/* Now close the file. No use keeping it open any more. */
SVN_ERR(svn_io_file_close(fh, pool));
+ if (amt_read == 3 && block[0] == 0xEF && block[1] == 0xBB && block[2] == 0xBF)
+ {
+ /* This is an empty UTF-8 file which only contains the UTF-8 BOM.
+ * Treat it as plain text. */
+ return SVN_NO_ERROR;
+ }
+
if (svn_io_is_binary_data(block, amt_read))
*mimetype = generic_binary;
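
For anyone who wants to try the check outside the tree, here is a minimal
standalone sketch of the same test. The names utf8_bom, is_utf8_bom_only()
and utf8_bom_length() are hypothetical helpers made up for illustration;
they are not part of Subversion, and the patch above does not use them.
The second helper only hints at how the "BOM followed by a few characters"
case could be approached by skipping the BOM before any binary heuristic;
the patch does not handle that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three bytes of the UTF-8 byte order mark. */
static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper, not a Subversion API: return 1 if the LEN bytes
 * at BUF are exactly a UTF-8 BOM and nothing else, 0 otherwise. */
static int
is_utf8_bom_only(const unsigned char *buf, size_t len)
{
  assert(buf != NULL || len == 0);
  return len == sizeof(utf8_bom)
         && memcmp(buf, utf8_bom, sizeof(utf8_bom)) == 0;
}

/* Hypothetical helper, not a Subversion API: return the number of
 * leading bytes to skip (3 if BUF starts with a UTF-8 BOM, else 0). */
static size_t
utf8_bom_length(const unsigned char *buf, size_t len)
{
  assert(buf != NULL || len == 0);
  if (len >= sizeof(utf8_bom)
      && memcmp(buf, utf8_bom, sizeof(utf8_bom)) == 0)
    return sizeof(utf8_bom);
  return 0;
}
```

The buffers are unsigned char on purpose: on platforms where plain char is
signed, a char holding the byte 0xEF compares as a negative value and never
equals the int constant 0xEF, so the comparison in the patch only works if
block is an unsigned type.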