On Fri, Oct 21, 2011 at 11:15:19AM +0200, Tomáš Bihary wrote:
> Hello,
>
> after upgrading the svn 1.7 I realized an issue.
>
> When I add an empty UTF-8 file, it is added with mimetype
> application/octet-stream
> The empty UTF8 file has 3 bytes size - there are just the 3 mark
> bytes 0xEF 0xBB 0xBF

The same happened in svn 1.6 though, didn't it?

We could special-case the UTF-8 BOM in the mime-type detection logic.
See the patch below. Can you try this patch?

Note that the UTF-8 BOM is, as far as I know, only used by Windows.
So this problem should only happen there.

> If there is an content in that file, it is handled correctly like text.
>
> I've found a similar issue 2194 with UTF-16 files which status is REOPENED.

UTF-16 files are always marked binary at the moment.
This is because the internal diff and merge functionality does
not understand UTF-16. As a workaround, you can configure an external
diff/merge tool that understands UTF-16 files:
http://svnbook.red-bean.com/nightly/en/svn.advanced.externaldifftools.html

Index: subversion/libsvn_subr/io.c
===================================================================
--- subversion/libsvn_subr/io.c (revision 1186983)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -2968,6 +2968,13 @@ svn_io_detect_mimetype2(const char **mimetype,
   /* Now close the file. No use keeping it open any more. */
   SVN_ERR(svn_io_file_close(fh, pool));
 
+  if (amt_read == 3 && block[0] == 0xEF && block[1] == 0xBB && block[2] == 0xBF)
+    {
+      /* This is an empty UTF-8 file which only contains the UTF-8 BOM.
+       * Treat it as plain text. */
+      return SVN_NO_ERROR;
+    }
+
   if (svn_io_is_binary_data(block, amt_read))
     *mimetype = generic_binary;
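
For anyone who wants to try the check outside of a Subversion build, here is a rough standalone sketch of the same test. The helper name is_utf8_bom_only is made up for illustration and is not an existing Subversion function; it assumes the caller has already read the start of the file into an unsigned char buffer, as the patched function does with block and amt_read.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Subversion API.
 * Returns 1 if the buffer consists of exactly the three-byte UTF-8 BOM
 * (0xEF 0xBB 0xBF) and nothing else, 0 otherwise. */
static int
is_utf8_bom_only(const unsigned char *block, size_t amt_read)
{
  static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

  /* The length must match exactly: a BOM followed by content is not
   * "empty" and should go through the normal binary detection. */
  return amt_read == sizeof(utf8_bom)
         && memcmp(block, utf8_bom, sizeof(utf8_bom)) == 0;
}
```

A buffer holding only the BOM yields 1. A BOM followed by more bytes, a shorter buffer, or a zero-length buffer yields 0, so those cases still reach the existing detection code.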