On Wed, Nov 18, 2009 at 10:37:01AM +0100, Daniel Näslund wrote: > Hi! > > [[[ > Do not display a binary property diff in the interactive conflict > resolver. > > * subversion/include/svn_io.h > (svn_stream_detect_binary_mimetype): New. Declare. > > * subversion/libsvn_subr/stream.c > (svn_stream_detect_binary_mimetype): Check if atleast 85 % of the > bytes fall within the ascii range of printable characters. > > * subversion/libsvn_wc/props.c > (maybe_generate_propconflict): Call ..binary_mimetype() on the > mergestream. Set the conflict description according to the outcome. > > * subversion/svn/conflict-callbacks.c > (svn_cl__conflict_handler): Print err msg if desc->is_binary when > invoking 'diff-full'. > > Patch by: Daniel Näslund <daniel{_AT_}longitudo.com> > ]]]
svn_stream_detect_binary_mimetype() checked if more than 85 % of the content was "binary", then it set a the mimetype to "application/octetstream". This patch only checked if the mergestream was binary. Since the mergestream contains both diff headers and hunk symbols, small binary content was falsely said to be positive. Furthermore it used a detection algorithm that considered all characters outside the ascii range to be binary. When testing with chinese texts the resolver said that the content was binary. Furthermore, since it only checked the first kb of the mergestream it would not detect a situation where only the last part of the mergestream was binary. The two last bugs should also be a problem in svn_io_detect_mimetype2() from which I extracted the algorithm for detecting binary content. That function is used in the autoprop stuff. In the end: How often do we set a binary property? I'm starting to believe that I'm trying to fix something that doesn't need fixing. Since a property is generally small we could still get a useful diff in some cases where only one of the files is binary. If we escaped the control chars then there would be no harm. /Daniel