On Mon, Jul 30, 2012 at 1:44 AM, 김성훈2 [duinggul] <duing...@nexon.co.kr> wrote:
> Hi, I’m using Subversion everyday. :-)
>
> Recently, I converted native text files (in my company’s project
> repository) to UTF-16 files.
>
> The problem was that Subversion library does not support ‘blame of
> UTF-16 file’ currently.
>
> So I checked out Subversion’s source code
>
> And modified ‘libsvn_client\blame.c’ file to support ‘blame of UTF-16
> file’.
>
> Brief idea of implementation is :
>
> Export current file (in svn temp directory) to UTF-8 file if current
> file’s format is UTF-16,
>
> shortly before processing blame for current file.
>
> It’s a temporary implementation rather than formal implementation,
>
> But I think the implementation can be used temporarily before formal
> implementation of ‘blame of UTF-16 file’ is made. :-)
>
> So I attach ‘blame.c’ file for reference. :-)
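
[ For readers following along: the idea quoted above (export the file to a UTF-8 copy when it is UTF-16, shortly before blame runs) could be sketched roughly as follows. This is an illustrative Python sketch only, not the attached blame.c change. The function names looks_like_utf16 and export_as_utf8 are hypothetical, and detecting UTF-16 by its byte-order mark is an assumption, since the message does not say how the format is detected. ]

```python
import codecs
import os
import tempfile


def looks_like_utf16(data: bytes) -> bool:
    """Assumption: treat a leading UTF-16 byte-order mark as the signal.

    Limitation: a UTF-32 LE BOM also begins with FF FE, and UTF-16 files
    without a BOM are not detected at all.
    """
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


def export_as_utf8(path: str) -> str:
    """Return a path whose contents a byte-oriented diff/blame can handle.

    If the file starts with a UTF-16 BOM, write a UTF-8 copy to a temporary
    file and return that path; otherwise return the original path unchanged.
    """
    with open(path, "rb") as src:
        data = src.read()

    if not looks_like_utf16(data):
        return path

    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = data.decode("utf-16")

    fd, tmp_path = tempfile.mkstemp(suffix=".utf8")
    with os.fdopen(fd, "wb") as dst:
        # Binary mode: line endings are passed through untouched.
        dst.write(text.encode("utf-8"))
    return tmp_path
```

Blame would then run against the returned path instead of the original, and the caller would be responsible for deleting the temporary copy afterwards. Every revision of the file that blame compares would need the same treatment, not just the newest one.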

[ Could you send a patch against trunk, instead of the full blame.c file?
Please also take a look at
http://subversion.apache.org/docs/community-guide/ in general and at
http://subversion.apache.org/docs/community-guide/general.html#patches
in particular. I'm continuing below for the sake of having some
discussion around this, regardless of the details of your patch. ]

I think yours is an interesting approach, at least worth some
discussion :-). I'm not a UTF-16 user myself, and I'm definitely not
an expert in encoding matters, but I can certainly empathize with
attempts to make Subversion work nicely with UTF-16.

There seems to have been some discussion around UTF-16 support in SVN
in 2005, after this issue was filed:

http://subversion.tigris.org/issues/show_bug.cgi?id=2194
(Support Unicode encodings other than UTF8 as plain text)

The issue links to a couple of old discussion threads. For instance,
this thread highlights several areas where specific support would
have to be added:

http://svn.haxx.se/users/archive-2005-01/0287.shtml

[[[
* Diff
* Merge
* Keyword expansion
* Newline conversion
* Text/binary discrimination
... any others not thought about here?
]]]

Here you are taking on "blame" (which is really just a series of
diffs). It's interesting that this can be done with so little effort,
just by converting to UTF-8 at the client layer.

I'm not sure myself whether that's an appropriate approach, even as a
temporary solution. On the other hand, there seems to have been zero
progress on this issue since 2005, so it might be worth thinking
outside the box and looking at some lightweight approaches that can
give UTF-16 users some improvement. If this can be done incrementally,
by adding conversions to specific subcommands one at a time ... why
not?

--
Johan