Dear diary, on Sun, Apr 17, 2005 at 08:13:59PM CEST, I got a letter
where "David A. Wheeler" <[EMAIL PROTECTED]> told me that...
> On Sun, 17 Apr 2005, Russell King wrote:
> >>BTW, there appears to be "errors" in the history committed thus far.
> >>I'm not sure where this came from though. Some of them could be
> >>UTF8 vs ASCII issues, ....
> ...
> >>One thing which definitely needs to be considered is - what character
> >>encoding are the comments to be stored as?
>
> Linus Torvalds replied:
> >To git, it's just a byte stream, and you can have binary comments if you
> >want to. I personally would prefer to move towards UTF eventually, but I
> >really don't think it matters a whole lot as long as 99.9% of everything
> >we'd see there is still 7-bit ascii.
>
> I would _heartily_ recommend moving towards UTF-8 as the
> internal charset for all comments. Alternatives are possible
> (e.g., recording the charset in the header), but they're
> incredibly messy. Even if you don't normally work in UTF-8,
> it's pretty easy to set most editors up to read & write UTF-8.
> Having the data stored as a constant charset eliminates
> a raft of error-prone code.
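
To make the "constant charset" idea quoted above concrete, here is a minimal sketch of normalizing a comment to UTF-8 before it is stored. It is only an illustration, not anything git itself does: the function name normalize_comment and the fallback_encoding parameter are hypothetical, and the fallback stands in for whatever charset the committer's locale happens to use.

```python
def normalize_comment(raw: bytes, fallback_encoding: str = "latin-1") -> bytes:
    """Return the comment as UTF-8 bytes.

    Hypothetical helper: bytes that already decode as UTF-8 (which
    includes plain 7-bit ASCII) pass through untouched; anything else is
    assumed to be in fallback_encoding and is re-encoded as UTF-8.
    """
    try:
        raw.decode("utf-8")  # validity check only; result is discarded
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback_encoding).encode("utf-8")
```

Since the stored bytes are always UTF-8 under this scheme, a reader never has to guess or look up a per-comment charset; the conversion happens once, on the way in.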

I tend to agree here. My toilet stuff is what can handle various
locale-based conversions at the commit-tree / cat-file tree sides etc.,
but UTF-8 should certainly be the way to go internally. Not that the
plumbing should actually _care_ at all; anyone who uses it should take
that care, so this is more of a "social" thing.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor