On 24 Sep 2014, at 19:09, Benjamin Pollack <benja...@bitquabit.com> wrote:
> On Wed, 24 Sep 2014 13:03:57 -0400, Sven Van Caekenberghe <s...@stfx.eu>
> wrote:
>
>> Did you read the actual conversation in the issue ?
>>
>> https://pharo.fogbugz.com/f/cases/14054/Issue-with-path-with-accented-characters
>>
>> It has been renamed and there is a fix (as a change set, not as a slice,
>> yet). Basically, there was a primitive call into a plugin that failed to do
>> encoding.
>
> No, I apologize; I missed the bug link. Thanks for reposting it.
>
>> Now regarding the issues you raised. Pharo does not do Unicode
>> canonicalisation or any of that other fancy stuff (like categorisation,
>> proper ordering and so on). This is another orthogonal and way more general
>> issue.
>>
>> Regarding the pathnames encoding: if the OS itself does not know it, how can
>> we ?
>
> That's actually the argument *against* using UTF-8 as the standard Pharo way
> to represent filenames--at least on Unix systems. If Pharo used ByteArrays
> to represent paths, with convenience methods for working with UTF-8 (since I
> do agree that's the most likely thing for a user/dev to want), then you'd be
> able to work with all files no matter what, *and* have a convenient way of
> doing so for the common case.
>
> This is an old discussion, and I do see both sides of it. In terms of SCMs,
> Mercurial and Git both just say "it's a collection of bytes", whereas
> Subversion says "it's Unicode code points." This has some uncomfortable
> implications for both systems when working on multiple platforms.

Benjamin,

I think I understand the concern / situation that you describe. But I fail to see how not-interpreting it and interpreting it in different encodings can work in practice, especially since your point seems to be that there is no meta information that gives a definitive answer.

I would guess that other languages, say Java or Python, have some approach to handle this problem?
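For what it is worth, one approach I am aware of is Python 3's: on Unix it decodes file names with the "surrogateescape" error handler (PEP 383), so bytes that are not valid in the expected encoding survive as lone surrogate code points and can be turned back into the original bytes. The snippet below is only a minimal sketch of that idea, assuming UTF-8 as the expected encoding; the file name is made up for illustration, and this is not a proposal for how Pharo should do it.

```python
# Hypothetical file name stored on disk as Latin-1 bytes ("café"),
# which is NOT valid UTF-8 because of the lone 0xE9 byte.
raw_name = b"caf\xe9"

# Strict UTF-8 decoding rejects it, so a "paths are UTF-8 strings"
# model cannot represent this file at all.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, each undecodable byte 0xNN becomes the lone
# surrogate U+DCNN, so the result is still a str.
text_name = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text_name.encode("utf-8", errors="surrogateescape")

# A well-formed UTF-8 name is unaffected by the handler.
good_name = "café".encode("utf-8").decode("utf-8", errors="surrogateescape")
```

The standard library wraps the same round trip in os.fsdecode and os.fsencode, using the platform's file system encoding. The trade-off is that such a string is a carrier for the bytes rather than proper text: printing it or encoding it strictly still fails.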
Also, since we are living with the current approach without many problems, I think the issue is not terribly pressing.

Sven