2012/2/17 Vincent Lefevre :
> On 2012-02-17 13:54:35 +0900, Hiroaki Nakamura wrote:
>> Actually, whether filename is in NFC or NFD depends on the way of
>> inputting filenames.
>> If you type all characters, it is in NFC.
>
> No, or actually, perhaps this depends on the user configuration
> (e.g. k
On 2012-02-17 13:54:35 +0900, Hiroaki Nakamura wrote:
> Actually, whether filename is in NFC or NFD depends on the way of
> inputting filenames.
> If you type all characters, it is in NFC.
No, or actually, perhaps this depends on the user configuration
(e.g. keyboard configuration / input method).
2012/2/17 Vincent Lefevre :
> On 2012-01-30 21:29:41 +0100, Stefan Sperling wrote:
>> On Mon, Jan 30, 2012 at 09:09:22PM +0100, Branko Čibej wrote:
>> > Are you seriously proposing that we /support/ such broken, hackish
>> > nonsense? How do you expect users to tell the difference between file
>> >
On 2012-01-30 21:29:41 +0100, Stefan Sperling wrote:
> On Mon, Jan 30, 2012 at 09:09:22PM +0100, Branko Čibej wrote:
> > Are you seriously proposing that we /support/ such broken, hackish
> > nonsense? How do you expect users to tell the difference between file
> > names that look identical on the
On 12 feb 2012, at 16:59, Stefan Sperling wrote:
> On Sun, Feb 12, 2012 at 04:47:45PM +0100, Thomas Åkesson wrote:
>> Would it make sense to formalize the different approaches into a
>> couple of RFCs attempting to summarize the respective implications of
>> each approach? I could try to write on
On Sun, Feb 12, 2012 at 04:47:45PM +0100, Thomas Åkesson wrote:
> Would it make sense to formalize the different approaches into a
> couple of RFCs attempting to summarize the respective implications of
> each approach? I could try to write one up for the "Non-normalizing
> approach".
Detailed de
On 11 feb 2012, at 13:10, Hiroaki Nakamura wrote:
> Hi,
>
> 2012/2/9 Thomas Åkesson :
>> Hi,
>> I have been interested in this issue for a couple of years and I remember it
>> was discussed briefly at Subconf in Germany a couple of years ago.
>>
>> Branching the thread here because I'd like to
2012/2/11 Branko Čibej :
> On 11.02.2012 13:05, Hiroaki Nakamura wrote:
>> 2012/2/9 Markus Schaber :
>>> From: Stefan Sperling [mailto:s...@elego.de]
>>> On Thu, Feb 09, 2012 at 12:20:14AM +0900, Hiroaki Nakamura wrote:
> [Upgrade options / backwards compatibility for proposed unicode
> nor
On 11.02.2012 13:05, Hiroaki Nakamura wrote:
> 2012/2/9 Markus Schaber :
>> Hi,
>>
>> From: Stefan Sperling [mailto:s...@elego.de]
>>
>> On Thu, Feb 09, 2012 at 12:20:14AM +0900, Hiroaki Nakamura wrote:
[Upgrade options / backwards compatibility for proposed unicode
normalization fix]
>>>
Hi,
2012/2/9 Thomas Åkesson :
> Hi,
> I have been interested in this issue for a couple of years and I remember it
> was discussed briefly at Subconf in Germany a couple of years ago.
>
> Branching the thread here because I'd like to propose a different approach
> than Hiroaki. This proposition
2012/2/9 Markus Schaber :
> Hi,
>
> From: Stefan Sperling [mailto:s...@elego.de]
>
> On Thu, Feb 09, 2012 at 12:20:14AM +0900, Hiroaki Nakamura wrote:
>> > [Upgrade options / backwards compatibility for proposed unicode
>> > normalization fix]
>
>> - Need to re-checkout existing working copies of t
Hi,
I have been interested in this issue for a couple of years and I remember it
was discussed briefly at Subconf in Germany a couple of years ago.
Branching the thread here because I'd like to propose a different approach than
Hiroaki. This proposition is not very different from the note
"uni
Hiroaki Nakamura wrote on Thu, Feb 09, 2012 at 07:16:57 +0900:
> 2012/2/9 Stefan Sperling :
> > - What happens if NFC/NFD is enabled in repository config, but the
> > repository contains non-normalised paths (i.e. did not go through
> > a dump/load cycle to normalise all paths)?
>
> I think w
Hi, thanks for your review.
2012/2/9 Stefan Sperling :
> Open questions:
Here I try to answer these. Of course, I welcome everyone to answer.
>
> - How can the client retrieve the configuration from the server?
> This is related to server-dictated configuration, see
> http://wiki.apache.org
On Thu, Feb 09, 2012 at 12:20:14AM +0900, Hiroaki Nakamura wrote:
> 2012/1/30 Stefan Sperling :
> > I think the following caveats would be acceptable if they help
> > with fixing the issue:
> >
> > - An upgrade path which optionally requires people to check all
> > working copies out again, when
2012/1/30 Stefan Sperling :
> I think the following caveats would be acceptable if they help
> with fixing the issue:
>
> - An upgrade path which optionally requires people to check all
> working copies out again, when either the server or the client is upgraded.
> Note again, this must be _op
On 07.02.2012 15:00, Stefan Sperling wrote:
> On Tue, Feb 07, 2012 at 02:43:19PM +0100, Branko Čibej wrote:
>> The client-side mapping table is a more general solution, if a
>> lot harder to implement.
>>
>> But it brings additional benefits in that we could use it to, e.g.,
>> transliterate charac
On Tue, Feb 07, 2012 at 02:43:19PM +0100, Branko Čibej wrote:
> The client-side mapping table is a more general solution, if a
> lot harder to implement.
>
> But it brings additional benefits in that we could use it to, e.g.,
> transliterate characters that are allowed by some file systems, but no
On 07.02.2012 14:30, Hiroaki Nakamura wrote:
> 2012/2/7 Branko Čibej :
>> On 06.02.2012 22:26, Hiroaki Nakamura wrote:
>>> The Unicode Standard says canonical equivalent sequences should be
>>> interpreted the same way.
>>> * 1.1 Canonical and Compatibility Equivalence
>>> http://unicode.org/repo
2012/2/7 Branko Čibej :
> On 06.02.2012 22:26, Hiroaki Nakamura wrote:
>> The Unicode Standard says canonical equivalent sequences should be
>> interpreted the same way.
>> * 1.1 Canonical and Compatibility Equivalence
>> http://unicode.org/reports/tr15/#Canonical_Equivalence
>> * 2.12 Equivalent
On Tue, Feb 07, 2012 at 06:26:54AM +0900, Hiroaki Nakamura wrote:
> 2012/2/6 Stefan Sperling :
> > 2) Do something else that affects repositories, too, and provide
> > a clean upgrade path for everyone (servers and clients).
> > AFAIK nobody has made a suggestion as to what could be done her
On 06.02.2012 22:26, Hiroaki Nakamura wrote:
> The Unicode Standard says canonical equivalent sequences should be
> interpreted the same way.
> * 1.1 Canonical and Compatibility Equivalence
> http://unicode.org/reports/tr15/#Canonical_Equivalence
> * 2.12 Equivalent Sequences and Normalization
>
2012/2/6 Stefan Sperling :
> On Mon, Feb 06, 2012 at 02:28:40PM +0100, Branko Čibej wrote:
>> On 06.02.2012 14:10, Hiroaki Nakamura wrote:
>> > Hi, all.
>> >
>> > It seems there is no further discussion.
>> >
>> > I think the conclusion for the short term solution is:
>> > We convert unnormalized p
On Mon, Feb 06, 2012 at 02:28:40PM +0100, Branko Čibej wrote:
> On 06.02.2012 14:10, Hiroaki Nakamura wrote:
> > Hi, all.
> >
> > It seems there is no further discussion.
> >
> > I think the conclusion for the short term solution is:
> > We convert unnormalized paths to NFC normalized paths on clie
On 06.02.2012 14:10, Hiroaki Nakamura wrote:
> Hi, all.
>
> It seems there is no further discussion.
>
> I think the conclusion for the short term solution is:
> We convert unnormalized paths to NFC normalized paths on clients only,
> that is, svn_path_cstring_to_utf8.
>
> It is the same approach a
Hi, all.
It seems there is no further discussion.
I think the conclusion for the short term solution is:
We convert unnormalized paths to NFC normalized paths on clients only,
that is, svn_path_cstring_to_utf8.
It is the same approach as utf8precompose_macosx_2.patch in
http://subversion.tigris.
2012/2/3 Julian Foad :
> You may well be correct that NFC is never longer than NFD, but that's not the
> question. The question is whether NFC may be longer than the current paths
> (which are not normalized to normalization form C or to form D). And the
> answer is yes it may be longer. See
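
A minimal sketch of the length question, assuming Python's standard
unicodedata module. The helper names utf8_len and nfc_growth are
hypothetical, and the sample characters are illustrative, not taken from
any repository discussed in this thread. It measures how many UTF-8 bytes
a path gains or loses when it is converted to NFC:

```python
import unicodedata


def utf8_len(text):
    """Length of `text` in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def nfc_growth(path):
    """Bytes gained (positive) or lost (negative) by converting `path` to NFC."""
    return utf8_len(unicodedata.normalize("NFC", path)) - utf8_len(path)


# U+0958 DEVANAGARI LETTER QA is a composition exclusion: NFC replaces it
# with U+0915 U+093C, so a 3-byte string becomes a 6-byte string.
grows = nfc_growth("\u0958")

# "e" followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9,
# so a 3-byte string becomes a 2-byte string.
shrinks = nfc_growth("e\u0301")
```

The first case is the one that matters for in-place conversion: a stored
path that is in neither NFC nor NFD can need more bytes after
normalization than it occupies today.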
Hiroaki Nakamura wrote:
>>> It would be nice if we could normalize paths in the repository without
>>> having to perform a dump/reload cycle, but I don't know how that
>>> would work in FSFS.
>>
>> It won't. Changing the encoding increases the length (in bytes) of the
>> string (in the dire
On Thu, Feb 2, 2012 at 10:59 PM, Hiroaki Nakamura wrote:
> 2012/2/3 Peter Samuelson :
>>
>>> On 02.02.2012 20:22, Peter Samuelson wrote:
>>> > By proposing a client-only solution, I hope to avoid _all_ those
>>> > questions.
>>
>> [Branko Cibej]
>>> Can't see how that works, unless you either make
2012/2/3 Peter Samuelson :
>
>> On 02.02.2012 20:22, Peter Samuelson wrote:
>> > By proposing a client-only solution, I hope to avoid _all_ those
>> > questions.
>
> [Branko Cibej]
>> Can't see how that works, unless you either make the client-side
>> solution optional, create a mapping table, or m
2012/2/3 Peter Samuelson :
>
> [Hiroaki Nakamura]
>> Existing repositories, I think it would be better to convert them too using
>> svndump/svnload. And we change svnload to convert filenames to NFC.
>> However in reality we cannot force users to convert every existing
>> repository.
>
> Also note
On 02.02.2012 21:28, Hiroaki Nakamura wrote:
> 2012/2/3 Branko Čibej :
>> On 02.02.2012 20:59, Hiroaki Nakamura wrote:
>>> So we need to change servers too. When servers read filenames from
>>> repositories, they first convert to NFC and then process commands.
>> That won't work. You have to do the
Hiroaki Nakamura wrote on Fri, Feb 03, 2012 at 05:33:02 +0900:
> 2012/2/3 Daniel Shahaf :
> > Branko Čibej wrote on Thu, Feb 02, 2012 at 21:03:47 +0100:
> >> On 02.02.2012 20:22, Peter Samuelson wrote:
> >> > [Hiroaki Nakamura]
> >> >> In option (2), we do n12n on all clients on all platforms, and
> On 02.02.2012 20:22, Peter Samuelson wrote:
> > By proposing a client-only solution, I hope to avoid _all_ those
> > questions.
[Branko Cibej]
> Can't see how that works, unless you either make the client-side
> solution optional, create a mapping table, or make name lookup on the
> server agno
2012/2/3 Daniel Shahaf :
> Branko Čibej wrote on Thu, Feb 02, 2012 at 21:03:47 +0100:
>> On 02.02.2012 20:22, Peter Samuelson wrote:
>> > [Hiroaki Nakamura]
>> >> In option (2), we do n12n on all clients on all platforms, and we
>> >> include web_dav_svn in "clients". So we convert all input paths
[Hiroaki Nakamura]
> Existing repositories, I think it would be better to convert them too using
> svndump/svnload. And we change svnload to convert filenames to NFC.
> However in reality we cannot force users to convert every existing repository.
Also note that if you convert a repository (via d
2012/2/3 Branko Čibej :
> On 02.02.2012 20:59, Hiroaki Nakamura wrote:
>> So we need to change servers too. When servers read filenames from
>> repositories, they first convert to NFC and then process commands.
>
> That won't work. You have to do the initial lookup in a
> normalization-agnostic way
Branko Čibej wrote on Thu, Feb 02, 2012 at 21:03:47 +0100:
> On 02.02.2012 20:22, Peter Samuelson wrote:
> > [Hiroaki Nakamura]
> >> In option (2), we do n12n on all clients on all platforms, and we
> >> include web_dav_svn in "clients". So we convert all input paths to
> >> the "server encoding",
On 02.02.2012 20:59, Hiroaki Nakamura wrote:
> So we need to change servers too. When servers read filenames from
> repositories, they first convert to NFC and then process commands.
That won't work. You have to do the initial lookup in a
normalization-agnostic way, and neither BDB nor FSFS makes
On 02.02.2012 20:22, Peter Samuelson wrote:
> [Hiroaki Nakamura]
>> In option (2), we do n12n on all clients on all platforms, and we
>> include web_dav_svn in "clients". So we convert all input paths to
>> the "server encoding", which is NFC.
> Indeed. But the very concept of a "server encoding"
2012/2/3 Peter Samuelson :
>
> [Hiroaki Nakamura]
>> In option (2), we do n12n on all clients on all platforms, and we
>> include web_dav_svn in "clients". So we convert all input paths to
>> the "server encoding", which is NFC.
>
> Indeed. But the very concept of a "server encoding" means we are
[Hiroaki Nakamura]
> In option (2), we do n12n on all clients on all platforms, and we
> include web_dav_svn in "clients". So we convert all input paths to
> the "server encoding", which is NFC.
Indeed. But the very concept of a "server encoding" means we are
involving the server side. Which in
[reordering the conversation flow slightly]
[Peter Samuelson]
> > That's the implementation I would like to see, to be honest. Start
> > with the observation that we can treat Mac OS X NFD paths as a
> > client character encoding. Now observe that it is lossy. But
> > ... almost all non-Unic
On 31.01.2012 02:47, Bert Huijben wrote:
> Last time we discussed this in depth (a few years ago), Windows didn't
> perform the normalization you describe here.
> Was this added later? (Any documentation pointers?)
Ouch, you're right ... Windows API doesn't normalize the paths.
-- Brane
> -----Original Message-----
> From: Branko Čibej [mailto:br...@xbc.nu]
> Sent: Monday, 30 January 2012 16:11
> To: dev@subversion.apache.org
> Subject: Re: Let's discuss about unicode compositions for filenames!
>
> On 31.01.2012 00:14, Peter Samuelson wrote:
> &
On 31.01.2012 00:14, Peter Samuelson wrote:
> [Stefan Sperling]
>> It is indeed harder because we are passing paths verbatim to sqlite.
>> I doubt having more than one form of a given path in wc.db is fun...
> That's the implementation I would like to see, to be honest. Start
> with the observatio
[Stefan Sperling]
> It is indeed harder because we are passing paths verbatim to sqlite.
> I doubt having more than one form of a given path in wc.db is fun...
That's the implementation I would like to see, to be honest. Start
with the observation that we can treat Mac OS X NFD paths as a client
On Mon, Jan 30, 2012 at 09:34:03PM +0100, Branko Čibej wrote:
> Sure, if you want to turn on such normalization, you pretty much have to
> dump and reload the repository as well as upgrading all working copies
> (again). Either that, or use form-independent comparison on the server,
> which isn't s
On 30.01.2012 21:29, Stefan Sperling wrote:
> On Mon, Jan 30, 2012 at 09:09:22PM +0100, Branko Čibej wrote:
>> Are you seriously proposing that we /support/ such broken, hackish
>> nonsense? How do you expect users to tell the difference between file
>> names that look identical on the character le
On Mon, Jan 30, 2012 at 09:09:22PM +0100, Branko Čibej wrote:
> Are you seriously proposing that we /support/ such broken, hackish
> nonsense? How do you expect users to tell the difference between file
> names that look identical on the character level, but are not on the
> code point level?
>
> S
On Mon, Jan 30, 2012 at 9:09 PM, Branko Čibej wrote:
> On 30.01.2012 21:00, Johan Corveleyn wrote:
>> On Mon, Jan 30, 2012 at 8:10 PM, Stefan Sperling wrote:
>>> On Tue, Jan 31, 2012 at 01:42:21AM +0900, Hiroaki Nakamura wrote:
>>>> 2012/1/30 Stefan Sperling :
>> [ ... ]
>>
>>> And mixing various
On 30.01.2012 21:00, Johan Corveleyn wrote:
> On Mon, Jan 30, 2012 at 8:10 PM, Stefan Sperling wrote:
>> On Tue, Jan 31, 2012 at 01:42:21AM +0900, Hiroaki Nakamura wrote:
>>> 2012/1/30 Stefan Sperling :
> [ ... ]
>
>> And mixing various unicode forms works fine today if the filesystem
>> used by t
On Mon, Jan 30, 2012 at 8:10 PM, Stefan Sperling wrote:
> On Tue, Jan 31, 2012 at 01:42:21AM +0900, Hiroaki Nakamura wrote:
>> 2012/1/30 Stefan Sperling :
[ ... ]
> And mixing various unicode forms works fine today if the filesystem
> used by the client supports this. The use case Neels contrive
On Tue, Jan 31, 2012 at 01:42:21AM +0900, Hiroaki Nakamura wrote:
> 2012/1/30 Stefan Sperling :
> > My friend is not willing to upgrade to a new client version yet, which
> > is fine because all 1.x releases of Subversion clients are supposed
> > to be compatible with all 1.y releases of Subversion
On 01/30/2012 02:00 PM, Markus Schaber wrote:
> Maybe the best solution to this issue is a client-only solution, in a similar
> way the case sensitivity problem is tackled.
Spinning the client-only thought a bit: Imagine a repos with a un*x user
adding a file called "föö". Now an OSX user checks
Let me just note some of the main similarities and differences between this
issue of Unicode compositions and the issue of case-sensitivity in file names.
Differences:
* NFC and NFD look the same when
displayed, and most users haven't heard of them and don't expect that a
computer might treat
[Stefan Sperling]
> > We could also open the parent directory, read all the filenames
> > within it, normalise them all, and then search the resulting
> > list. This works, except if a name exists twice, once in NFC form
> > and once in NFD form. We'd somehow have to solve the name collision
> >
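
The directory-scan lookup quoted above can be sketched as follows. This is
an illustration only, assuming Python's standard os and unicodedata
modules; find_normalized and lookup_in_dir are hypothetical names, not
Subversion functions, and raising an error on a collision is just one
possible policy:

```python
import os
import unicodedata


def find_normalized(names, wanted, form="NFC"):
    """Return the entry of `names` that equals `wanted` after normalization.

    Returns None when nothing matches. Raises LookupError when several
    entries normalize to the same string, i.e. the NFC/NFD name collision.
    """
    key = unicodedata.normalize(form, wanted)
    matches = [n for n in names if unicodedata.normalize(form, n) == key]
    if len(matches) > 1:
        raise LookupError("ambiguous after normalization: %r" % (matches,))
    return matches[0] if matches else None


def lookup_in_dir(parent, wanted, form="NFC"):
    """Read every filename in `parent` and search the normalized list."""
    return find_normalized(os.listdir(parent), wanted, form)
```

With names = ["f\u00f6\u00f6", "bar"], asking for the decomposed spelling
"fo\u0308o\u0308" returns the precomposed entry as stored. If both
spellings are present in the same directory, the lookup raises instead of
silently picking one.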
On 30.01.2012 13:30, Stefan Sperling wrote:
> On Sun, Jan 29, 2012 at 07:38:44PM +0900, Hiroaki Nakamura wrote:
>> Hi folks!
>>
>> I read the note about unicode compositions for filenames
>> http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
>> and would like t
On Sun, Jan 29, 2012 at 07:38:44PM +0900, Hiroaki Nakamura wrote:
> Hi folks!
>
> I read the note about unicode compositions for filenames
> http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
> and would like to drive the discussion.
Hi,
I am very happy to h