Re: ffs and utf8

2014-12-03 Thread Dmitrij D. Czarkoff
Joel Rees said: > Maybe it would be better just to not make those directories until they > are needed by an application, and then ask the user to name them > instead of providing standard names. Actually, it is still workable if you carry your ~/.config/user-dirs.dir around, so that you could inst

Re: ffs and utf8

2014-12-03 Thread Theo de Raadt
>Joel Rees writes: >> 2014/12/03 22:23 "Dmitrij D. Czarkoff" : >> > >> > First of all, I really don't believe that preservation of non-canonical >> > form should be a consideration for any software. >> >> There is no particular canonical form for some kinds of software. >> >> Unix, in particular,

Re: ffs and utf8

2014-12-03 Thread Anthony J. Bentley
Joel Rees writes: > 2014/12/03 22:23 "Dmitrij D. Czarkoff" : > > > > First of all, I really don't believe that preservation of non-canonical > > form should be a consideration for any software. > > There is no particular canonical form for some kinds of software. > > Unix, in particular, happens

Re: ffs and utf8

2014-12-03 Thread Joel Rees
2014/12/03 22:23 "Dmitrij D. Czarkoff" : > > First of all, I really don't believe that preservation of non-canonical > form should be a consideration for any software. There is no particular canonical form for some kinds of software. Unix, in particular, happens to have file name limitations that

Re: ffs and utf8

2014-12-03 Thread Dmitrij D. Czarkoff
First of all, I really don't believe that preservation of non-canonical form should be a consideration for any software. There is no single reason to allow non-canonical forms to exist at all, while there are several reasons to avoid them. More so for foreign encodings in filenames - if you are t

Re: ffs and utf8

2014-12-03 Thread Joel Rees
On Wed, Dec 3, 2014 at 9:09 PM, Dmitrij D. Czarkoff wrote: > Anthony J. Bentley said: >> > I haven't used Apple OSses since around 10.4, but Mac OS X was doing a >> > thing where certain well-known directory names were aliased according to >> > the current locale. For instance, the user's "music"

Re: ffs and utf8

2014-12-03 Thread Dmitrij D. Czarkoff
Anthony J. Bentley said: > > I haven't used Apple OSses since around 10.4, but Mac OS X was doing a > > thing where certain well-known directory names were aliased according to > > the current locale. For instance, the user's "music" directory was shown > > as 「音楽」 when the locale was set to ja_JP

Re: ffs and utf8

2014-12-03 Thread Anthony J. Bentley
Joel Rees writes: > You can even handle broken UTF-8 and unconverted UTF-16/32 of whatever byte > order spit into the file name as a sequence of bytes if and only if you > escape NUL, slash, and your escape character properly, restoring the > escaped characters when putting the file names on the ne

Re: ffs and utf8

2014-12-03 Thread Joel Rees
Dmitrij had some questions about my intent, I'll try to clarify. 2014/12/02 18:57 "Joel Rees" : > > (apologies for the html.) > > 2014/12/02 9:52 "Dmitrij D. Czarkoff" : [ ... and others Snipped context: There was some discussion of what kind of file names should be allowed to be stored. There

Re: ffs and utf8

2014-12-02 Thread Joel Rees
(apologies for the html.) 2014/12/02 9:52 "Dmitrij D. Czarkoff" : > > Joel Rees said: > > Now, what would you do with this? > > > > ジョエル > > > > Why not decompose it to the following? > > > > ジョエル > > Because it is not what Unicode normalization is. Well, it definitely isn't Un

Re: ffs and utf8

2014-12-01 Thread Dmitrij D. Czarkoff
Joel Rees said: > Now, what would you do with this? > > ジョエル > > Why not decompose it to the following? > > ジョエル Because it is not what Unicode normalization is. > I know what the Unicode rules say, but my boss says, if I'm going to > play with file names, he wants it done his way. And now y

Re: ffs and utf8

2014-12-01 Thread Anthony J. Bentley
Ted Unangst writes: > On Mon, Dec 01, 2014 at 12:43, Dmitrij D. Czarkoff wrote: > > Janne Johansson said: > >> There is quite a bit of difference between changing the storage format and > >> making some dates "impossible" that previously did work. > > > > Don't think so. Something got changed, th

Re: ffs and utf8

2014-12-01 Thread Joel Rees
On Mon, Dec 1, 2014 at 11:13 PM, Dmitrij D. Czarkoff wrote: > Joel Rees said: >> Hmm. What would you suggest doing with the following file name? >> >> /etc >> >> (You may need a Japanese font to display it.) >> >> If you try to normalize it on a *nix box, it will hopefully conflict >> with your sy

Re: ffs and utf8

2014-12-01 Thread frantisek holop
Stefan Sperling, 29 Nov 2014 18:17: > > Are you aware of 'detox' package? > > There's also converters/convmv $ touch »´ÁÉǑÄ« $ convmv * wrong/unknown "from" encoding! $ convmv -f utf8 -t latin1 * Starting a dry run without changes... iso-8859-1 doesn't cover all needed characters for: "./»´ÁÉǑÄ«"

Re: ffs and utf8

2014-12-01 Thread frantisek holop
Joel Rees, 01 Dec 2014 22:04: > Hmm. What would you suggest doing with the following file name? > > /etc > > (You may need a Japanese font to display it.) > > If you try to normalize it on a *nix box, it will hopefully conflict > with your system file permissions. But, then what do you do with i

Re: ffs and utf8

2014-12-01 Thread Ted Unangst
On Mon, Dec 01, 2014 at 12:43, Dmitrij D. Czarkoff wrote: > Janne Johansson said: >> There is quite a bit of difference between changing the storage format and >> making some dates "impossible" that previously did work. > > Don't think so. Something got changed, things got broken and need to be >

Re: ffs and utf8

2014-12-01 Thread Dmitrij D. Czarkoff
Joel Rees said: > Hmm. What would you suggest doing with the following file name? > > /etc > > (You may need a Japanese font to display it.) > > If you try to normalize it on a *nix box, it will hopefully conflict > with your system file permissions. But, then what do you do with it? > > If you

Re: ffs and utf8

2014-12-01 Thread Joel Rees
On Mon, Dec 1, 2014 at 8:43 PM, Dmitrij D. Czarkoff wrote: > Janne Johansson said: >> There is quite a bit of difference between changing the storage format and >> making some dates "impossible" that previously did work. > > Don't think so. Something got changed, things got broken and need to be

Re: ffs and utf8

2014-12-01 Thread Dmitrij D. Czarkoff
Janne Johansson said: > There is quite a bit of difference between changing the storage format and > making some dates "impossible" that previously did work. Don't think so. Something got changed, things got broken and need to be fixed. The only real question is: is the change worth the trouble.

Re: ffs and utf8

2014-12-01 Thread Janne Johansson
2014-12-01 12:05 GMT+01:00 Dmitrij D. Czarkoff : > Stefan Sperling said: > > Bad idea. See my other post. Apple did this and broke existing > applications. > > OpenBSD changed time_t and broke existing applications, but hardly > anyone thinks it was a bad idea. Fancy filenames are long known to b

Re: ffs and utf8

2014-12-01 Thread Dmitrij D. Czarkoff
Stefan Sperling said: > Bad idea. See my other post. Apple did this and broke existing applications. OpenBSD changed time_t and broke existing applications, but hardly anyone thinks it was a bad idea. Fancy filenames are long known to be problematic, so filename policy enforcement is a breakage o

Re: ffs and utf8

2014-12-01 Thread Stefan Sperling
On Mon, Dec 01, 2014 at 10:20:08AM +0100, Dmitrij D. Czarkoff wrote: > I would enforce normalization at filename access time (open(), fopen(), > readdir(), etc). Yes, destructively transform. I would reject > filenames that won't decode. If this is documented, I just don't see > how it is "behin

Re: ffs and utf8

2014-12-01 Thread Stefan Sperling
On Mon, Dec 01, 2014 at 10:38:40AM +0200, pizdel...@gmail.com wrote: > On Sat, Nov 29, 2014 at 09:48:53PM +0100, Dmitrij D. Czarkoff wrote: > > That said, the standard provides just enough facilities to make > > filesystem-related aspects of Unicode work nicely, particularly in case > > of utf-8.

Re: ffs and utf8

2014-12-01 Thread Janne Johansson
2014-12-01 10:20 GMT+01:00 Dmitrij D. Czarkoff : > pizdel...@gmail.com said: > > How do you 'enforce' NFD? > > > > Let the kernel normalize (ie /destructively/ transform) the file names > > behind user's back, so that a file will be listed with a different name > > than that with which it was crea

Re: ffs and utf8

2014-12-01 Thread Dmitrij D. Czarkoff
pizdel...@gmail.com said: > How do you 'enforce' NFD? > > Let the kernel normalize (ie /destructively/ transform) the file names > behind user's back, so that a file will be listed with a different name > than that with which it was created? That's very nice and secure, indeed. I would enforce no

Re: ffs and utf8

2014-12-01 Thread pizdelect
On Sat, Nov 29, 2014 at 09:48:53PM +0100, Dmitrij D. Czarkoff wrote: > That said, the standard provides just enough facilities to make > filesystem-related aspects of Unicode work nicely, particularly in case > of utf-8. Eg. ability to enforce NFD for all operations on file names > could actually

Re: ffs and utf8

2014-12-01 Thread Anthony J. Bentley
Hi Ingo, Ingo Schwarze writes: > While the article is old, the essence of what Schneier said here > still stands, and it is not likely to fall in the future: > > https://www.schneier.com/crypto-gram-0007.html#9 The most interesting sentence here is: "Unicode is just too complex to ever be sec

Re: ffs and utf8

2014-11-30 Thread Christian Weisgerber
On 2014-11-29, Ingo Schwarze wrote: > But Unicode must never be allowed near anything that might get > executed as program code, including scripts in interpreted languages, > including, but not limited to, the shell. In particular, that means > trying to handle Unicode in filenames is a bad idea

Re: ffs and utf8

2014-11-30 Thread Joel Rees
On Sun, Nov 30, 2014 at 6:31 PM, Dmitrij D. Czarkoff wrote: > Joel Rees said: >>> That said, the standard provides just enough facilities to make >>> filesystem-related aspects of Unicode work nicely, particularly in case >>> of utf-8. Eg. ability to enforce NFD for all operations on file names

Re: ffs and utf8

2014-11-30 Thread Dmitrij D. Czarkoff
Thomas Bohl said: > # ls | cat > Will display the characters right. > Not entirely sure why though. From ls(1) manual: | -q Force printing of non-graphic characters in file names as the | character `?'; this is the default when output is to a terminal. -- Dmitrij D. Czarkoff
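The -q behaviour quoted from ls(1) can be illustrated with a short Python sketch. It assumes that every byte outside printable ASCII counts as "non-graphic", which is roughly what a tool that is not UTF-8 aware would do; the function name `mask_non_graphic` is hypothetical and the code is not taken from the ls source.

```python
def mask_non_graphic(name: bytes) -> str:
    """Show a file name the way a byte-oriented `ls -q` might.

    Assumption: a byte is "graphic" only if it is printable ASCII
    (0x20 through 0x7e). Every other byte is replaced by '?'.
    """
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "?" for b in name)


if __name__ == "__main__":
    # A two-byte UTF-8 sequence turns into two question marks.
    print(mask_non_graphic("Á".encode("utf-8")))
    # Plain ASCII names pass through unchanged.
    print(mask_non_graphic(b"notes.txt"))
```

Under that assumption, piping through cat shows the names correctly simply because the masking is only the default when output goes to a terminal, so the raw bytes reach the terminal emulator untouched.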

Re: ffs and utf8

2014-11-30 Thread Dmitrij D. Czarkoff
Joel Rees said: >> That said, the standard provides just enough facilities to make >> filesystem-related aspects of Unicode work nicely, particularly in case >> of utf-8. Eg. ability to enforce NFD for all operations on file names >> could actually make several things more secure by preventing ho

Re: ffs and utf8

2014-11-29 Thread Thomas Bohl
Am 29.11.2014 um 13:20 schrieb frantisek holop: i think i should clarify this a bit: they show perfect in midnight commander, not in shell. $ touch »´ÁÉǑÄ« $ ls ?? # ls | cat Will display the characters right. Not entirely sure why though.

Re: ffs and utf8

2014-11-29 Thread Joel Rees
On Sun, Nov 30, 2014 at 5:48 AM, Dmitrij D. Czarkoff wrote: > Ingo Schwarze said: >> While the article is old, the essence of what Schneier said here >> still stands, and it is not likely to fall in the future: >> >> https://www.schneier.com/crypto-gram-0007.html#9 > > Sorry, but this article is

Re: ffs and utf8

2014-11-29 Thread Dmitrij D. Czarkoff
Ingo Schwarze said: > While the article is old, the essence of what Schneier said here > still stands, and it is not likely to fall in the future: > > https://www.schneier.com/crypto-gram-0007.html#9 Sorry, but this article is mostly based on lack of understanding of Unicode. > that would dire

Re: ffs and utf8

2014-11-29 Thread Stefan Sperling
On Sat, Nov 29, 2014 at 02:08:32PM +0200, Ville Valkonen wrote: > Hello, > > On 29 November 2014 at 14:02, frantisek holop wrote: > > i have written for myself a small python3 script that > > removes accented characters and all utf8 "symbols" > > from filenames, a kind of "utf-8 to ascii sanitize

Re: ffs and utf8

2014-11-29 Thread Jan Stary
On Nov 29 13:02:34, min...@obiit.org wrote: > is it true to say then, that ffs is entirely "utf8 safe", > and/or that ffs is actually "an utf-8 encoded filesystem" The file names are just strings of bytes. There is nothing "UTF8" about them. On Nov 29 14:23:35, czark...@gmail.com wrote: > (Intere

Re: ffs and utf8

2014-11-29 Thread Ingo Schwarze
Hi, Paolo Aglialoro wrote on Sat, Nov 29, 2014 at 01:56:23PM +0100: > Shouldn't in 2014 the aim having all working in utf-8? Most definitely not, that would directly run contrary to some of OpenBSD's most important project goals: Correctness, simplicity, security. While the article is old, the

Re: ffs and utf8

2014-11-29 Thread Christian Weisgerber
On 2014-11-29, frantisek holop wrote: > $ touch »´ÁÉǑÄ« > $ ls > ?? If you need a locale-aware ls(1), use the one from the colorls package. (Don't worry, colored output is entirely optional.) -- Christian "naddy" Weisgerber na...@mips.inka.de

Re: ffs and utf8

2014-11-29 Thread Christian Weisgerber
On 2014-11-29, frantisek holop wrote: > is it true to say then, that ffs is entirely "utf8 safe", > and/or that ffs is actually "an utf-8 encoded filesystem" > as IIRC Mac OS is? The former. Unix filesystems accept all bytes for filenames with the exception of 0x2f, which serves as directory se

Re: ffs and utf8

2014-11-29 Thread Ted Unangst
On Sat, Nov 29, 2014 at 13:02, frantisek holop wrote: > is it true to say then, that ffs is entirely "utf8 safe", > and/or that ffs is actually "an utf-8 encoded filesystem" > as IIRC Mac OS is? or is it some kind of happy accident > that it works? :) FFS stores filenames as bytes.
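A minimal Python sketch of what "stores filenames as bytes" means in practice. It assumes a filesystem that does not transform names (some systems, such as older Mac OS X volumes, do), and the helper names `is_acceptable_name` and `roundtrip` are hypothetical. The byte-level check covers only NUL and slash; length limits are left out.

```python
import os
import tempfile
import unicodedata


def is_acceptable_name(name: bytes) -> bool:
    """Byte-level check only: non-empty, no NUL byte, no slash (0x2f)."""
    return len(name) > 0 and b"\0" not in name and b"/" not in name


def roundtrip(name: bytes) -> bytes:
    """Create an empty file called `name` in a scratch directory and
    return the name exactly as the directory listing reports it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_b = os.fsencode(tmp)
        with open(os.path.join(tmp_b, name), "wb"):
            pass
        # Passing a bytes path makes os.listdir return raw bytes.
        (entry,) = os.listdir(tmp_b)
        return entry


# The test name from the thread, encoded as UTF-8.
test_name = "»´ÁÉǑÄ«".encode("utf-8")

# The same visible letter in two Unicode normalization forms is
# two different byte strings, hence two different file names.
nfc = unicodedata.normalize("NFC", "Á").encode("utf-8")
nfd = unicodedata.normalize("NFD", "Á").encode("utf-8")

if __name__ == "__main__":
    print(roundtrip(test_name) == test_name)
    print(nfc, nfd, nfc == nfd)
```

The last comparison is the reason normalization comes up in this thread at all: a filesystem that only sees bytes has no way to know that the two spellings are meant to be the same name.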

Re: ffs and utf8

2014-11-29 Thread Lars
Hi, On 29.11.2014 13:20, frantisek holop wrote: i think i should clarify this a bit: they show perfect in midnight commander, not in shell. $ touch »´ÁÉǑÄ« $ ls ?? -f I had a similar problem some time ago and have been told that the ls tool is not aware of UTF-8. See here for s

Re: ffs and utf8

2014-11-29 Thread Dmitrij D. Czarkoff
frantisek holop said: > is it true to say then, that ffs is entirely "utf8 safe", > and/or that ffs is actually "an utf-8 encoded filesystem" > as IIRC Mac OS is? or is it some kind of happy accident > that it works? :) As I get it, ffs is entirely "utf8 safe" because it is not encoding aware. W

Re: ffs and utf8

2014-11-29 Thread frantisek holop
Paolo Aglialoro, 29 Nov 2014 13:56: > Shouldn't in 2014 the aim having all working in utf-8? sure. but i like my filenames ascii and whitespaceless. shows my age. -f -- what a nice night for an evening. -- steven wright

Re: ffs and utf8

2014-11-29 Thread Paolo Aglialoro
Shouldn't in 2014 the aim having all working in utf-8?

Re: ffs and utf8

2014-11-29 Thread frantisek holop
Ville Valkonen, 29 Nov 2014 14:08: > Are you aware of 'detox' package? $ touch »´ÁÉǑÄ« $ detox * $ ls A_A_A_A_C_A_A_ $ touch »´ÁÉǑÄ« $ my_silly_script $ ls aeoa perhaps with some massaging detox can be made to work like my script, i dont know. but that is actually besides the point. i wrote my

Re: ffs and utf8

2014-11-29 Thread frantisek holop
frantisek holop, 29 Nov 2014 13:02: > while working on it, i created some strange test cases > (e.g. »´ÁÉǑÄ«) for filenames and i was pleasantly > surprised that the files were created/read/renamed/deleted > without problems. i think i should clarify this a bit: they show perfect in midnight comma

Re: ffs and utf8

2014-11-29 Thread Ville Valkonen
Hello, On 29 November 2014 at 14:02, frantisek holop wrote: > i have written for myself a small python3 script that > removes accented characters and all utf8 "symbols" > from filenames, a kind of "utf-8 to ascii sanitizer". Are you aware of 'detox' package? -- Regards, Ville
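For readers wondering what such a "utf-8 to ascii sanitizer" might look like, here is a minimal Python 3 sketch. It is not the script discussed in this thread, which was not posted; the name `to_ascii_name`, the set of kept characters, and the lowercasing step are all assumptions made for illustration.

```python
import string
import unicodedata

# Assumption: keep ASCII letters, digits, dot, underscore and hyphen.
KEEP = set(string.ascii_letters + string.digits + "._-")


def to_ascii_name(name: str) -> str:
    """Strip accents and non-ASCII symbols from a file name.

    NFKD splits an accented letter into its base letter plus combining
    marks; dropping everything outside KEEP then removes the marks,
    the symbols, and any whitespace. The result is lowercased.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch in KEEP).lower()


if __name__ == "__main__":
    print(to_ascii_name("»´ÁÉǑÄ«"))      # aeoa
    print(to_ascii_name("Déjà Vu.txt"))  # dejavu.txt
```

Renaming real files with this would also need a collision check, since distinct names can map to the same ASCII result.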

ffs and utf8

2014-11-29 Thread frantisek holop
i have written for myself a small python3 script that removes accented characters and all utf8 "symbols" from filenames, a kind of "utf-8 to ascii sanitizer". while working on it, i created some strange test cases (e.g. »´ÁÉǑÄ«) for filenames and i was pleasantly surprised that the files were crea