André,
On 9/22/2009 3:58 PM, André Warnier wrote:
> your detailed analysis is impressive and undoubtedly accurate, but
> beyond what I can swallow right now in Java and after 2 glasses of
> Spanish wine.

It's probably better than having 2 pints of Belgian beer. Wow.

> So let me ask a simple question :
> - a file named "fichié.txt" has been created in a directory, by a
> process that spoke iso-8859-1 (so the filename is 10 bytes long).

Ok.

> - a Tomcat runs in a process whose locale is set to UTF-8, and an
> application inside this Tomcat reads the filename from the directory
> into a Java String variable S.
> What happens ?
> - does the application get an exception due to invalid encoding ?

No. The results of my other test suggest that you basically just get
garbage characters in the filename.

> - if not, why not ?

Good question. Maybe the JVM authors decided that garbage characters
were better than an inaccessible file (and I tend to agree with that
trade-off).

> - if not, what is now the content, in bytes, of variable S ?

Heh. Beats me. I couldn't understand how the UTF-8 filename had been
mangled when in ANSI mode, so I'm not sure if such mangling is
reversible.

I wonder if you could re-encode the filename something like this:

  String encoding = System.getProperty("file.encoding");
  String filename = file.getName(); // gets you junk
  String recoded = new String(filename.getBytes(encoding), "UTF-8");

Of course, this only works if:

1. The file was originally written in UTF-8 mode
2. The ANSI mangling that has occurred is reversible using the above
   method (duh)

If you have some suspicion as to the encoding used to encode the
filename in the first place, you could re-code the filename several
times and attempt a match (using String.equals). Better yet, you could
re-code the filename you /think/ you have into a String and then use
that to check against the filesystem.

I dunno. This is pretty ugly.
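To make the "is it reversible?" question concrete, here is a minimal sketch that simulates both directions in memory, without touching a real filesystem. The class name FilenameRecode and the helpers recode and findCharset are hypothetical names invented for this illustration, and StandardCharsets needs Java 7 or later. It assumes the JVM decodes a filename the same way new String(bytes, charset) does, which may not hold on every platform or JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class FilenameRecode {

    // Undo a wrong decode: turn the chars back into the bytes they came
    // from, then decode those bytes with the charset we believe was
    // really used when the name was written.
    static String recode(String mangled, Charset wronglyUsed, Charset actual) {
        return new String(mangled.getBytes(wronglyUsed), actual);
    }

    // Try several candidate charsets and return the first one whose
    // re-decoded result equals the name we expect, or null if none does.
    static Charset findCharset(String mangled, Charset wronglyUsed,
                               String expected, List<Charset> candidates) {
        for (Charset candidate : candidates) {
            if (recode(mangled, wronglyUsed, candidate).equals(expected)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Unicode escape keeps this source file independent of its own encoding.
        String original = "fichi\u00e9.txt";

        // Case 1: name written as UTF-8, read as ISO-8859-1.
        // ISO-8859-1 maps every byte to a char, so no information is lost.
        String mangled = new String(original.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        String recovered = recode(mangled, StandardCharsets.ISO_8859_1,
                                  StandardCharsets.UTF_8);
        System.out.println("case 1 recovered: " + recovered.equals(original));

        // Case 2 (the one asked about): name written as ISO-8859-1, read as
        // UTF-8. The lone byte 0xE9 is malformed UTF-8, so the decoder
        // substitutes U+FFFD and the original byte is gone.
        String lossy = new String(original.getBytes(StandardCharsets.ISO_8859_1),
                                  StandardCharsets.UTF_8);
        System.out.println("case 2 has U+FFFD: " + (lossy.indexOf('\uFFFD') >= 0));
        String attempt = recode(lossy, StandardCharsets.UTF_8,
                                StandardCharsets.ISO_8859_1);
        System.out.println("case 2 recovered: " + attempt.equals(original));

        // Guessing the original charset by matching against a known name.
        List<Charset> candidates =
            Arrays.asList(StandardCharsets.US_ASCII, StandardCharsets.UTF_8);
        System.out.println("guessed charset: " + findCharset(
            mangled, StandardCharsets.ISO_8859_1, original, candidates));
    }
}
```

If that assumption holds, the UTF-8-read-as-single-byte direction can be undone, while the iso-8859-1-read-as-UTF-8 direction cannot: once the decoder has substituted U+FFFD, the String no longer carries the original byte. As far as I know, some JVMs decode filenames using a platform setting other than file.encoding, so treat the charset passed as wronglyUsed as a guess to verify, not a given.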
Again, setting everything to UTF-8 dramatically reduces headaches in
these areas.

-chris