On 2009-02-11, Torsten Curdt <tcu...@apache.org> wrote:

> I am also not so sure this is really all that bad. I guess there are 3 scenarios
>
> 1: the archive standard is known to use a specific encoding
> 2: the encoding is specified inside the archive (which is similar to 1)
> 3: we have no clue about the encoding of the strings in the archive
>
> Unless I am missing something we are fine for 1+2 as long as
> we create the strings as we should. It's up to the user of compress to
> turn this into something he can use on his platform.
>
> For 3 there is just no point. All we can do is provide a way to get
> the name and the user needs to figure out the conversion. Nothing we
> can do about it.
>
> So I guess all we need to do is to be sure not to create Strings in
> the default encoding for 1+2.
>
> Or what am I missing?

Not much, except that java.util.zip.Zip*putStream and thus the "old"
ZipArchiveOutputStream are always in your case 1: UTF-8. This also
means they are unable to create or read anything but UTF-8.

The new ZipArchiveOutputStream uses the platform's default encoding,
which makes your case 3 more likely unless people take care to note
the encoding when creating the archive.

Note that more modern ZIP implementations provide a way to explicitly
say "this is UTF-8" inside the archive and SANDBOX-176 contains a
patch that claims to support that (as does
<https://issues.apache.org/bugzilla/show_bug.cgi?id=45548>) - I'll be
looking into it.

Stefan
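
To make the difference between the cases concrete, here is a minimal sketch in plain Java. It is not code from commons-compress or from the SANDBOX-176 patch; the class and method names are hypothetical and chosen only for illustration. It assumes the "this is UTF-8" marker is bit 11 of the entry's general purpose bit flag, as the ZIP specification defines it, and shows that a name written in one encoding only survives if the reader decodes it with the same one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from commons-compress.
final class EntryNameEncodingDemo {

    // Bit 11 of a ZIP entry's general purpose bit flag marks the name as UTF-8.
    static final int UTF8_NAME_FLAG = 1 << 11;

    // Case 2: the archive itself says which encoding was used.
    static boolean isFlaggedUtf8(int generalPurposeBitFlag) {
        return (generalPurposeBitFlag & UTF8_NAME_FLAG) != 0;
    }

    // What a writer does: turn the entry name into raw bytes.
    static byte[] encodeName(String name, Charset charset) {
        return name.getBytes(charset);
    }

    // What a reader does: turn the raw bytes back into a String.
    static String decodeName(byte[] rawName, Charset charset) {
        return new String(rawName, charset);
    }

    public static void main(String[] args) {
        String name = "h\u00e4user.txt";
        // Writer used a platform default that happened to be ISO-8859-1.
        byte[] raw = encodeName(name, StandardCharsets.ISO_8859_1);

        // Reader knows the encoding (case 1 or 2): the name round-trips.
        System.out.println(decodeName(raw, StandardCharsets.ISO_8859_1).equals(name)); // true

        // Reader assumes UTF-8 without knowing better (case 3): the name is garbled.
        System.out.println(decodeName(raw, StandardCharsets.UTF_8).equals(name)); // false

        // An entry whose flag word has bit 11 set declares its name to be UTF-8.
        System.out.println(isFlaggedUtf8(0x0800)); // true
    }
}
```

The point of keeping encodeName and decodeName symmetric is that the library never falls back to the default encoding silently: the caller either passes the charset it knows or reads it from the archive.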