> per user settings by their locales instead of global settings
This was also suggested in
https://bugs.launchpad.net/ubuntu/+source/unzip/+bug/2066389
with a link to a reference implementation in p7zip.
--
** Changed in: unzip (Debian)
Status: Confirmed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1422290
Title:
Default charsets handling for Windows archives in CJKV+th locale
I wrote a patch for unzip that fixes this issue:
https://sourceforge.net/p/infozip/patches/29/
The same patch for p7zip:
https://sourceforge.net/p/p7zip/bugs/187/
--
Bug #1462848 could be a duplicate.
--
Followed up on
https://code.launchpad.net/~nobuto/ubuntu/wily/unzip/fallback-encoding/+merge/268850
with some feedback about the patch.
There are better ways to achieve this than through profile.d.
--
> Did upstream say anything?
I've got a reply from a developer of unzip, but he is also not familiar
with those charset issues. I need to discuss it more with upstream.
However, what I'm trying to do here is a relatively short-term solution.
I believe the request in the attached branch is still valid
Upstream won't adopt such behavior, as they regard it as a locale hack.
GBK is a superset of cp936, but it is not so large that it overlaps with
portions of UTF-8 (so it can be reliably detected, unlike GB18030). From
this point of view, it is better to use GBK than cp936.
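The detection argument above can be illustrated with a minimal Python sketch. It is not the patch under discussion: the function name `decode_zip_name` and the "strict UTF-8 first, then GBK" order are assumptions made for illustration, relying on the observation that GBK-encoded Chinese text rarely passes a strict UTF-8 check.

```python
def decode_zip_name(raw, fallback="gbk"):
    """Decode a raw archive file name (hypothetical helper).

    Try strict UTF-8 first; if the bytes are not valid UTF-8,
    assume the legacy fallback charset instead.
    """
    try:
        # bytes.decode() is strict by default and raises on invalid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Replace undecodable bytes rather than failing on a bad guess.
        return raw.decode(fallback, errors="replace")


if __name__ == "__main__":
    name = "中文.txt"
    print(decode_zip_name(name.encode("utf-8")))
    print(decode_zip_name(name.encode("gbk")))
```

Note that Python treats `cp936` as an alias of its `gbk` codec, so this sketch cannot show a difference between the two names.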
--
Did upstream say anything?
What is "GBK" that Kylin uses and why is it different from the one we
have here?
Sorry for being clueless. :)
--
** Branch linked: lp:~nobuto/ubuntu/wily/unzip/fallback-encoding
--
I have sent an enhancement request to upstream through
http://www.info-zip.org/zip-bug.html since the issue is still reproducible
with 6.1c19-BETA, which you can try from:
https://launchpad.net/~nobuto/+archive/ubuntu/build-test/+build/7630500
Putting a copy of the request here for your reference.
It seems that no Ubuntu developers feel like reviewing those changes;
it would be good to get them reviewed upstream and/or in Debian...
--
Any progress?
--
@Yuan,
Take the "王妃.zip" example you posted: it has short file names in the archive.
Even unar/lsar fails to detect the encoding (you expect CP932, but lsar reports
ISO-8859-8). Automatic encoding detection is not 100% reliable, especially
with short file names (fewer hints for the encoding detector).
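To show why short names give a detector so little to work with, here is a small Python sketch. It is not how unar/lsar detects encodings; the candidate list and the helper name `plausible_encodings` are assumptions for illustration. It only checks which candidate charsets can decode the bytes without error, and a short byte string often passes for more than one.

```python
# Hypothetical candidate list; a real detector considers many more charsets.
CANDIDATES = ["cp932", "gbk", "big5", "cp949"]


def plausible_encodings(raw, candidates=CANDIDATES):
    """Return the candidate encodings that decode `raw` without error."""
    ok = []
    for enc in candidates:
        try:
            raw.decode(enc)  # strict decoding: raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        ok.append(enc)
    return ok


if __name__ == "__main__":
    raw = "王妃".encode("cp932")  # four bytes, as a Japanese Windows would store it
    print(plausible_encodings(raw))
```

Every candidate that survives this check is still only a guess; with more bytes, fewer wrong candidates survive, which is why longer names are easier to classify.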
This is from one of my machines running Lubuntu:
$ export |grep LANG
declare -x LANG="en_US.UTF-8"
$ export |grep LC
declare -x LC_ADDRESS="en_US.UTF-8"
declare -x LC_IDENTIFICATION="en_US.UTF-8"
declare -x LC_MEASUREMENT="en_US.UTF-8"
declare -x LC_MONETARY="en_US.UTF-8"
declare -x LC_NAME="en_US
@yuanchao, you cannot recover the file names when decompressing with
unzip (because the characters are replaced by question marks), but you
can when using 7zip.
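The difference between a recoverable and an unrecoverable name can be sketched in Python. This only simulates the two situations and does not reproduce the internals of unzip or 7zip; the helper names and the choice of cp437 as the "wrong" charset are assumptions for illustration.

```python
def misdecode(raw, wrong="cp437"):
    """Decode with the wrong single-byte charset; cp437 maps all 256 byte values."""
    return raw.decode(wrong)


def recover(garbled, wrong="cp437", right="cp932"):
    """Undo the wrong decoding and redo it with the right charset."""
    return garbled.encode(wrong).decode(right)


def replace_with_question_marks(raw):
    """Simulate a lossy extraction: every non-ASCII byte becomes '?'."""
    return raw.decode("ascii", errors="replace").replace("\ufffd", "?")


if __name__ == "__main__":
    raw = "日本語.txt".encode("cp932")
    print(recover(misdecode(raw)))           # original bytes survived the detour
    print(replace_with_question_marks(raw))  # original bytes are gone
```

A garbled but byte-preserving name can be converted back afterwards, while a name whose bytes were replaced by question marks carries no information to convert.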
--
Well, without this trick, the filenames could be recovered with
'convmv'. But with this trick, they would be scrambled further... Still,
I personally prefer auto-detection plus this fallback, or an option in
the GUI, like file-roller.
--
@yuanchao, with or without this trick, running unzip would lead to
garbled file names for you, so I don't think this change would bother you
as much as you describe, would it?
--
Dear @Nobuto,
I appreciate the patch work very much, but it simply doesn't fit my use case.
Quite frequently, I get zip files with CJK file names from zh_CN and ja_JP
(my environment is either zh_TW or en_US, the latter being for my office
desktop PC). Changing LC_CTYPE to something other than
U
@Yuan,
My patch refers to LC_CTYPE first, so you can set different locales for
LC_CTYPE and LC_MESSAGES, for example. And of course you can manually
export the UNZIP and ZIPINFO variables in your ~/.profile. I understand
that my patch is a short-term workaround.
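The locale-based fallback described above could look roughly like the following Python sketch. It is not the actual patch: the mapping table, the function name `fallback_charset`, and the use of LANG as the second lookup are assumptions for illustration. Only "LC_CTYPE is consulted first" comes from the message above.

```python
import os

# Hypothetical mapping from locale prefix to the Windows code page that
# archives created in that locale typically use.
FALLBACK_CHARSETS = {
    "ja": "CP932",
    "zh_CN": "CP936",
    "zh_TW": "CP950",
    "ko": "CP949",
    "th": "CP874",
    "vi": "CP1258",
}


def fallback_charset(env=None):
    """Pick a fallback charset from the locale, or None if nothing matches."""
    env = os.environ if env is None else env
    # LC_CTYPE first (as stated in the thread); falling back to LANG is an assumption.
    locale_name = env.get("LC_CTYPE") or env.get("LANG") or ""
    # "ja_JP.UTF-8@modifier" -> "ja_JP"
    lang = locale_name.split(".")[0].split("@")[0]
    # Try the full "language_TERRITORY" form before the bare language code.
    for prefix in (lang, lang.split("_")[0]):
        if prefix in FALLBACK_CHARSETS:
            return FALLBACK_CHARSETS[prefix]
    return None


if __name__ == "__main__":
    print(fallback_charset({"LC_CTYPE": "ja_JP.UTF-8", "LANG": "en_US.UTF-8"}))
```

Because LC_CTYPE is looked up before LANG, a user can keep English messages while still getting a Japanese fallback charset, which matches the use case described in the message.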
FWIW, unar supports encoding autodetection,
It would be nice to have some auto-detect mechanism on top of this
locale fallback. In my personal case, most zip files that need an
explicit encoding use one that does not match my locale setting.
--
** Changed in: unzip (Debian)
Status: Unknown => Confirmed
--
** Bug watch added: Debian Bug tracker #483290
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483290
** Also affects: unzip (Debian) via
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483290
Importance: Unknown
Status: Unknown
--
** Description changed:
- This branch adds default charsets handling for Windows archives in
+ With the current unzip package in Ubuntu, we need to specify charset
+ explicitly to extract zip files sent from localized Windows systems.
+
+ For example zip files sent from Japanese localized Windows
Additional background:
On Windows, file names are encoded with different encodings in CJKV+th
locales, while the ZIP archive does not store file name encoding
information. When decompressing the ZIP archive on a system with another
encoding (e.g. UTF-8 on Linux), the file names are garbage and those
cha