Important Update: The Cause and The Solution!

Hello,

As I have noted before, this problem manifests itself in Kubuntu too.
So this problem is not rooted in the "update-manager" which is Ubuntu
and GNOME specific.

Fortunately, I have found the solution to this problem. The actual problem
is in the "apt-get" program and "libapt-pkg" library included in the "apt"
package.

The root cause of this bug is the following: when "apt" and/or "libapt-pkg"
run in the Turkish locale and process packages whose information contains the
letters "i" or "I", they get confused because the Turkish "i"s have different
capitalization rules than the English "i". In summary, the following mapping
holds in the Turkish locale: "i" -> "İ" and "ı" -> "I", whereas in the
English locale: "i" -> "I". So,

=== 8< ===
strcasecmp("wip", "WIP")
=== >8 ===

does /NOT/ return "0" in the Turkish locale. Similarly,

=== 8< ===
toupper('i') != 'I'
tolower('I') != 'i'
=== >8 ===

in the Turkish locale. In tr_TR.UTF-8, "toupper('i')" and "tolower('I')"
simply return "i" and "I", respectively. This is because the "toupper" and
"tolower" functions can only return single-byte characters, and "İ" and "ı"
are multi-byte characters in UTF-8.
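
The behaviour described above can be probed with a small C helper. This is
only a minimal sketch and not code from apt: the name "probe_case_mapping" is
made up for illustration, and the result for tr_TR.UTF-8 depends on that
locale being generated on the system and on the C library in use.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not part of apt): switch LC_CTYPE to the named
   locale and report what toupper('i') and tolower('I') return there.
   Returns 0 on success, -1 if the locale is not available. */
int probe_case_mapping(const char *locale_name, int *upper_of_i, int *lower_of_I)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;              /* locale not generated/installed */

    *upper_of_i = toupper((unsigned char)'i');
    *lower_of_I = tolower((unsigned char)'I');

    setlocale(LC_CTYPE, "C");   /* go back to the C locale afterwards */
    return 0;
}
```

Called with "C", the helper stores 'I' and 'i'. Called with "tr_TR.UTF-8",
it should store 'i' and 'I' if the C library behaves as described in this
message.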

I came to the conclusion that this problem had to do with the Turkish "i"s
after examining the headers of the affected packages (which are e2fsprogs,
initscripts and nfs-kernel-server). To be specific, let's look at their
"Replaces" and "Depends" headers:

=== 8< ===
e2fsprogs: Replaces: hurd (<= 20040301-1),
                     libblkid1 (<< 1.38+1.39-WIP-2005.12.10-2),
                     libuuid1 (<< 1.38+1.39-WIP-2005.12.10-2)

initscripts: Depends: debianutils (>= 2.13.1),
                      e2fsprogs (>= 1.32+1.33-WIP-2003.04.14-1),
                      libc6 (>= 2.4), lsb-base (>= 3.0-6), mount (>= 2.11x-1),
                      passwd, sysvutils

nfs-kernel-server: Depends: libblkid1 (>= 1.39+1.40-WIP-2006.11.14+dfsg-2),
                            libc6 (>= 2.4), libcomerr2 (>= 1.33-3), libgssglue1,
                            libkrb53 (>= 1.6.dfsg.2), libldap-2.4-2 (>= 2.4.7),
                            libnfsidmap2, librpcsecgss3, libwrap0,
                            lsb-base (>= 1.3-9ubuntu3), nfs-common (>= 1:1.0.8-1),
                            ucf
=== >8 ===

As you can see, all of these headers contain the string "WIP", which
contains the capital "I". I have some experience with GNU/Linux programs
malfunctioning in the Turkish locale because of the differences between
the Turkish "i"s and the English "i", so I thought that this piece of
information could be the key.

Most of the time, a program uses "toupper()", "tolower()" or "strcasecmp()"
in the Turkish locale and expects them to behave as they do in the English
or C locale. However, this is not the case because of the Turkish "i"s.

A quick look at "apt-pkg/deb/deblistparser.cc" confirms my guess:

=== 8< ===
191  unsigned short debListParser::VersionHash()
192  {
193     const char *Sections[] ={"Installed-Size",
194                              "Depends",
195                              "Pre-Depends",
196  //                            "Suggests",
197  //                            "Recommends",
198                              "Conflicts",
199                              "Breaks",
200                              "Replaces",0};
201     unsigned long Result = INIT_FCS;
202     char S[1024];
203     for (const char **I = Sections; *I != 0; I++)
204     {
205        const char *Start;
206        const char *End;
207        if (Section.Find(*I,Start,End) == false || End - Start >= (signed)sizeof(S))
208           continue;
209
210        /* Strip out any spaces from the text, this undoes dpkgs reformatting
211           of certain fields. dpkg also has the rather interesting notion of
212           reformatting depends operators < -> <= */
213        char *I = S;
214        for (; Start != End; Start++)
215        {
216           if (isspace(*Start) == 0)
217              *I++ = tolower(*Start);
218           if (*Start == '<' && Start[1] != '<' && Start[1] != '=')
219              *I++ = '=';
220           if (*Start == '>' && Start[1] != '>' && Start[1] != '=')
221              *I++ = '=';
222        }
223
224        Result = AddCRC16(Result,S,I - S);
225     }
226
227     return Result;
228  }
=== >8 ===

As you can see on lines 193 to 200 and 203, the "Depends" and "Replaces"
sections are used to compute the "version hash". Note the use of "tolower" on
line 217. As I have shown above, the aforementioned packages (e2fsprogs,
initscripts and nfs-kernel-server) have the string "WIP" in their "Depends" or
"Replaces" header lines. When the string "WIP" is processed by the function
above in the Turkish locale, it gets converted to "wIp" by "tolower". (Why "I"
isn't converted to "i" is explained earlier in this message.) Hence, the
problem is caused by the use of the "tolower" function in the user's locale
(such as tr_TR.UTF-8) even though it only processes ASCII data: the resulting
version hash differs from the one computed in the C or an English locale.
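
To make the effect on the hashed buffer concrete, here is a simplified
sketch of the stripping loop. It is not apt's code: "normalize_field",
"ascii_lower" and "simulated_tr_lower" are hypothetical names, the rewriting
of the "<" and ">" operators is left out, and "simulated_tr_lower" merely
imitates the tr_TR.UTF-8 behaviour described in this message instead of
calling the real "tolower".

```c
#include <stddef.h>

/* ASCII-only lowering: what the code expects tolower() to do. */
int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Stand-in for tolower() under tr_TR.UTF-8 as described in this message:
   every ASCII capital is lowered except 'I', which comes back unchanged.
   This is a simulation for illustration, not the real libc function. */
int simulated_tr_lower(int c)
{
    return (c == 'I') ? c : ascii_lower(c);
}

/* Simplified version of the loop in VersionHash(): drop whitespace and
   lower-case everything else with the given function. "out" must have
   room for strlen(in) + 1 bytes. Returns the number of bytes written,
   not counting the terminating NUL. */
size_t normalize_field(const char *in, char *out, int (*lower)(int))
{
    size_t n = 0;
    for (; *in != '\0'; in++)
        if (*in != ' ' && *in != '\t' && *in != '\n')
            out[n++] = (char)lower((unsigned char)*in);
    out[n] = '\0';
    return n;
}
```

For the input "libblkid1 (<< 1.38+1.39-WIP-2005.12.10-2)", "ascii_lower"
gives "libblkid1(<<1.38+1.39-wip-2005.12.10-2)", while "simulated_tr_lower"
gives "libblkid1(<<1.38+1.39-wIp-2005.12.10-2)". Any checksum computed over
the two buffers will therefore differ.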

Let's reproduce this bug on any system, whether it runs in the Turkish locale
or not... (You need to have the Turkish locale generated/installed though.)

Note: I will attach the following script as well. Please note the backslashes,
which must be at the end of the lines starting with LANG= whenever the command
continues on the next line.

=== testapt.sh ===
#!/bin/bash

echo
echo "### Killing update-notifier"
sudo killall update-notifier

echo
echo "### Restarting update-notifier in tr_TR.UTF-8 locale"
LANG=tr_TR.UTF-8 LC_CTYPE=tr_TR.UTF-8 \
update-notifier --startup-delay 0 &

echo
echo "### Sleeping for 20 seconds (until update-notifier initializes)"
sleep 20

echo
echo "### Removing apt's cache files"
sudo rm -v -f /var/cache/apt/*.bin

echo
echo "### Running apt-get update in C locale"
LANG=C LC_ALL=C sudo apt-get update

echo
echo "### Re-installing a package in tr_TR.UTF-8 locale"
LANG=tr_TR.UTF-8 LC_CTYPE=tr_TR.UTF-8 LC_MESSAGES=en_US.UTF-8 \
sudo apt-get --yes --reinstall install eject

echo
echo "### Sleeping for a second"
sleep 1

echo
echo "### Showing the effects of the bug via"
echo "### an apt-get dist-upgrade in tr_TR.UTF-8 locale"
LANG=tr_TR.UTF-8 LC_CTYPE=tr_TR.UTF-8 LC_MESSAGES=en_US.UTF-8 \
sudo apt-get dist-upgrade

=== testapt.sh ===

Copy and paste these lines into a file named testapt.sh and
run "chmod +x testapt.sh", then run this file. Type in your
password, and let it roll. Wait patiently until it finishes.

After 30 seconds or so, you should see that there are 2 updates
waiting for you: e2fsprogs and initscripts. (If you have nfs-kernel-server
installed, then the number is 3, and nfs-kernel-server is in the list
of updated packages too.)

I have written two patches which fix this problem. Basically, both of
them make use of locale-independent "toupper()", "tolower()",
"strcasecmp()" and "stringcasecmp()" functions where needed, which
happens to be almost everywhere. These functions are written/copied
from GNU coreutils and are named "c_toupper()", "c_tolower()",
"c_strcasecmp()" and "c_stringcasecmp()". Their main difference from
their locale-dependent counterparts is that they always operate as
if they are in the "C" locale.
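
For readers who have not seen such functions, here is an illustrative sketch
of what locale-independent versions can look like. It reuses the names
mentioned above, but it is neither the code from my patches nor the code
from GNU coreutils; it only assumes an ASCII-compatible character set.

```c
#include <stddef.h>

/* Lower-case an ASCII capital letter; leave every other value alone. */
int c_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Upper-case an ASCII small letter; leave every other value alone. */
int c_toupper(int c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Case-insensitive comparison that ignores the current locale.
   Returns 0 if the strings are equal when ASCII case is ignored,
   a negative value if s1 sorts first, a positive value otherwise. */
int c_strcasecmp(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;
    int c1, c2;

    do {
        c1 = c_tolower(*p1++);
        c2 = c_tolower(*p2++);
    } while (c1 != '\0' && c1 == c2);

    return c1 - c2;
}
```

With these definitions, c_strcasecmp("wip", "WIP") returns 0 regardless of
the LANG or LC_CTYPE setting, because no locale data is consulted.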

Of the two patches, the first one is "small"; it only fixes the cause
of this particular bug and doesn't touch any other problematic
tolower/toupper/*casecmp lines. The second one is "full"; it replaces
most of the toupper/tolower/*casecmp functions in the source code of
apt with their locale-independent counterparts.

I am going to attach both of these patches; however, I would prefer to
have apt's maintainer apply the "full" patch - this way the whole code
becomes "Turkish-locale-ready". :^)

I have also rebuilt the "apt" package with the "full" patch applied.
(I am going to attach the rebuilt package too.)

If you install my patched "apt" package and after that run the
"testapt.sh" file again, you should see that there are no more
packages waiting for unnecessary updates.

I hope that my patch will get applied quickly.

Regards,

M. Vefa Bicakci

Note: Mr. Vogt: Is there any way that the full patch can get applied
to Debian's apt package too?

-- 
e2fsprogs initscripts volumeid Update Issue
https://bugs.launchpad.net/bugs/80248
