Hi,

At Fri, 6 Jul 2001 18:53:57 +0200 (CEST),
peter karlsson <[EMAIL PROTECTED]> wrote:

> I have committed a fix now. It seems to work on my local machine (I
> can't read Japanese, but I can see that there is no mis-encoding left).

Thanks.  I checked.

I found that many items read only "Debian".  These pages have titles of
the form "Debian <something Japanese>", which in bytes is "Debian
<esc><JIS X 0208 specifier string><JIS X 0208 literal><esc><ASCII
specifier string>".  Thus, the first <esc> matches the \e.*$ alternative
in the regexp and ends $title there.
(Note that the second <esc> cannot end $title either.  Well, <esc> cannot
be an end sign in any )

  $title =~ s/^#use .* title="(.+?)(" .*$|"$|\e.*$)/$1/;

I think it should be changed to:

  $title =~ s/^#use .* title="(.+?)("\s.*$|"$)/$1/;

I tested it locally (as a standalone Perl script), and it works well
for such pages.
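
A minimal sketch of such a standalone test follows.  The sample line is
an assumption for illustration only: the template name and the JIS X 0208
bytes between the two escape sequences are placeholders, and old_title()
and new_title() are hypothetical helper names wrapping the two
substitutions quoted above.

```perl
use strict;
use warnings;

# Placeholder ISO-2022-JP fragment: ESC $ B switches to JIS X 0208,
# ESC ( B switches back to ASCII.  The bytes in between are illustrative.
my $jis  = "\e" . '$B' . 'F|K\\8l' . "\e" . '(B';

# Hypothetical "#use" line with a title of the form "Debian <Japanese>".
my $line = '#use wml::debian::template title="Debian ' . $jis . '"';

# Current substitution: an <esc> may end the title.
sub old_title {
    my ($title) = @_;
    $title =~ s/^#use .* title="(.+?)(" .*$|"$|\e.*$)/$1/;
    return $title;
}

# Proposed substitution: only a closing quote may end the title.
sub new_title {
    my ($title) = @_;
    $title =~ s/^#use .* title="(.+?)("\s.*$|"$)/$1/;
    return $title;
}

# Print both results with <esc> made visible.
for my $result (old_title($line), new_title($line)) {
    (my $shown = $result) =~ s/\e/<esc>/g;
    print "[$shown]\n";
}
```

With the current regexp the lazy (.+?) stops just before the first
<esc>, so only "Debian " (with a trailing space) survives.  With the
proposed regexp the capture runs up to the closing quote and keeps both
escape sequences.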

(I also changed the literal 0x20 space to \s so that a tab after the
closing quote matches as well.  This is not related to the problem we
are discussing now.)
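
The difference between the literal space and \s can be sketched as
follows.  The input line is hypothetical: the template name and the
FOO="bar" attribute are made up, and a tab is assumed to separate the
title from the next attribute.  with_space() and with_class() are
illustrative helper names.

```perl
use strict;
use warnings;

# Hypothetical line: a tab (not a space) follows the closing quote of title.
my $line = qq{#use wml::debian::template title="Debian"\tFOO="bar"};

# Variant with a literal 0x20 space after the closing quote.
sub with_space {
    my ($title) = @_;
    $title =~ s/^#use .* title="(.+?)(" .*$|"$)/$1/;
    return $title;
}

# Variant with \s, which matches a space or a tab.
sub with_class {
    my ($title) = @_;
    $title =~ s/^#use .* title="(.+?)("\s.*$|"$)/$1/;
    return $title;
}

# Print both results with the tab made visible.
for my $result (with_space($line), with_class($line)) {
    (my $shown = $result) =~ s/\t/<tab>/g;
    print "[$shown]\n";
}
```

With the literal space, the quote followed by a tab is not accepted as
the end of the title, so the capture runs on to the last quote of the
line.  With \s the title ends at its own closing quote.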

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/

