Hi,

At Fri, 6 Jul 2001 17:21:34 +0200, Josip Rodin <[EMAIL PROTECTED]> wrote:
>> my $title = `egrep '^#use .* title=' $page `; chomp $title;
>> $title =~ s/^#use .* title="([^"]+)".*$/$1/;
> I suppose we could just change that regexp to match everything after the
> opening double quote up to the next space, and strip off the ending quote.

Nice idea. Just one improvement: I think some titles may include
whitespace. Thus, the end of the title should be detected as a double
quote byte immediately followed by a whitespace or linefeed byte.

Since an ISO-2022-JP string cannot end in the JIS X 0208 shift state (where
0x22 may appear as part of a Japanese character), 0x22 cannot be the
last byte of the title string. Moreover, since the byte range of
JIS X 0208 is 0x21 - 0x7e, a 0x22 byte followed by a whitespace byte
cannot occur within JIS X 0208 text in an ISO-2022-JP string.

Could a CVS committer please implement this in
webwml/english/sitemap.wml?

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"
http://www.debian.org/doc/manuals/intro-i18n/
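
For illustration only, the end-of-title rule described above (a double
quote followed by whitespace or the end of the line) could be sketched in
Perl roughly as follows. The helper name extract_title, the sample #use
lines, and the NOHEADER attribute are hypothetical and not taken from
sitemap.wml. The pattern also assumes the title never contains an ASCII
double quote followed by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not from sitemap.wml): return the title of a
# '#use ... title="..."' line. The title ends at the first double quote
# that is followed by whitespace or by the end of the line, so a 0x22
# byte inside a JIS X 0208 character does not cut the title short.
sub extract_title {
    my ($line) = @_;
    chomp $line;
    return $1 if $line =~ /^#use .* title="(.*?)"(?:\s|\z)/;
    return;    # no title found
}

# Hiragana "a" (JIS X 0208 0x2422) in ISO-2022-JP:
# ESC $ B, 0x24 0x22, ESC ( B. Note the 0x22 byte in the middle.
my $jis_title = "\x1b\x24\x42\x24\x22\x1b\x28\x42";

# Sample lines (illustrative only).
my $plain = extract_title(qq{#use wml::debian::template title="Debian Web Site Map"\n});
my $jis   = extract_title(qq{#use wml::debian::template title="$jis_title" NOHEADER="yes"\n});
```

With the original [^"]+ pattern, the second sample would be cut off at the
0x22 byte inside the Japanese character. Here the match continues until the
closing quote that is followed by a space.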