Tomohiro KUBOTA:

> $title =~ s/^#use .* title="(.+?)(" .*$|"$|\e.*$)/$1/;
>
> I think it should be modified as:
>
> $title =~ s/^#use .* title="(.+?)("\s.*$|"$)/$1/;

That does not work (that was my first attempt), because some Japanese pages have

  title="<switch to 0208>DBCS<switch to 0201>"<switch to ASCII><space>

and those were not matched properly. However, I seem to have left out a quotation mark in the regexp; it should read:

  $title =~ s/^#use .* title="(.+?)(" .*$|"$|"\e.*$)/$1/;
                                             ^

> (I also modified to use \s instead of 0x20 space because it can
> match tab. This is not related to the problem we are discussing
> now.)

That might be a good idea as well.

I can't commit a fix right now: the computer that holds my CVS checkout is currently disassembled, since I removed a failing CD-ROM drive. I'll try to fix it later today, though.

-- 
\\// peter - http://www.softwolves.pp.se/

  Statement concerning unsolicited e-mail according to Swedish law:
  http://www.softwolves.pp.se/peter/reklampost.html
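
P.S. A minimal sketch of how the patterns in this thread behave, in case it helps anyone reproduce the problem. The input lines are hypothetical: "template", FOO="bar" and the literal DBCS placeholder are made up for illustration. The escape sequences assume ISO-2022-JP encoding (ESC $ B, ESC ( J, ESC ( B). The "combined" variant simply merges the \s suggestion with the "\e alternative; it is not a committed fix.

```perl
use strict;
use warnings;

# ISO-2022-JP charset switches (assumed encoding), each starting with ESC (\e).
my $to_0208  = "\e\$B";   # switch to JIS X 0208
my $to_0201  = "\e(J";    # switch to JIS X 0201 Roman
my $to_ascii = "\e(B";    # switch to ASCII

# The substitutions under discussion, wrapped so they can be compared.
my %strip = (
    # Proposed in the quoted mail: \s instead of a space, no escape alternative.
    proposed  => sub { my $t = shift; $t =~ s/^#use .* title="(.+?)("\s.*$|"$)/$1/; $t },
    # Corrected version from this mail: quote followed by ESC also ends the title.
    corrected => sub { my $t = shift; $t =~ s/^#use .* title="(.+?)(" .*$|"$|"\e.*$)/$1/; $t },
    # Hypothetical merge of both ideas.
    combined  => sub { my $t = shift; $t =~ s/^#use .* title="(.+?)("\s.*$|"$|"\e.*$)/$1/; $t },
);

# Hypothetical input lines.
my $plain    = '#use template title="About" FOO="bar"';
my $japanese = '#use template title="' . $to_0208 . 'DBCS' . $to_0201
             . '"' . $to_ascii . ' FOO="bar"';

for my $name (sort keys %strip) {
    for my $line ($plain, $japanese) {
        # Make the ESC byte visible when printing.
        (my $shown = $strip{$name}->($line)) =~ s/\e/<ESC>/g;
        print "$name: $shown\n";
    }
}
```

With these inputs, the proposed pattern finds no whitespace directly after the closing quote of the Japanese title (an escape sequence comes first), so it runs on to the last quote on the line and keeps far too much. The corrected and combined patterns stop at the quote that precedes the escape.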