Tomohiro KUBOTA:
> Imagine an ISO-2022-JP string has a JIS X 0208 part and following
> ASCII part. When the JIS X 0208 part ends with 0x22, it matches "\e
> and thus the regexp will fail.

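For illustration, the case described in the quote can be reproduced with a short Python sketch. The actual regexp under discussion is not shown in this message, so the `QUOTED` pattern, the `quoted_value` helper and the `title="..."` line format below are assumptions made up for the example. The character "あ" is 0x24 0x22 in JIS X 0208, so its second byte is the same as an ASCII double quote and is immediately followed by the escape back to ASCII. EUC-JP is included only for comparison.

```python
import re

# "あ" is 0x24 0x22 in JIS X 0208: its second byte equals ASCII '"'.
title = "あ"

iso = title.encode("iso2022_jp")  # ESC $ B, 0x24 0x22, ESC ( B
euc = title.encode("euc_jp")      # 0xA4 0xA2, both bytes above 0x7F

# In ISO-2022-JP the 0x22 byte is directly followed by ESC (0x1B).
has_quote_esc_iso = b'"\x1b' in iso
# In EUC-JP no byte of the character can be mistaken for '"'.
has_quote_euc = b'"' in euc

# Hypothetical byte-level pattern for a double-quoted value
# (an assumption; not the regexp from the original scripts).
QUOTED = re.compile(rb'"([^"]*)"')

def quoted_value(line: bytes) -> bytes:
    """Return the first double-quoted value in line, or b"" if none."""
    m = QUOTED.search(line)
    return m.group(1) if m else b""

iso_line = b'title="' + iso + b'"'
euc_line = b'title="' + euc + b'"'

# The ISO-2022-JP value is cut short at the 0x22 byte inside the
# JIS X 0208 part; the EUC-JP value is extracted intact.
iso_ok = quoted_value(iso_line) == iso
euc_ok = quoted_value(euc_line) == euc
```

In this sketch the match on the ISO-2022-JP line stops in the middle of the two-byte character, while the EUC-JP line is matched as intended, because a pattern that works on bytes has no notion of the current shift state.
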
Yes, I am aware of that, but since regular expressions are not powerful enough to handle every possible combination of escape sequences and character data in a stateful encoding, I can't do it entirely the "right" way. *If* the problem arises, we'll have to implement a special case for parsing the Japanese titles. (Of course, if the Japanese pages had used a stateless encoding, such as EUC-JP, this wouldn't have been a problem.)

-- 
\\// peter - http://www.softwolves.pp.se/

Statement concerning unsolicited e-mail according to Swedish law:
http://www.softwolves.pp.se/peter/reklampost.html