Hi!

> For me, the difference is that I expect further work to be done on
> improving ICU, while I lack that confidence for mbstring. If the API

My experience over the years has been that established, well-supported
libraries like ICU usually have a better track record of improvement and
maintenance than more niche libraries, but it differs a lot from case to
case. I have no idea, though, how good or bad ICU is at detecting Asian
languages and encodings.

>> Developers should not rely on encoding detector, but they should validate
>> encoding.
>>
> I think everyone agrees on that. :)

True, but also incomplete. There's the ideal case, and there's the real
world. In the ideal case, you know the encoding of everything, and
everything is nicely specified and shiny, and rainbows and unicorns
abound. Real data, though, is messy and unpredictable, and comes from
places and practices that make one shudder. And when it comes to that,
we can either give developers at least something (an imperfect encoding
detector, with all its caveats) or ignore the problem and give them
nothing because it does not match our theories, leaving them to
implement even worse hacks. I think the former is a much better approach.

And of course, detection and validation are different things. A text may
look like a valid string in encoding A but actually be in encoding B.
"Tell me if this data looks like Russian text in KOI-8 or Japanese text
in Shift-JIS" and "tell me if this is valid or invalid UTF-8" are two
completely different tasks.

-- 
Stas Malyshev
smalys...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
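
P.S. To make the detection-versus-validation point concrete, here is a minimal sketch in Python rather than PHP, using only the standard codecs. The helper name `is_valid` and the sample word are illustrative assumptions, not part of mbstring or ICU. It shows one byte string being rejected as UTF-8 while passing validation in several legacy encodings at once.

```python
# Illustrative sketch: `is_valid` is a hypothetical helper, not a library API.

def is_valid(data: bytes, encoding: str) -> bool:
    """Validation: report whether `data` decodes strictly in `encoding`."""
    try:
        data.decode(encoding)  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# The Russian word "привет" ("hello"), encoded as KOI8-R.
koi8_bytes = "привет".encode("koi8_r")

# Validate the same bytes against several candidate encodings.
candidates = ["utf-8", "koi8_r", "cp1251", "shift_jis"]
results = {name: is_valid(koi8_bytes, name) for name in candidates}

for name, ok in results.items():
    print(f"{name:10} valid={ok}")

# Encodings in which the bytes are structurally valid. Validation alone
# cannot say which of these the author actually used.
plausible = [name for name, ok in results.items() if ok]
print("structurally valid in:", plausible)

# Decoding with the wrong-but-valid encoding yields readable-looking garbage.
print(koi8_bytes.decode("cp1251"))
```

With CPython's bundled codecs, the UTF-8 check should fail and the other three should pass, since these bytes also happen to be legal Windows-1251 letters and Shift-JIS half-width katakana. Choosing among the survivors is the detection problem, and it needs a guess about language and letter statistics, not a validity check.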