Hi,

On 1 March 2015 at 21:26, "Derick Rethans" <der...@php.net> wrote:
>
> Hey Joe,
>
> I think there are a few issues with the proposal, although I like the
> general idea. I've had the tab with the RFC open since October... but
> never looked at it until now :-/. So, a few comments:
>
> - UString as a name.
>
> I think I am going to prefer "Text" as a class name. Unicode (and
> intl/icu) have lots of operators acting on items containing unicode
> strings. But they are really pieces of text. For example sentences, word
> break iterators, etc. UString *feels* clunky, and not "standard". If
> it's going to be part of PHP core, then we should pick a "core" name. (I
> might prefer String, but that's going to cause a whole lot of issues
> obviously).
Isn't this "solved" if we use \php\String?

>
> - "Needs More Methods"
>
> I had a look at the API that it links to, and I miss operators like
> iterators. Over words, sentences, characters, etc. Basically the
> functionality of
> http://docs.php.net/manual/en/class.intlbreakiterator.php,
> http://docs.php.net/manual/en/class.intlrulebasedbreakiterator.php and
> http://docs.php.net/manual/en/class.intlcodepointbreakiterator.php
>
> I realize intl already implements this, but it's really beneficial to
> have for a "Text" class - especially for replacing functionality where
> people now loop over a string - with a character index.
>
> - "Not a full String API Replacement"
>
> I would certainly expect more from it than just the UnicodeString API.
> Perhaps not for a first iteration, but certainly for subsequent
> versions. Things like transliterations, and specifically iterators would
> be high on my list.
>
> - "Patch"
>
> toUpper/toLower, there is a missing one for toTitle
>
> - In the code's README:
>
> "Note: UString is interchangable with zend strings for method parameters
> and can be cast for output/conversion to zend strings"
>
> How does that work? And what would it convert to?
>
> - How are "characters" counted?
>
> Is a character a Code Point, or is a character a base character +
> combining diacritics? In the first form, A + ° is considered as two
> characters, in the second option, just one. For wordwrap, splice,
> substring, it is really important that only the *full sequence* is
> considered as a character. And hence, a character really should be the
> full sequence. The text in "charAt" seems to contradict that, and that
> is a mistake.
>
> In the original PHP 6 we didn't do that due to performance reasons, but
> that point is moot now as only people who opt into using "Text" will
> suffer from this.
>
> - "trim"
>
> What is a leading or trailing space? Is it just U+0020, or other Unicode
> defined space characters as well?
> (the no-break space, U+00A0, comes to mind here)
>
> - What is "UG(defaultpad)," about?
>
> - For the code:
>
>   - there is some interesting, non-standard whitespacing going on:
>
>     - { goes on next line after a func decl
>     - sometimes 4 spaces instead of a tab are used for indentation,
>
>   - Why is there no __toString() ?
>
>   - How can other extensions, not really making use of "Text", use their
>     strings (as UTF8 strings f.e.)
>
>
> cheers,
> Derick
>
>
> On Sat, 28 Feb 2015, Joe Watkins wrote:
>
> > Morning internals,
> >
> > This is just a quick note to announce my intention to ready this RFC
> > for voting next week.
> >
> > I know I'm a little late maybe, I was real sick most of last week, so
> > couldn't do anything useful.
> >
> > A couple of us intend to fix outstanding issues on github and those
> > raised here, tidy the RFC and open the vote for 7.
> >
> > I would ask anyone interested to scan through this thread and announce
> > concerns that are not mentioned asap.
> >
> > Cheers
> > Joe
> >
> > On Fri, Oct 24, 2014 at 3:01 PM, Chris Wright <daveran...@php.net> wrote:
> >
> > > On 24 October 2014 07:03, Joe Watkins <pthre...@pthreads.org> wrote:
> > >
> > >> On Thu, 2014-10-23 at 12:54 -0700, Stas Malyshev wrote:
> > >> > Hi!
> > >> >
> > >> > > P.S. u() is a bad name, will break lots of code, i.e.
> > >> >
> > >> > Maybe __u()? It's a bit ugly but you're not allowed to use __ so
> > >> > it's safe.
> > >> >
> > >>
> > >> /me cringes ...
> > >>
> > >> I wonder how much of a problem it really is, usually when we say some
> > >> function name is a problem it's because of hundreds and hundreds of
> > >> results on github.
> > >>
> > >> If it's a huge problem then we should rename it, if we have to dig
> > >> around for a single project that's incompatible, or even a handful, then
> > >> it's not really a problem.
> > >>
> > >> Cheers
> > >> Joe
> > >
> > >
> > > I can see this being something relatively common.
> > > While I personally would never do it, there are a few reasons I can
> > > think of that people *might* do it:
> > >
> > > - Wrapper for creating <u> HTML output
> > > - urlencode() shortcut
> > > - (obviously) various unicode-related things
> > >
> > > Searching on codesearch [1] revealed (amongst a few other hits on the
> > > first page) another interesting use of it in the hhvm test suite [2]. It's
> > > difficult to search for this because all the available public search
> > > engines that I know of do fuzzy matching.
> > >
> > > Sorry. This sucks, because every other option we have for this sucks.
> > >
> > > On the bright side, anything chosen could always be aliased at the top of
> > > the file:
> > >
> > >     use function __u as u;
> > >
> > > This also sucks, but it sucks a little bit less because the collisions are
> > > avoided - or at least, avoided in such a way that the onus is on the user -
> > > and one can still have the sane name.
> > >
> > > First-class support at the syntax level (presumably $foo = u"unicode
> > > string" since we already have $foo = b"binary string") would IMO be better
> > > and (hopefully?) a long-term goal, but I am aware that it is - and probably
> > > should be - outside the scope of the current proposal.
> > >
> > > [1] https://searchcode.com/?q=function+u+lang%3Aphp
> > > [2]
> > > https://github.com/facebook/hhvm/blob/master/hphp/test/slow/ext_icu/uspoof.php#L13
> > >
> >
>
> --
> http://derickrethans.nl | http://xdebug.org
> Like Xdebug? Consider a donation: http://xdebug.org/donate.php
> twitter: @derickr and @xdebug
> Posted with an email client that doesn't mangle email: alpine
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php

Cheers,
Florian Margaine
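
P.S. To make the "How are characters counted?" point quoted above concrete, here is a minimal sketch of the two counting rules. It is written in Python rather than PHP only so that it runs with a standard library alone. The function names are hypothetical and are not part of the UString patch. The second rule is a simplification: it treats every code point with a non-zero canonical combining class as part of the preceding base character, which is not full Unicode grapheme-cluster segmentation.

```python
import unicodedata


def count_code_points(text: str) -> int:
    """Count every code point, combining marks included."""
    return len(text)


def count_combining_sequences(text: str) -> int:
    """Count base characters only (hypothetical helper, simplified rule).

    A code point with a non-zero canonical combining class is assumed to
    attach to the base character before it, so it is not counted.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# "A" followed by U+030A COMBINING RING ABOVE, i.e. the "A + °" case.
sample = "A\u030A"

print(count_code_points(sample))          # 2
print(count_combining_sequences(sample))  # 1
```

Under the first rule a substring or wordwrap operation could split the ring from its base letter; under the second the pair is always handled as one unit, which is the behaviour Derick argues for.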