Hello,
I think that Rowan is right: PHP users need to manipulate grapheme clusters
first (and code points in some rare situations). The fact that most of us
live in a world where NFC composes all our characters only hides this
reality.
A typical use case is a template engine: nearly all string man
On 15/10/14 15:58, Rowan Collins wrote:
Rowan,
What confuses me is that I think you're seeing it as a major
implementation defect. To avoid arguing over implementations, I've made
a short example in Java:
System.out.println(new StringBuffer("noël").reverse().toString());
It does produce string
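For comparison, here is a minimal sketch of the same reversal done per user-perceived character rather than per UTF-16 code unit. It assumes the string is stored in decomposed form ("e" followed by U+0308) and uses the JDK's `java.text.BreakIterator` as an approximation of grapheme clusters; the class and method names are hypothetical and not taken from any message in this thread.

```java
import java.text.BreakIterator;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;

final class GraphemeReverseSketch {
    // Reverse a string cluster by cluster, so that a base letter and its
    // combining marks stay together.
    static String reverseGraphemes(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(s);
        Deque<String> clusters = new ArrayDeque<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            clusters.push(s.substring(start, end)); // push() prepends, which reverses the order
        }
        return String.join("", clusters);
    }

    public static void main(String[] args) {
        String decomposed = "noe\u0308l"; // "e" + COMBINING DIAERESIS (U+0308)
        // Code-unit reversal: the diaeresis ends up on the "l".
        System.out.println(new StringBuffer(decomposed).reverse());
        // Cluster-aware reversal: the diaeresis stays on the "e".
        System.out.println(reverseGraphemes(decomposed));
    }
}
```

With the precomposed spelling (U+00EB) both approaches give the same result, which is why the difference is easy to miss.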
Aleksey Tulinov wrote (on 15/10/2014):
On 15/10/14 10:04, Rowan Collins wrote:
Rowan,
As I said at the top of my first post, the important thing is to capture
what those requirements actually are. Just as you'd choose what array
functions were needed if you were adding "array support" to a lan
On 15/10/14 10:04, Rowan Collins wrote:
Rowan,
As I said at the top of my first post, the important thing is to capture
what those requirements actually are. Just as you'd choose what array
functions were needed if you were adding "array support" to a language.
I'm sorry for not making mysel
> Good point. That's what I meant by a border-line case. Could you possibly
> point me to a specific example of such a false positive? I'm interested in
> a well-formed UTF-8 string. I believe the "noël" test is ill-formed UTF-8 and
> doesn't conform to the shortest-form requirement.
You're confusing two co
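The quoted claim can be checked with a short sketch. It assumes the "noël" test means the decomposed spelling ("e" followed by U+0308), and it relies on the fact that UTF-8's shortest-form rule applies to the byte encoding of each code point, not to whether a character is precomposed. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8CheckSketch {
    // True if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isWellFormedUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] decomposed = "noe\u0308l".getBytes(StandardCharsets.UTF_8);
        byte[] overlong = {(byte) 0xC0, (byte) 0xAF}; // non-shortest encoding of "/"
        System.out.println(isWellFormedUtf8(decomposed)); // decomposed text is still valid UTF-8
        System.out.println(isWellFormedUtf8(overlong));   // overlong sequence is rejected
    }
}
```

In this sketch the decomposed string passes the well-formedness check, while the overlong two-byte sequence is the kind of input that the shortest-form requirement rules out.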
On 15/10/14 00:04, Rowan Collins wrote:
Rowan,
Back to combining characters, I dig the idea of introducing graphemes,
but I think a French person would write the word "noël" using a precomposed
character. I'm using the French keyboard at
https://translate.google.com/#fr/. "ë" is Shift + "^", then "e", it
p
On 14/10/14 23:48, Johannes Schlüter wrote:
On Tue, 2014-10-14 at 23:18 +0300, Aleksey Tulinov wrote:
Very good point. I'll give another example: is there a substring "s" in
the string "Maße"? If it's a case-sensitive search, then there is no such
substring, but if it's a case-insensitive search, then
On 14/10/14 10:04, Aleksey Tulinov wrote:
> 1. Is there a need for more Unicode support in PHP?
> 2. What is currently missing in that regard?
> 3. Is this a good place to ask such questions?
I need to ask ...
Is this discussion only about improving support for UTF8 content in PHP?
What is the cu
On 14/10/2014 20:51, Andrea Faulds wrote:
If you want the length in characters, you probably need to implement your own
algorithm, as it really depends on your specific use case.
I disagree, Unicode has very well-defined algorithms for these things,
and the average PHP developer (or even PHP fram
On 14/10/2014 21:18, Aleksey Tulinov wrote:
Back to combining characters, I dig the idea of introducing graphemes,
but I think a French person would write the word "noël" using a precomposed
character. I'm using the French keyboard at
https://translate.google.com/#fr/. "ë" is Shift + "^", then "e", it
pro
On Tue, 2014-10-14 at 23:18 +0300, Aleksey Tulinov wrote:
> Very good point. I'll give another example: is there a substring "s" in
> the string "Maße"? If it's a case-sensitive search, then there is no such
> substring, but if it's a case-insensitive search, then "ß" folds into "ss"
> and substring "s"
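The "Maße" question can be sketched in a few lines of Java. Note the assumption: the JDK has no dedicated case-folding call, so this sketch uses `String.toUpperCase(Locale.ROOT)` as a stand-in, which maps "ß" to "SS"; proper Unicode case folding (as provided by ICU, for example) is a separate algorithm. The class and method names are hypothetical.

```java
import java.util.Locale;

final class CaseSearchSketch {
    // Approximate a case-insensitive search by uppercasing both sides.
    // This is a stand-in for real case folding, not an implementation of it.
    static boolean containsIgnoringCase(String haystack, String needle) {
        return haystack.toUpperCase(Locale.ROOT).contains(needle.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String word = "Ma\u00DFe"; // "Maße", with U+00DF LATIN SMALL LETTER SHARP S
        System.out.println(word.contains("s"));              // case-sensitive search
        System.out.println(word.toUpperCase(Locale.ROOT));   // full case mapping expands the sharp s
        System.out.println(containsIgnoringCase(word, "s")); // search after expansion
    }
}
```

The case-sensitive search finds nothing, while the search after expansion does, so the answer depends on which comparison the caller actually asked for.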
On 14/10/14 21:01, Rowan Collins wrote:
Rowan,
As I've mentioned before, a lot of the time what people actually want to
deal with is "grapheme clusters" - the kind of thing that you'd think of
as a character if you were writing by hand. Most people, if asked the
length of the string "noël", wo
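To make the different notions of "length" concrete, here is a minimal sketch that measures both spellings of "noël" in UTF-8 bytes, UTF-16 code units, code points, and grapheme clusters. It uses `java.text.BreakIterator` as an approximation of grapheme clusters; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.text.BreakIterator;
import java.text.Normalizer;
import java.util.Arrays;
import java.util.Locale;

final class LengthSketch {
    // Count user-perceived characters by walking character boundaries.
    static int graphemeCount(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(s);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    // Returns {UTF-8 bytes, UTF-16 code units, code points, grapheme clusters}.
    static int[] lengths(String s) {
        return new int[] {
            s.getBytes(StandardCharsets.UTF_8).length,
            s.length(),
            s.codePointCount(0, s.length()),
            graphemeCount(s)
        };
    }

    public static void main(String[] args) {
        String precomposed = "no\u00EBl"; // U+00EB
        String decomposed = "noe\u0308l"; // "e" + U+0308
        System.out.println(Arrays.toString(lengths(precomposed))); // [5, 4, 4, 4]
        System.out.println(Arrays.toString(lengths(decomposed)));  // [6, 5, 5, 4]
        // NFC normalization turns the decomposed spelling into the precomposed one.
        System.out.println(Normalizer.normalize(decomposed, Normalizer.Form.NFC).equals(precomposed));
    }
}
```

Only the grapheme count is the same for both spellings, which matches the intuition that the word has four characters.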
On 14 Oct 2014, at 19:01, Rowan Collins wrote:
>
>> If you want to see a pragmatic, actually working, work-in-progress attempt
>> at better PHP unicode support, see this: https://github.com/krakjoe/ustring
>
> It looks like a good prototype, but glancing at the documentation, I'm not
> clear
On 14/10/2014 14:50, Andrea Faulds wrote:
2. What is currently missing in that regard?
Unicode string support.
I know that was probably deliberately flippant, but I think there is a
genuine question to be asked here. A lot of people talk about "Unicode
support" like they talk about "XPath su
On 14/10/14 16:50, Andrea Faulds wrote:
If you want to see a pragmatic, actually working, work-in-progress attempt at
better PHP unicode support, see this: https://github.com/krakjoe/ustring
It would add a UString class to PHP for Unicode strings. This would make
Unicode text manipulation muc
On 14 October 2014 16:09, Aleksey Tulinov wrote:
> On 14/10/14 14:00, Chris Wright wrote:
>
> Chris,
>
>>> Latter is referring to difficulties like "excess memory usage" and "rewrite
>>> the language". I'm developing an open-source Unicode implementation library
>>> (nunicode), and it does
On 14/10/14 14:00, Chris Wright wrote:
Chris,
Latter is referring to difficulties like "excess memory usage" and "rewrite
the language". I'm developing an open-source Unicode implementation library
(nunicode), and it doesn't consume any heap at all, it also works on native
binary strings, as PH
On 14 Oct 2014, at 10:04, Aleksey Tulinov wrote:
> I would appreciate if someone would point me to a good read or explain
> collective opinion on this topic. I'm basically interested in the following
> questions:
>
> 1. Is there a need for more Unicode support in PHP?
Yes.
> 2. What is curr
On 14 October 2014 10:04, Aleksey Tulinov wrote:
> Hey,
>
> I can't find any recent discussion on this topic in this mailing list; I
> think the closest one is
> http://grokbase.com/t/php/php-internals/143b6aevsp/unicode-strings. I was
> also reading articles like this one:
> http://www.infoworld.co
Hey,
I can't find any recent discussion on this topic in this mailing list; I
think the closest one is
http://grokbase.com/t/php/php-internals/143b6aevsp/unicode-strings. I
was also reading articles like this one:
http://www.infoworld.com/article/2618358/application-development/php-5-4-emerges-
Hello all.
Attached is the patch that adds Unicode support to the *printf() family of functions.
We (Andrei and I) made several assumptions that are worth mentioning:
sprintf() and vsprintf():
- use runtime_encoding when dealing with Unicode data.
printf() and vprintf():
- the result data is conver