uot; feature, it can be tricky to predict where you're
> going to end up, because the point you're starting at depends on what
> kind of text you've been reading, not just the number of bytes you've
> read.
>
> Is that making any sense? I posted a later code examp
On zaterdag 25 april 2020 21:51:41 CEST Joseph Brenner wrote:
> > Yary has an issue posted regarding 'display-width' of UTF-16 encoded
> > strings:
> > https://github.com/rakudo/rakudo/issues/3461
> >
> > I know it might be far-fetched, but what if your UTF-8 issue and
>
> Yary's UTF-16 issue wer
So MoarVM uses its own database of the UCD. One nice thing is that this can
probably be faster than calling into ICU to look up information for each
codepoint in a long string. Secondly, it implements its own text data
structures, so ICU's nice features for working with text would be difficult
to use.
I bisected MoarVM and the offending commit is here:
https://github.com/MoarVM/MoarVM/commit/c98634cf2542874d7daa5b45f77f7de4cf04a081
From what I see, this commit did not actually cause the root bug, it just
exposed it.
The Unicode Database was rebuilt so that NFG_QC=False for Emoji characters,
Actually it has nothing to do with <$a>, and this triggers it as well:
m: ' ' ~~ m:s/ /;
OUTPUT«===SORRY!=== Error while compiling Null regex not allowed at
:1--> ' ' ~~ m:s/ ⏏/;»
If this is indeed a bug, this should probably be renamed.
https://design.perl6.org/S05.html Reading this again, it seems that leading
whitespace is ignored. It says:
"The new :s (:sigspace) modifier causes certain whitespace sequences to be
considered "significant"; they are replaced by a whitespace matching rule,
<.ws>. Only whitespace sequences immed
It looks like, according to the Unicode grapheme rules, ‘degenerates’ do not
have to be accounted for in order to support the spec.
> Ignore degenerates. No special provisions are made to get marginally better
behavior for degenerate cases that never occur in practice, such as an A
followed by an Indi
I have fixed this in https://github.com/MoarVM/MoarVM/pull/469
There are already tests, but once this is accepted this issue should be
closed. Cheers.
What is going on here is not a bug in string repetition, but a bug in
converting from List to a Str object.
say ("\c[REGIONAL INDICATOR SYMBOL LETTER G]" xx 2).elems #> 2
say ("\c[REGIONAL INDICATOR SYMBOL LETTER G]" xx 2)[0].ords #> 127468
say ("\c[REGIONAL INDICATOR SYMBOL LETTER G]" xx 2)[0].c
I have fixed it on the JVM as of NQP commit:
# Fix RT #117683 on JVM \c[LINE FEED] \c[CARRIAGE RETURN]
#Also fixes \c[NEXT LINE] as well.
https://github.com/perl6/nqp/commit/0c249e7236a63325e6440df55a762a4378e6e63a
Fixed on MoarVM as of MoarVM commit:
# Fix RT #117683 \c[LINE FEED] \c[CARRIAGE
This has been fixed on MoarVM as of
https://github.com/MoarVM/MoarVM/commit/816186484b5cc52f9ff1be6afa3b6f49264335bf
BELL now resolves to 🔔 U+1F514 on MoarVM, but this is still broken on the JVM
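A small Raku sketch for checking which codepoint the name BELL resolves to on a
given backend. The U+1F514 result is what the message above reports for MoarVM;
other backends may differ, as noted for the JVM. The variable name $bell is
only illustrative.

```raku
my $bell = "\c[BELL]";            # named-character escape under test
say $bell.ord.fmt('U+%04X');      # reported above as U+1F514 on MoarVM
say "\x[07]".ord.fmt('U+%04X');   # the control character, addressed by codepoint instead
```

Addressing the control character by codepoint avoids depending on how the name
is resolved.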
This seems a little different than
https://rt.perl.org/Ticket/Display.html?id=130483
Digit resolves to the Numeric_Type property, whose uniprop-int value is 0 for
non-numbers.
<:Digit> and <:Numeric_Type> both match everything. Will need more
investigation.
On Saturday, 14 January 2017 02.06.57 PST you wrote:
> > BELL now resolves to 🔔 U+1F514 on MoarVM, but this is still broken on the
> > JVM
>
> What causes this kind of difference?
>
>
>
U+0007's Unicode 1 name was BELL, and with version 2 the name was removed.
Unicode 1 names are essentiall
There is a new roast test, S15-nfg/emoji-test.t
We used to fail 1500+/1943 tests, but now we only fail 275 tests. Will keep
this open until we are passing all the Emoji tests which contain ZWJ characters
On Wednesday, 25 January 2017 01.45.59 PST you wrote:
> On Tue, 24 Jan 2017 23:15:32 -0800, samant...@posteo.net wrote:
> > CODE:
> > my Seq $thing = (1,3,4).Seq; $thing.iterator; $thing.iterator
> >
> > STDERR:
> > This Seq has already been iterated, and its values consumed
> > (you might solve t
The second code should have been:
$_ = (' a ', ' b '); .».trim.perl.say
which does have the error message.
On Mon, 08 Aug 2016 17:34:57 -0700, timo wrote:
> to be more precise, the way we code-gen "literal" qregex nodes with
> subtype "ignoremark+ignorecase" will only ever check the ordbaseat of
> the first character in the literal against the haystack.
>
This has been fixed as of https://github.com/
> Result:
> Malformed UTF-8 at line 1 col 1029
> in block at ./test.pl:2
This bug has been open a while. I have not forgotten it; I had just not reached
a final decision. After further thought I'm closing this WONTFIX. It would
needlessly complicate our grapheme concatenation and in addition I believe it
may break some of the grapheme concatenation tests.
I have fixed this as of this MoarVM commit:
https://github.com/MoarVM/MoarVM/commit/712cff3341270362b808ba0f4c519f4557a4671d
Full explanation in the commit description.
Thanks a lot for reporting this bug :)
On Thu, 07 Sep 2017 09:52:07 -0700, sml...@gmail.com wrote:
> On Wed, 06 Sep 2017 15:20:17 -0700, coke wrote:
> > With a recent rakudo, these now both output 1
>
> Bisectable shows that it was fixed during recent MoarVM changes:
>
> https://gist.github.com/Whateverable/01a82d07e8009c7beffe5893432
On Fri, 14 Oct 2016 11:06:54 -0700, c...@cpan.org wrote:
> Cf
>
> say ("\c[REGIONAL INDICATOR SYMBOL LETTER G]" x 2).chars #=> 2
>
> vs
>
> say ([~] "\c[REGIONAL INDICATOR SYMBOL LETTER G]" xx 2).chars #=> 1
>
What is going on here is not a bug in string repetition, but a bug in
conv
Actually this is not a bug at all, and it is not limited to those characters.
If you do ('a' xx 2).chars you will get 3 as well, because the list is
stringified with a space between its elements. If you want to join the list
after you create it:
say ('a' xx 2).join.chars #> 2
Rejecting.
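
A minimal Raku sketch of the distinction described above, assuming a current
Rakudo: a List stringifies with a single space between its elements, so .chars
counts the separator, while .join concatenates with no separator. The variable
name @list is only illustrative.

```raku
my @list = 'a' xx 2;

say @list.elems;        # 2 (two elements)
say @list.Str;          # a a
say @list.Str.chars;    # 3 (two letters plus the separating space)
say @list.join.chars;   # 2 (no separator)
```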