On Sat, 18 Jan 2020 at 13:55, Tobias Boege (<t...@taboege.de>)
wrote:

> On Sat, 18 Jan 2020, JJ Merelo wrote:
> > The example works perfectly, and it does because it's a string literal
> > which is already 0 terminated. Let's use this code instead of the one that
> > I used in my other mail about this (which you probably didn't read anyway):
> >
> > 8< 8< 8<
> >
> > What does this mean? It means that NativeCall does the right call
> > (badum-tssss) and converts a Raku string literal into a C string literal,
> > inserting the null termination even if we didn't. I actually don't care if
> > it was the NativeCall API or the encode method. It just works. It gets
> > allocated the right amount of memory, it gets passed correctly into the C
> > realm. Just works. Since @array.elems has 3 elements, well, it might be
> > rather the C part the one that does that. But I really don't care, and it
> > does not really matter, and thus the example is correct, no need to add
> > anything else to the documentation. Except maybe "get your C right"
> >
>
> What you say seems to be correct: if you have a string literal in your
> Raku code, this works for me, too. (Sometimes, see below.)
>
> BUT the terminating NUL character is not inserted by NativeCall and it
> isn't inserted by &encode. If you run this program which uses a much
> longer string that is not a literal on the face of it:
>
>     use NativeCall;
>     sub c-printf(CArray[uint8]) is native is symbol<printf> { * };
>
>     my $string = "X" x 1994;
>     my $array = CArray[uint8].new($string.encode.list);
>     c-printf $array;
>
> through valgrind, it will warn you about a one-character invalid read,
> that is a byte accessed by printf() which is at offset 0 after a properly
> allocated block of size 1994:
>
>     $ perl6-valgrind-m -MNativeCall -e 'sub c-printf(CArray[uint8]) is native is symbol<printf> { * }; my $string = "X" x 1994; my $array = CArray[uint8].new($string.encode.list); c-printf $array' >/dev/null
>
>     ==325957== Invalid read of size 1
>     ==325957==    at 0x48401FC: strchrnul (vg_replace_strmem.c:1395)
>     ==325957==    by 0x50CD334: __vfprintf_internal (in /usr/lib/libc-2.30.so)
>     ==325957==    by 0x50BA26E: printf (in /usr/lib/libc-2.30.so)
>     ==325957==    by 0x4B58048: ??? (in $rakudo/install/lib/libmoar.so)
>     ==325957==    by 0x1FFEFFFB5F: ???
>     ==325957==    by 0x4B57F81: dc_callvm_call_x64 (in $rakudo/install/lib/libmoar.so)
>     ==325957==    by 0x50BA1BF: ??? (in /usr/lib/libc-2.30.so)
>     ==325957==    by 0xA275E3F: ???
>     ==325957==    by 0x990153F: ???
>     ==325957==    by 0xA275E3F: ???
>     ==325957==    by 0x1FFEFFFB7F: ???
>     ==325957==    by 0x4B578D1: dcCallVoid (in $rakudo/install/lib/libmoar.so)
>     ==325957==  Address 0xb5ebf1a is 0 bytes after a block of size 1,994 alloc'd
>     ==325957==    at 0x483AD7B: realloc (vg_replace_malloc.c:836)
>     ==325957==    by 0x4A9DFDF: expand.isra.3 (in $rakudo/install/lib/libmoar.so)
>     ==325957==    by 0x4A9E6F4: bind_pos (in $rakudo/install/lib/libmoar.so)
>     ==325957==    by 0x4A2C9AF: MVM_interp_run (in $rakudo/install/lib/libmoar.so)
>     ==325957==    by 0x4B2CC24: MVM_vm_run_file (in $rakudo/install/lib/libmoar.so)
>     ==325957==    by 0x109500: main (in $rakudo/install/bin/perl6-m)
>
> This is the NUL byte that happens to be there and terminate our string
> correctly, but nothing in the moarvm process has allocated it: tracking
> what is allocated and what isn't is exactly valgrind's job. And if it's
> not allocated, then either moarvm routinely writes NULs to memory it
> doesn't own or it simply does not automatically insert a NUL after the
> storage of every CArray[uint8]. And why would it? I for one would not
> expect CArray[uint8] to have special precautions built in for when it's
> used to hold a C string.
>
> Why does it work with a string literal in the Raku code? I don't know,
> but consider the following variations of the code, with my oldish Rakudo:
>
>   - with $string = "X" x 1994:     valgrind sad
>   - with $string = "X" x 4:        valgrind sad
>   - with $string = "X" x 3:        valgrind happy
>   - with $string = "X" x 2:        valgrind happy
>   - with $string a short literal
>     like "FOO":                    valgrind happy
>   - with $string a literal of
>     sufficient length like "FOOO": valgrind sad
>   - with $string = "FOO" x 2:      valgrind happy
>   - with $string = "FOO" x 200:    valgrind sad
>
> My guess is that if it's sufficiently small and easy, then it is computed
> at compile-time and stored somewhere in the bytecode, for which some
> serialization routine ensures a trailing NUL byte inside an allocated
> region of memory for the process.
>
> That is only a barely informed guess, but independently of what causes
> it to work on short string literals, I strongly believe that this is an
> implementation detail and hence I would call the example in the docs
> misleading. Appending the ", 0" when constructing the CArray[uint8]
> seems like a really neat fix.
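
A minimal sketch of that ", 0" fix, for illustration only. It binds strlen() instead of printf() so the result can be checked without looking at stdout; the name c-strlen and the choice of strlen() are assumptions made for this example and are not taken from the thread or the docs.

```raku
use NativeCall;

# Hypothetical binding for this sketch: strlen() reads bytes until it
# finds a NUL, so it depends on the terminator just like printf() does.
sub c-strlen(CArray[uint8] --> size_t) is native is symbol<strlen> { * }

my $string = "X" x 1994;

# Append an explicit 0 so that the terminator lives inside the memory
# that the CArray itself allocated.
my $array = CArray[uint8].new($string.encode.list, 0);

say $array.elems;      # 1994 data bytes plus the terminator
say c-strlen($array);  # counts the bytes before the terminator
```

With the trailing 0 in place, the C side no longer has to rely on whatever byte happens to follow the allocated block.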
>
> Anyway, the most important message of this mail should actually be:
> _Normally_, the way you pass a Raku string to a native function is by
> using the `is encoded` trait. It's such a nice way offered by the
> *language* to tell NativeCall to "do the right thing":
>
>   sub c-printf(Str is encoded<ascii>) is native is symbol<printf> { * }
>
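
As a rough sketch of the `is encoded` approach on the same 1994-character string: the binding below uses strlen() rather than printf(), and the parenthesized `is encoded('utf8')` spelling of the trait. Both choices, and the name c-strlen, are assumptions made for this example.

```raku
use NativeCall;

# Hypothetical binding for this sketch: declaring the parameter as Str
# lets NativeCall marshal the Raku string into a C string itself.
sub c-strlen(Str is encoded('utf8') --> size_t) is native is symbol<strlen> { * }

my $string = "X" x 1994;

# No CArray and no manual terminator in user code.
say c-strlen($string);
```

Here the terminator is NativeCall's responsibility instead of something the caller has to remember.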
>
Would you be so kind as to post this as an issue in the documentation, so we
can pick up on it?

Thanks!

JJ
