Thank you Tobias. This is what I was trying to get at, but wasn't sure _how_ to reach that conclusion. You've done so elegantly.
~Paul

On Sat, Jan 18, 2020 at 7:55 AM Tobias Boege <t...@taboege.de> wrote:

> On Sat, 18 Jan 2020, JJ Merelo wrote:
> > The example works perfectly, and it does because it's a string literal
> > which is already 0 terminated. Let's use this code instead of the one that
> > I used in my other mail about this (which you probably didn't read anyway):
> >
> > 8< 8< 8<
> >
> > What does this mean? It means that NativeCall does the right call
> > (badum-tssss) and converts a Raku string literal into a C string literal,
> > inserting the null termination even if we didn't. I actually don't care if
> > it was the NativeCall API or the encode method. It just works. It gets
> > allocated the right amount of memory, it gets passed correctly into the C
> > realm. Just works. Since @array.elems has 3 elements, well, it might be
> > rather the C part the one that does that. But I really don't care, and it
> > does not really matter, and thus the example is correct, no need to add
> > anything else to the documentation. Except maybe "get your C right"
> >
>
> What you say seems to be correct: if you have a string literal in your
> Raku code, this works for me, too. (Sometimes, see below.)
>
> BUT the terminating NUL character is not inserted by NativeCall and it
> isn't inserted by &encode.
> If you run this program which uses a much
> longer string that is not a literal on the face of it:
>
>   use NativeCall;
>   sub c-printf(CArray[uint8]) is native is symbol<printf> { * };
>
>   my $string = "X" x 1994;
>   my $array = CArray[uint8].new($string.encode.list);
>   c-printf $array;
>
> through valgrind, it will warn you about a one-character invalid read,
> that is a byte accessed by printf() which is at offset 0 after a properly
> allocated block of size 1994:
>
>   $ perl6-valgrind-m -MNativeCall -e 'sub c-printf(CArray[uint8]) is native is symbol<printf> { * }; my $string = "X" x 1994; my $array = CArray[uint8].new($string.encode.list); c-printf $array' >/dev/null
>
>   ==325957== Invalid read of size 1
>   ==325957==    at 0x48401FC: strchrnul (vg_replace_strmem.c:1395)
>   ==325957==    by 0x50CD334: __vfprintf_internal (in /usr/lib/libc-2.30.so)
>   ==325957==    by 0x50BA26E: printf (in /usr/lib/libc-2.30.so)
>   ==325957==    by 0x4B58048: ??? (in $rakudo/install/lib/libmoar.so)
>   ==325957==    by 0x1FFEFFFB5F: ???
>   ==325957==    by 0x4B57F81: dc_callvm_call_x64 (in $rakudo/install/lib/libmoar.so)
>   ==325957==    by 0x50BA1BF: ??? (in /usr/lib/libc-2.30.so)
>   ==325957==    by 0xA275E3F: ???
>   ==325957==    by 0x990153F: ???
>   ==325957==    by 0xA275E3F: ???
>   ==325957==    by 0x1FFEFFFB7F: ???
>   ==325957==    by 0x4B578D1: dcCallVoid (in $rakudo/install/lib/libmoar.so)
>   ==325957==  Address 0xb5ebf1a is 0 bytes after a block of size 1,994 alloc'd
>   ==325957==    at 0x483AD7B: realloc (vg_replace_malloc.c:836)
>   ==325957==    by 0x4A9DFDF: expand.isra.3 (in $rakudo/install/lib/libmoar.so)
>   ==325957==    by 0x4A9E6F4: bind_pos (in $rakudo/install/lib/libmoar.so)
>   ==325957==    by 0x4A2C9AF: MVM_interp_run (in $rakudo/install/lib/libmoar.so)
>   ==325957==    by 0x4B2CC24: MVM_vm_run_file (in $rakudo/install/lib/libmoar.so)
>   ==325957==    by 0x109500: main (in $rakudo/install/bin/perl6-m)
>
> This is the NUL byte that happens to be there and terminate our string
> correctly, but nothing in the moarvm process has allocated it, because
> knowing what is allocated and what isn't is valgrind's job. And if it's
> not allocated, then either moarvm routinely writes NULs to memory it
> doesn't own or it simply does not automatically insert a NUL after the
> storage of every CArray[uint8]. And why would it? I for one would not
> expect CArray[uint8] to have special precautions built in for when it's
> used to hold a C string.
>
> Why does it work with a string literal in the Raku code? I don't know,
> but consider the following variations of the code, with my oldish Rakudo:
>
>   - with $string = "X" x 1994:         valgrind sad
>   - with $string = "X" x 4:            valgrind sad
>   - with $string = "X" x 3:            valgrind happy
>   - with $string = "X" x 2:            valgrind happy
>   - with $string a short literal
>     like "FOO":                        valgrind happy
>   - with $string a literal of
>     sufficient length like "FOOO":     valgrind sad
>   - with $string = "FOO" x 2:          valgrind happy
>   - with $string = "FOO" x 200:        valgrind sad
>
> My guess is that if it's sufficiently small and easy, then it is computed
> at compile-time and stored somewhere in the bytecode, for which some
> serialization routine ensures a trailing NUL byte inside an allocated
> region of memory for the process.
>
> That is only a barely informed guess, but independently of what causes
> it to work on short string literals, I strongly believe that this is an
> implementation detail and hence I would call the example in the docs
> misleading. Appending the ", 0" when constructing the CArray[uint8]
> seems like a really neat fix.
>
> Anyway, the most important message of this mail should actually be:
> _Normally_, the way you pass a Raku string to a native function is by
> using the `is encoded` trait. It's such a nice way offered by the
> *language* to tell NativeCall to "do the right thing":
>
>   sub c-printf(Str is encoded<ascii>) is native is symbol<printf> { * }
>
> It encodes and NUL terminates. Far preferable to cooking up your own
> CArray[uint8]. As the documentation mentions, you only need to serialize
> Raku strings to CArray[uint8] if you have to manage their lifetime beyond
> the callee. I highly doubt that the Windows API requires you to do that
> on the ordinary.
>
> Regards,
> Tobias
>
> --
> "There's an old saying: Don't change anything... ever!" -- Mr. Monk
>

--
__________________

:(){ :|:& };:
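
For reference, a minimal sketch of the two approaches discussed in the quoted mail: appending the 0 byte by hand when building the CArray[uint8], and letting the `is encoded` trait do the encoding and termination. The names to-c-string, c-puts-bytes and c-puts-str are hypothetical. puts is used instead of printf so the text is not interpreted as a format string, and the bare `is native` assumes, as in the quoted example, that the C library's symbols are reachable from the running process.

```raku
use NativeCall;

# Hypothetical helper: encode a Raku string and append the terminating
# NUL byte explicitly, instead of relying on one happening to be there.
sub to-c-string(Str $s) {
    CArray[uint8].new($s.encode('utf8').list, 0)
}

# puts(3) from the C library, bound once for a raw byte array ...
sub c-puts-bytes(CArray[uint8]) returns int32 is native is symbol<puts> { * }
# ... and once for a Str that NativeCall encodes for the duration of the call.
sub c-puts-str(Str is encoded('utf8')) returns int32 is native is symbol<puts> { * }

my $string = "FOOO";

# .encode yields only the payload bytes, with no trailing 0.
say $string.encode('utf8').elems;   # 4

# The hand-built array is one element longer: payload plus terminator.
my $array = to-c-string($string);
say $array.elems;                   # 5

c-puts-bytes $array;   # caller owns the array and controls its lifetime
c-puts-str $string;    # usually the simpler choice
```

The explicit array is only worth the trouble when the C side keeps the pointer after the call returns; otherwise the trait version leaves less room for error.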