I know this is utterly and absolutely absurd, but so it goes.

On Fri, 17 Jan 2020 at 23:28, ToddAndMargo (<toddandma...@zoho.com>) wrote:
> Hi JJ,
>
> Please be my hero.
>
> I won't call you any goofy names out of
> affection and friendship, as others will get
> their nickers in a twist.
>
> This is from a previous conversation we had concerning
> the mistake in
>
> https://docs.raku.org/language/nativecall#index-entry-nativecall
>
>     my $string = "FOO";
>     ...
>     my $array = CArray[uint8].new($string.encode.list);
>
>
> Todd:
>     By the way, "C String" REQUIRES a nul at the end:
>     an error in the NativeCall documentation.
>
> JJ:
>     No, it does not. And even if it did, it should better
>     go to the C, not Raku, documentation
>

A C string literal is a C string. It automatically gets null terminated. I already mentioned this link https://stackoverflow.com/questions/8202897/null-terminated-string-in-c, but you don't seem to like and read links, so I copy the good bits here:

"Is it absolutely necessary? *No*, because when you call scanf, strcpy(except for strncpy where you need to manually put zero if it exceeds the size), it copies the null terminator for you. Is it good to do it anyways? *Not really*, it doesn't really help the problem of bufferoverflow since those function will go over the size of the buffer anyways. Then what's the best way? use c++ with std::string."

And another answer:

"Always be careful to allocate enough memory with strings, compare the effects of the following lines of code:

    char s1[3] = "abc";
    char s2[4] = "abc";
    char s3[] = "abc";

All three are considered legal lines of code ( http://c-faq.com/ansi/nonstrings.html), but in the first case, there isn't enough memory for the fourth null-terminated character. s1 will not behave like a normal string, but s2 and s3 will. The compiler automatically count for s3, and you get four bytes of allocated memory. If you try to write "

>
>
> And that would be a "String Literal", which is NOT
> a C String. And C's documentation is precise and
> clear (n1570). It is not their problem.
> It is a mistake in NativeCall's documentation.
>

Did you really read what you wrote? A string literal is not a C string? And you want to add that to NativeCall documentation? You really seem to be driven more by your need to prove you're right than by a genuine wish to improve the documentation. Which, I insist, is for Raku, not for C.

>
> Without the nul at the end, the string is considered
> "undefined".
>

No, it's not. Please read the answer in StackOverflow above. If you use string literals and don't assign enough memory for the null termination, it's going to be undefined. That's the case for s1 above. Again, C stuff. There's enough work as it is now to document the finer points of Raku. Only with the "new" ( = 1 year old) 6.d Raku behavior, there're still almost 100 items to document. Documenting the finer points of C is totally outside the scope. Just read (maybe implicitly) at the beginning of NativeCall "Get your C right" and that's it. I'm not gonna go further than that.

>
> The C guys have been helping me with definitions. Chapter and
> verse would be :
>
> INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x
> Programming languages — C
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
>
> 5.2.1 Character sets
> 7.1.1 Definitions of terms
> 6.7.9,p14 Semantics
>
> Here is your String Literal, which you are confusing
> with a c string:
>
> 6.7.9,p14 states:
>
>     An array of character type may be initialized by a
>     character string literal or UTF−8 string literal,
>     optionally enclosed in braces. Successive bytes
>     of the string literal (including the terminating
>     null character if there is room or if the array
>     is of unknown size) initialize the elements of
>     the array.
>
> So there is the unnecessary nul you speak of, except,
> again, it is not a C String. So you do have a
>

Are you _really_ saying we should put parts of the C standard in the Raku documentation? Again, did you read what you wrote?

> somewhat of a point, although not a good one as
> it will throw those trying to use the docs into
> a state of confusion wondering what is going wrong,
> as it did me.
>

They should have read the (implicit) notice at the beginning of NativeCall: "Get your C right"

> It about killed me to figure it out myself. I
> don't want others to go through the same pain
> over a simple mistake in the documentation.
>

Well, an omission has suddenly been converted into a mistake. And an example which is totally correct (as proven several times) too. We do live in curious times. (Also, I didn't write that)

> Now the C guys told me that the reason why I am not
> getting anywhere with you is that I provided a bad
> example. They proceeded to give me an example
> that precisely shows the careening made by
> the mistake in the documentation:
>
>
> An example of an unterminated C string:
>
>

Well, this example is even worse.

> On 2020-01-17 13:21, Bart wrote:
> >
> > <t2.c>
> > #include <stdio.h>
> >
> > void foo(const char *s)
> > {
> >     for (int i=0; i<10; ++i)
> >         printf("%d ",*s++);
> >     puts("");
> >
> > }
> >
> > int main(void)
> > {
> >     char str[3] = "FOO";
>

Please read the StackOverflow post above (which I have copied, you don't even need to click on the link, don't worry). The behavior of that string is ambiguous _because you didn't allocate enough space for the string_. Not because it's not null-terminated.

> >     foo(str);
> > }
> > </t2.c>
>
>
> To compile:
>     gcc -o t2 t2.c
>
> To Run
>     t2
>
> > This prints the 3 characters codes of F,O,O in the string, plus 7
> > bytes that follow. Results on various compilers are as follows:
> >
>

Ambiguous means it will behave one way some of the time and another way at other times.

So:

> > bcc:       70 79 79 0 0 0 0 0 0 0
                        ^ Here's your null termination
> > tcc:       70 79 79 80 -1 18 0 0 0 0
> >
> > gcc -O0:   70 79 79 112 -14 48 0 0 0 0
> >
> > gcc -O3:   70 79 79 2 0 0 0 0 0 0
> >
> > lcc:       70 79 79 0 0 0 0 0 0 0
                        ^ Here's your null termination
> > dmc:       70 79 79 0 -120 -1 24 0 25 33
                        ^ What's this? Oh, it's null termination all over again
> > clang -O0: 70 79 79 16 6 64 0 0 0 0
> >
> > clang -O2: 70 79 79 0 48 -120 73 6 -19 127
                        ^ O2 is smart enough that it null-terminates it
> > msvc:      70 79 79 -29 -9 127 0 0 0 0
> >
>

Just change that to

    char str[4] = "FOO";

and it's going to be perfectly fine. Documentation is an arcane art, you know. It means you have to be not only inclusive (if you need to), but totally precise. If I had included that example in the NativeCall documentation (which, let me be clear, I will not), it wouldn't have been an example of the need to null-terminate strings (which literals do automatically for you) but of the need to get the size of the strings right. Hey, but I don't need that, because that's already implicit in the subtitle of NativeCall: "Get your C right."

> > The 70, 79, 79 are the F, O and O codes. When those 3 happen to
> > be followed by 0, then it will appear to work.
> >
> > That is 4 out of 9, but the other 5 won't work. What follows
> > after FOO is undefined and could be anything, although a random 0 is
> > common.
> >
>
> JJ! He ran it through NINE C compilers! The careening
> is OBVIOUS!
>

What is obvious is the ambiguous behavior of strings whose size is not declared correctly.

> And this mistake is very easy to fix:
>
> Change
>     my $array = CArray[uint8].new($string.encode.list);
> to
>     my $array = CArray[uint8].new($string.encode.list, 0);
>
> THREE characters `, 0` and it is fixed! And you are
>

The example works perfectly, and it does because it's a string literal which is already 0 terminated.

Let's use this code instead of the one that I used in my other mail about this (which you probably didn't read anyway):

```
#include <stdio.h>

void set_foo ( const char *foo) {
  printf("Printed directly %s\n", foo);
  for (int i=0; i<5; ++i)
    printf("%d ",*foo++);
}
```

The Raku part will be the same:

```
use NativeCall;

my $string = "FOO";
my $array = CArray[uint8].new($string.encode.list);
say $array.elems;

sub set_foo(CArray[uint8]) is native('const-char') { * }

set_foo( $array );
```

This prints:

```
3
Printed directly FOO
70 79 79 0 0
```

What does this mean? It means that NativeCall does the right call (badum-tssss) and converts a Raku string literal into a C string literal, inserting the null termination even if we didn't. I actually don't care if it was the NativeCall API or the encode method. It just works. It gets allocated the right amount of memory, it gets passed correctly into the C realm. Just works. Since $array.elems is 3, well, it might rather be the C part that does that. But I really don't care, and it does not really matter, and thus the example is correct, no need to add anything else to the documentation. Except maybe "get your C right"

> are finally conforming to n1570. And you will be
> my ever living hero! (Watch some take offense to that!)
>
> Have I still not convinced you? THREE CHARACTERS !!!!
>
> Sorry for being such a pest about this. It about killed
> me to figure out.
>
> Please be my hero.
>

Again, there's nothing to change in the documentation, which, besides, for starters, was simple pseudo-code which didn't compile anyway. But you can check out my "keepers", JJ/my-perl6-examples, for those examples above, and many more.

Cheers

JJ