Charusso added a comment. In D70411#1776460 <https://reviews.llvm.org/D70411#1776460>, @NoQ wrote:
> Ok, so, what should the checker warn on? What should be the intended state > machine for this checker? > > We basically have two possible categories of answers to this question: > > - "The checker warns every time it finds an execution path on which > unintended behavior occurs" - describe unintended behavior (say, > "out-of-bounds buffer access") and define the checker in such a way that all > warnings describe actual bugs of this kind (in our example, out-of-bound > buffer accesses) as long as the rest of the static analyzer works correctly. > - Or, you could say "The checker enforces a certain coding guideline on the > user" - and then you should describe the guideline in a very short and > easy-to-understand manner (say, "don't pass the string by pointer anywhere > before you null-terminate it"), and then make checker check exactly this > guideline. You may introduce some suppressions for intentional violations of > the guideline, but the guideline itself should be simple, it should be always > possible to follow it, and it should sound nice enough for developers to feel > good about themselves when they follow it. If I am right you are thinking about warn on the unsafe function calls which is the first case. I did that on by default. The idea is that, in a security manner we cannot allow hand-waving fixes. To measure stuff we could introduce options which serves the second case. I have added an option to suppress that common null-termination-by-hand idiom, so that I do not recommend anything. The CERT rules are not recommendations so I think now the warning is fine. > Yup, i mean, you can go as far as you want with generalizing over "fixed size > buffer" in this approach, but what i'm saying is that you're still not > addressing the whole train of thought about the length of the source string. I see, and you are right, but we are getting closer and closer to catch the most of. > In D70411#1773703 <https://reviews.llvm.org/D70411#1773703>, @Charusso wrote: > >> > F10967277: ftp.c_8cb07866de61ef56be82135a4d3a5b7e.plist.html >> > <https://reviews.llvm.org/F10967277> >> > The source string is an IP address and port, which has a known limit on >> > the number of digits it can have. >> >> The known size does not mean that the string is going to be null-terminated. > > > It does, because `strcpy` is guaranteed to null-terminate as long as it has > enough storage capacity. It does not, because the storage capacity is arbitrary. If we would be sure the **copied stuff's length** cannot be larger than the destination's arbitrary capacity, we would not warn. There are tests for that case. > Checks that are part of the generic taint checker are currently in a pretty > bad shape, but the taint analysis itself is pretty good, and checks that rely > on taint but aren't part of the generic taint checker are also pretty good. I > actually believe taint analysis could be enabled by default as soon as > somebody goes through the broken checks and disables/removes them. If you > rely on the existing taint analysis infrastructure and make a good check, > that'll be wonderful and would further encourage us to move taint analysis > out of alpha. I think it is too far away, sadly. I like the idea because it would be the root of security checkers, but I am mostly interested in data-flow analysis. ================ Comment at: clang/test/Analysis/cert/str31-c-fp-suppression.cpp:51-53 + strcpy(dest, src); + do_something(dest); + // expected-warning@-1 {{'dest' is not null-terminated}} ---------------- NoQ wrote: > Charusso wrote: > > NoQ wrote: > > > Looks like a false positive. Consider the following perfectly correct > > > (even if a bit inefficient) code that only differs from the current code > > > only in names and context: > > > ```lang=c++ > > > void copy_and_test_if_short(const char *src) { > > > char dest[128]; > > > strcpy(dest, src); > > > warn_if_starts_with_foo(dest); > > > } > > > > > > void copy_and_test(const char *src) { > > > if (strlen(src) < 64) > > > copy_and_test_if_short(src); > > > else > > > copy_and_test_if_long(src); > > > } > > > ``` > > That is why I have asked whether we have a way to say "when the string is > > read". I like that heuristic for now, because any kind of not > > null-terminated string is the root of the evil for [[ > > https://wiki.sei.cmu.edu/confluence/display/c/STR31-C.+Guarantee+that+storage+for+strings+has+sufficient+space+for+character+data+and+the+null+terminator > > | STR31-C ]]. > > > > In this case the attacker is the programmer who sends a possibly not > > null-terminated string around and the function which receives an arbitrary > > input would carry the issue. > > > > Thanks for the example, I wanted to cover ranges, but I have forgotten it, > > because in the wild peoples do not check for bounds. Copy-pasted. > Note that function `copy_and_test` in my example would not be necessarily > visible to you. It may be in another translation unit. > > So in order to "cover ranges" you'll have to discard all cases where the > length is completely unknown. That unknowness idea is cool, I will leave that comment as not `Done`. ================ Comment at: clang/test/Analysis/cert/str31-c-notes.cpp:22 + if (gets(buf)) {} + // expected-note@-1 {{'gets' could write outside of 'buf'}} + // expected-note@-2 {{Assuming the condition is false}} ---------------- NoQ wrote: > Charusso wrote: > > NoQ wrote: > > > I still believe that this should be *the* warning in this code. This is > > > already broken code. > > No, it is not broken, sadly. I would say it is broken, but this checker > > tries to be smarter than `grep`. Truncation is fine for [[ > > https://wiki.sei.cmu.edu/confluence/display/c/STR31-C.+Guarantee+that+storage+for+strings+has+sufficient+space+for+character+data+and+the+null+terminator > > | STR31-C ]]. > > > > That change in my mindset made this checker possible when I have dropped > > the idea of warn on function calls. > > `Truncation is fine for STR31-C.` > This isn't just truncation, this is buffer overflow, and it's definitely not > fine if we want to "Guarantee that storage for strings has sufficient space > for character data". > > This is absolutely the right place to simply warn on every invocation of > `gets()`. I have found out we have another recommendation for that type of issues: https://wiki.sei.cmu.edu/confluence/display/c/STR03-C.+Do+not+inadvertently+truncate+a+string Here the truncation is a truncation, but in a security manner we cannot rely on feelings, so we warn from now. ================ Comment at: clang/test/Analysis/cert/str31-c-notes.cpp:26-27 + + free(buf); + // expected-warning@-1 {{'buf' is not null-terminated}} + // expected-note@-2 {{'buf' is not null-terminated}} ---------------- NoQ wrote: > Charusso wrote: > > NoQ wrote: > > > It is not wrong to free a buffer that isn't null-terminated. We shouldn't > > > warn here. > > That is when the string is read. We could have a list of function calls > > which are non-problematic. For now I am fine with that note, because you do > > not allocate something without reading it before freeing it. It is just an > > arbitrary test for now. > > That is when the string is read. > > That strikes me as an example of [[ > https://en.wikipedia.org/wiki/Law_of_the_instrument | the Maslow's hammer ]]. > The static analyzer isn't great at guessing whether the function reads from > the string or not. For a dynamic analysis it wouldn't have been a problem. > > That said, maybe you could get away with discriminating between `char *` and > `const char *`. If the string is passed in by const pointer, there's nothing > you can do except read from it. If it's passed in by non-const pointer, you > can conservatively assume that it isn't read. Cool idea, thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70411/new/ https://reviews.llvm.org/D70411 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits