I've been going through old Linux kernel CVEs, trying to prototype some possible new warnings for -fanalyzer in GCC 12 (and, alas, finding places where the analyzer internals need work...)
I think I want a way for the user to be able to mark security boundaries in their code, for example:

* in the Linux kernel, the boundary between untrusted user-space data and kernel data, or
* for a user-space daemon, the boundary between data coming from the network and the data of the daemon itself

The analyzer could then make use of this, for example:

(a) marking untrusted incoming data as "tainted" and prioritizing analysis of paths that make use of it (e.g. a "might overflow a buffer when N is really large" warning goes from being a noisy false positive when we simply have no knowledge of N (or the buffer's size) to being a serious issue if N is under the control of an attacker)

(b) copying uninitialized data back to the untrusted region becomes a potential disclosure of sensitive information

I think I also want a way to mark system calls and ioctl implementations, so that I can mark all of the parameters as being potentially hostile.

Specifically, the Linux kernel uses functions like this:

  #define __user
  extern long copy_to_user(void __user *to, const void *from, unsigned long n);
  extern long copy_from_user(void *to, const void __user *from, long n);

in various places, so I want a way to mark the "to" and "from" params as being a security boundary.

I've been experimenting with implementing (b) for CVE-2011-1078 (in which a copy_to_user is passed a pointer to an on-stack buffer that isn't fully initialized, hence a disclosure of information to user-space).

Martin: I believe you added __attribute__((access)) in GCC 10.
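For reference, here is a rough user-space sketch of the attribute as it exists today, using only the documented read_only and write_only modes, together with the kind of copy-out that (b) is concerned with. The names mock_copy_to_user, struct reply and send_reply are hypothetical stand-ins made up for illustration; they are not the kernel's definitions and not the code from the CVE.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in: in the kernel, __user is a sparse annotation.  */
#define __user

/* Existing access modes only: "to" is written and "from" is read, each
   for up to n bytes (parameter 3).  Nothing here says which side of the
   security boundary either pointer is on.  */
__attribute__((access (write_only, 1, 3), access (read_only, 2, 3)))
long
mock_copy_to_user (void __user *to, const void *from, unsigned long n)
{
  assert (to != NULL && from != NULL);
  memcpy (to, from, n);   /* user-space stand-in for the real copy */
  return 0;               /* 0 bytes left uncopied */
}

/* Hypothetical reply structure; on most ABIs there is padding after
   "tag", and "name" is rarely filled completely.  */
struct reply
{
  char tag;
  int value;
  char name[8];
};

/* Fill in an on-stack reply and copy all of it out.  The memset is the
   step whose absence causes the kind of leak described in (b): without
   it, the padding and the unused tail of "name" would carry stale stack
   contents across the boundary.  */
long
send_reply (void __user *dst)
{
  struct reply r;
  memset (&r, 0, sizeof r);
  r.tag = 'r';
  r.value = 42;
  strcpy (r.name, "abc");
  return mock_copy_to_user (dst, &r, sizeof r);
}
```

With only read_only and write_only available, the declaration cannot tell the analyzer that "to" points at untrusted memory, so a missing memset in send_reply looks no different from any other partially initialized read.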
I was thinking of extending it to allow something like:

  #define __user

  extern long copy_to_user(void __user *to, const void *from, unsigned long n)
    __attribute__((access (untrusted_write, 1, 3),
                   access (read_only, 2, 3)
                   ));

  extern long copy_from_user(void *to, const void __user *from, long n)
    __attribute__((access (write_only, 1, 3),
                   access (untrusted_read, 2, 3)
                   ));

so that "to" and "from" are marked as being writes and reads of up to size n, but they are flagged as "untrusted" as appropriate, so the analyzer can pay particular attention as described above.

Does the above idea sound like a sane extension of the access attribute? I tried implementing it, but "access" seems to get converted to its own microformat for expressing these things as strings (created via append_access_attr, and parsed in e.g. init_attr_rdwr_indices), which seems to make it much harder than I was expecting.

Any thoughts about how to mark system calls/ioctls? The simplest would be an attribute that marks all parameters as being untrusted, and the return value, somehow.

Thanks
Dave