Hello,

This patch series is a WORK-IN-PROGRESS towards porting the LLVM hardware
address sanitizer (HWASAN) to GCC.  The document describing HWASAN can be
found here:
http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html

The current patch series is far from complete, but I'm posting the current
state to provide something to discuss at the Cauldron next week.

In its current state, this sanitizer only works on AArch64, and only with a
custom kernel that allows tagged pointers in system calls.  The kernel
requirement is discussed at the link below:
https://source.android.com/devices/tech/debug/hwasan

I have also not yet put tests into the DejaGNU framework, but instead have a
simple test file from which the tests will eventually come.  That test file
is attached to this email despite not being in the patch series.

Something close to this patch series bootstraps and passes most regression
tests when ~--with-build-config=bootstrap-hwasan~ is used.  The regression
tests it doesn't pass are all the other sanitizer tests and all linker plugin
tests.  The linker plugin tests fail due to a configuration problem where the
library path is not correctly set.
(I say "something close to this patch series" because I recently made a
change that breaks bootstrap but that I believe is the best approach once
I've fixed it, hence for an RFC I'm leaving it in).

HWASAN works by storing a tag in the top bits of every pointer and a colour
in a shadow memory region corresponding to every area of memory.  On every
memory access through a pointer the tag in the pointer is checked against the
colour in shadow memory corresponding to the memory the pointer is accessing.
If the tag and colour do not match then a fault is signalled.

The instrumentation required for this sanitizer has a large overlap with the
instrumentation required for implementing MTE (which has similar
functionality, but the checks are done automatically in hardware, and the
instructions for colouring shadow memory and for managing tags are provided
by the architecture).
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-a-profile-architecture-2018-developments-armv85a

We hope to use the HWASAN framework to implement MTE tagging on the stack,
and hence I have a "dummy" patch demonstrating the approach envisaged for
this.  Though there is still much to implement here, the general approach
should be clear.

Any feedback is welcome, but there are three main points on which I'm
particularly hoping for external opinions.

1) The current approach stores a tag on the RTL representing a given
   variable.  In order to implement HWASAN for x86_64 the tag needs to be
   removed before every memory access, but not on things like function
   calls.
   Is there any obvious way to handle removing the tag in these places?
   Maybe something with legitimize_address?

2) The first draft presented here introduces a new RTL expression called
   ADDTAG.  I now believe that a hook would be neater here but haven't yet
   looked into it.  Do people agree?
   (addtag is introduced in the patch titled "Put tags into each stack
   variable pointer", but the reason it's introduced is so the backend can
   define how this gets implemented with a ~define_expand~, and that's only
   needed for the MTE handling as introduced in "Add in MTE stubs")

3) I have not yet given much thought to the command line arguments for this
   patch series.
   I personally quite like the idea of having ~-fsanitize=hwaddress~ turn on
   "checking memory tags against shadow memory colour", and MTE being just a
   hardware acceleration of this ability.
   I suspect this idea wouldn't be liked by all and would like to hear some
   opinions.

Thanks,
Matthew

#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdint.h>

struct two_values
{
  int left;
  int right;
};

struct big_struct
{
  int left;
  int right;
  int big_array[100];
};

struct bitmapped_struct
{
  unsigned one : 1;
  unsigned two : 1;
  unsigned three : 1;
  unsigned four : 1;
  unsigned five : 1;
  unsigned six : 1;
  unsigned seven : 1;
  unsigned eight : 1;
};

uint8_t
tag_of (void * x)
{
  return ((uintptr_t)x) >> 56;
}

/* Tests of nested functions are:
    0) Accessing closed over variables works.
    1) Accesses outside of variables is caught.
    2) Accessing variable out of scope is caught.
 */
int __attribute__ ((noinline))
intermediate (void (*f) (int, uint8_t), uint8_t num)
{
  if (num == 1)
    f (20, 100);
  else
    f (3, 100);
  /* Just return something ... */
  return num % 3;
}

int* __attribute__ ((noinline))
nested_function (uint8_t num)
{
  int big_array[16];
  int other_array[16];
  void store (int index, uint8_t value)
    { big_array[index] = value; }
  return &other_array[intermediate (store, num)];
}

int __attribute__ ((noinline))
test_nested (uint8_t check_mode)
{
  assert (check_mode < 3);
  int *retval = nested_function (check_mode);
  if (check_mode == 0)
    return 0;
  else if (check_mode == 2)
    *retval = 100;
  /* For any check_mode other than 0 we should have caught an error.  */
  return 1;
}

#include <setjmp.h>
#include <stdio.h>

/* Testing longjmp/setjmp should test.
    0) Nothing special happens with the jmp_buf.
    1) Accesses to scopes jmp'd over are caught.
 */
int __attribute__ ((noinline))
uses_longjmp (int **other_array, int num, jmp_buf env)
{
  int internal_array[100] = {0};
  *other_array = &internal_array[0];
  if (num % 2)
    longjmp (env, num);
  else
    return num % 8;
}

int __attribute__ ((noinline))
uses_setjmp (int num)
{
  int big_array[100];
  int *other_array = NULL;
  sigjmp_buf cur_env;
  int temp = 0;
  if ((temp = sigsetjmp (cur_env, 1)) != 0)
    {
      if (other_array != NULL)
	printf ("Value pointed to in other_array[0]: %d\n", other_array[0]);
      else
	puts ("other_array was not initialised!");
      printf ("You gave %d arguments.\n", temp);
      return 10;
    }
  else
    {
      return uses_longjmp (&other_array, num, cur_env);
    }
}

int __attribute__ ((noinline))
test_longjmp (uint8_t check_mode)
{
  assert (check_mode < 2);
  if (check_mode)
    {
      uses_setjmp (1);
      return 1;
    }
  else
    nested_function (0);
  return 0;
}

/* Basic tests for stack tagging.
    0) Accesses outside of a variable crash.
    1) Valid accesses work.
 */
int __attribute__ ((noinline))
accessing_pointers (int *left, int *right)
{
  int x = right[2];
  left[3] = right[1];
  return right[1] + left[2];
}

int __attribute__ ((noinline))
using_stack (int num)
{
  int big_array[10];
  int other_array[20];
  accessing_pointers(other_array, big_array);
  return big_array[num];
}

int __attribute__ ((noinline))
test_basic_stack (uint8_t check_mode)
{
  assert (check_mode < 2);
  if (check_mode)
    {
      using_stack (5);
      return 0;
    }
  else
    using_stack (17);
  return 1;
}

/* Tests for alloca are:
    0) alloca is given a different tag to other variables.
    1) entire alloca array is accessible with pointer returned.
    2) Outside of alloca array is not accessible (once cross 16 byte alignment).
 */
#include <alloca.h>

int __attribute__ ((noinline))
check_alloca (int num)
{
  int *allocd_array = alloca (num * sizeof(int));
  int other_array[10];
  if (num % 2)
    {
      return allocd_array[num + 2];
    }
  return other_array[12];
}

int __attribute__ ((noinline))
alloca_different_tag (int num)
{
  struct two_values tmp_object = {
      .left = 100,
      .right = num,
  };
  int *big_array = alloca (num * sizeof (int));
  int other_array[100];

  uint8_t first_tag = tag_of (&tmp_object);
  uint8_t second_tag = tag_of (big_array);
  uint8_t other_tag = tag_of (other_array);

  assert (first_tag != second_tag);
  assert (second_tag != other_tag);
  assert (first_tag != other_tag);
  return 0;
}

int __attribute__ ((noinline))
using_alloca (int num)
{
  int retval = 0;
  int *big_array = alloca (num * sizeof (int));
  for (int i = 0; i < num; ++i)
    {
      retval += big_array[i];
    }
  return retval;
}

int __attribute__ ((noinline))
test_alloca (uint8_t check_mode)
{
  assert (check_mode < 3);
  if (check_mode == 1)
    using_alloca (16);
  else if (check_mode == 0)
    alloca_different_tag (check_mode);
  else
    {
      check_alloca (3);
      return 1;
    }
  return 0;
}

/* Tests for variable arrays are:
    SEE ABOVE REQUIREMENTS (alloca).
 */
#include <alloca.h>

int __attribute__ ((noinline))
check_vararray (int num)
{
  int var_array[num];
  int other_array[10];
  if (num % 2)
    {
      return var_array[num + 2];
    }
  return other_array[12];
}

int __attribute__ ((noinline))
vararray_different_tag (int num)
{
  struct two_values tmp_object = {
      .left = 100,
      .right = num,
  };
  int big_array[num];
  int other_array[100];

  uint8_t first_tag = tag_of (&tmp_object);
  uint8_t second_tag = tag_of (big_array);
  uint8_t other_tag = tag_of (other_array);

  assert (first_tag != second_tag);
  assert (second_tag != other_tag);
  assert (first_tag != other_tag);
  return 0;
}

int __attribute__ ((noinline))
using_vararray (int num)
{
  int retval = 0;
  int big_array[num];
  for (int i = 0; i < num; ++i)
    {
      retval += big_array[i];
    }
  return retval;
}

int __attribute__ ((noinline))
test_vararray (uint8_t check_mode)
{
  assert (check_mode < 3);
  if (check_mode == 1)
    using_vararray (16);
  else if (check_mode == 0)
    vararray_different_tag (check_mode);
  else
    {
      check_vararray (3);
      return 1;
    }
  return 0;
}

/* Tests for RVO
    0) The value is accessible in the given function.
    1) RVO does happen.
    2) The pointer for both caller and callee are the same.
 */
struct big_struct __attribute__ ((noinline))
return_on_stack()
{
  struct big_struct x;
  x.left = 100;
  x.right = 20;
  x.big_array[10] = 30;
  return x;
}

struct big_struct __attribute__ ((noinline))
unnamed_return_on_stack()
{
  return (struct big_struct){
      .left = 100,
      .right = 20,
      .big_array = {0}
  };
}

/* Misc tests that I added each time I made a mistake in the implementation
   and broke something.
 */
int __attribute__ ((noinline))
two_items_on_stack (int num)
{
  int left_array[100];
  int right_array[200];
  return left_array[num] + right_array[num];
}

int __attribute__ ((noinline))
stack_object_use(int direction)
{
  struct two_values stack_object = {
      .left = 1,
      .right = 2,
  };
  int *ptr = (int *)0;
  if (direction)
    {
      ptr = &stack_object.left;
    }
  else
    {
      ptr = &stack_object.right;
    }
  *ptr = direction;
  if (direction & 2)
    {
      return stack_object.right;
    }
  else
    {
      return stack_object.left;
    }
}

struct two_values __attribute__ ((noinline))
basic_stack_object (int direction)
{
  struct two_values x = {
      .left = 1,
      .right = 2,
  };
  x.left += direction;
  return x;
}

int __attribute__ ((noinline))
set_value_to_random(int *value_p)
{
  int big_array[100];
  *value_p = big_array[10];
  return 1;
}

/* Ensure my instrumentation doesn't crash on unaligned access (an older
   version with a completely different approach didn't handle this well).  */
int __attribute__ ((noinline))
handle_unaligned_access (struct bitmapped_struct *foo)
{
  if (foo->three)
    return foo->four;

  foo->five = 1;
  return 1;
}

/* Handling large aligned variables.
   Large aligned variables take a different code-path through
   expand_stack_vars in cfgexpand.c.  This testcase is just to exercise that
   code-path.

   The alternate code-path produces a second base-pointer through some
   instructions emitted in the prologue.

   Test cases are:
    0) Valid access works without complaint.
    1) Invalid access is caught.  */
int __attribute__ ((noinline))
handle_large_alignment (int num)
{
  int other_array[10];
  int big_array[100] __attribute__ ((aligned (32)));
  return big_array[num] + other_array[num];
}

int __attribute__ ((noinline))
test_large_alignment (uint8_t check_mode)
{
  assert (check_mode < 2);
  if (check_mode)
    {
      handle_large_alignment (11);
      return 1;
    }
  else
    handle_large_alignment (1);
  return 0;
}

/* {{{ This is a reduced testcase from an ICE a while ago.
   At the time I was modifying the GIMPLE and made an assumption that all
   statements used all their operands.
   This test is redundant now, but there's no harm in still compiling it.  */
#include <string.h>
#include <stdint.h>

const short DPD2BIN[1024] = {0};
const char BIN2CHAR[4001] = {0};

char *
decimal64ToString(char *string)
{
  char *c;			/* output pointer in string */
  const uint8_t *u;		/* work */
  int32_t dpd;			/* .. */
  u=(const uint8_t *)&BIN2CHAR[DPD2BIN[dpd]*4];
  if (string[3]) {memcpy(c, u+1, 4); c+=3;}
  else if (string[4]) {memcpy(c, u+4-*u, 4); c+=string[4];}
  return string;
}

int __attribute__ ((noinline))
variables_same_partition ()
{
  int ret = 0;
  {
    struct bitmapped_struct orig;
    ret = handle_unaligned_access (&orig);
  }
  {
    struct bitmapped_struct other;
    ret += handle_unaligned_access (&other);
  }
  return ret;
}
/* }}} */

/* {{{ Unrecognisable insn problem.
   This code is reduced from something in the bootstrap that triggered a
   problem where my backend patterns didn't handle very large offsets.  */
#ifndef ASIZE
# define ASIZE 0x10000000000UL
#endif

#include <limits.h>

#if LONG_MAX < 8 * ASIZE
# undef ASIZE
# define ASIZE 4096
#endif

extern void abort (void);

int __attribute__((noinline))
foo (const char *s)
{
  if (!s)
    return 1;
  if (s[0] != 'a')
    abort ();
  s += ASIZE - 1;
  if (s[0] != 'b')
    abort ();
  return 0;
}

int (*fn) (const char *) = foo;

int __attribute__((noinline))
bar (void)
{
  char s[ASIZE];
  s[0] = 'a';
  s[ASIZE - 1] = 'b';
  foo (s);
  foo (s);
  return 0;
}

int __attribute__((noinline))
baz (long i)
{
  if (i)
    return fn (0);
  else
    {
      char s[ASIZE];
      s[0] = 'a';
      s[ASIZE - 1] = 'b';
      foo (s);
      foo (s);
      return fn (0);
    }
}
/* }}} */

/* {{{ Non-constant sizes.
   Code designed to store SVE registers on the stack.
   This is needed to exercise the poly_int64 handling for HWASAN and MTE
   instrumentation.  */
int u;

void
foo_sve (int *p)
{
  int i;
  #pragma omp for simd lastprivate(u) schedule (static, 32)
  for (i = 0; i < 1024; i++)
    u = p[i];
}

void
bar_sve (int *p)
{
  int i;
  #pragma omp taskloop simd lastprivate(u)
  for (i = 0; i < 1024; i++)
    u = p[i];
}
/* }}} */

/* An old RTL testcase.
   This was originally introduced since it was produced somewhere in
   bootstrap and the compiler ICE'd on it.
   It's only relevant for MTE, since for HWASAN the ADDTAG pattern is
   converted into a plain ADD at expand time.
   Moreover, this pattern only works for MTE, since there is no define_insn
   matching ADDTAG patterns when MTE is not available.  */
// // Apparently clang defines the __GNUC__ macro.
// // That's a little annoying, but the below works for what I need.
// #ifndef __clang__
// int __RTL (startwith ("final"))
// bootstrap_problem ()
// {
//   (function "bootstrap_problem"
//     (insn-chain
//       (block 2
//         (edge-from entry (flags "FALLTHRU"))
//         (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
//         (cinsn 90 (set (reg:DI x0)
//                        (addtag:DI (reg:DI x0)
//                                   (const_int -16)
//                                   (const_int 1))))
//         (edge-to exit (flags "FALLTHRU"))
//       ) ;; block 2
//     ) ;; insn-chain
//   ) ;; function "bootstrap_problem"
// }
// #endif

int __attribute__((noinline))
very_large_offset (int *p)
{
  char init_array[(uint64_t)0xfefefef];
  char other_array[(uint64_t)0xfefefef];
  return (int)init_array[p[1]] + (int)other_array[p[0]];
}

void __attribute__((noinline))
just_abort()
{
  abort();
}

/* This test is only around for observing what the compiler does, not for
   checking the HWASAN implementation works properly.
   HWASAN and MTE do nothing special for varargs.  */
void __attribute__((noinline))
varargs_function (size_t numargs, ...)
{
  va_list args;
  va_start (args, numargs);
  while (numargs > 0)
    {
      char *x = va_arg (args, char *);
      printf ("Address of next argument is: %p\n", &x);
      numargs -= 1;
    }
  va_end (args);
}

/* Functions to observe that HWASAN instruments memory builtins in the
   expected manner.  */
void * __attribute__((noinline))
memset_builtin (void *dest, int value, size_t len)
{
  return __builtin_memset (dest, value, len);
}

/* HWASAN avoids strlen because it doesn't know the size of the memory access
   until *after* the function call.
 */
size_t __attribute__ ((noinline))
strlen_builtin (char *element)
{
  return __builtin_strlen (element);
}

/* Here to check that the ASAN_POISON stuff works just fine.
   This mechanism isn't very often used, but I should at least go through the
   code-path once in my testfile.  */
int __attribute__ ((noinline))
access_outside_of_scope ()
{
  int *ptr = 0;
  {
    int a;
    ptr = &a;
    *ptr = 12345;
  }
  return *ptr;
}

int
main (int argc, char *argv[])
{
  printf("outside-scope return value: %d\n", access_outside_of_scope ());
  printf("Returned value of %d\n", uses_setjmp (argc));
  printf ("Value is %d\n", handle_large_alignment (argc));
  printf ("Value is %d\n", check_alloca (argc));
  using_alloca (argc);

  int left[100] = {0};
  memset_builtin (left, 1, 100);

  char *mystring = "hello world!\n";
  /* strlen_builtin returns a size_t, so print it with %zu.  */
  printf ("Length of string is: %zu\n", strlen_builtin (mystring));
  varargs_function (4, "hello there", (char)1, (char)2);

  int right[100] = {0};
  accessing_pointers (left, right);
  using_stack(argc);

  int num = argc;
  two_items_on_stack(num);
  int direction = argc;
  stack_object_use(direction);

  int *num_addr = &num;
  set_value_to_random(num_addr);

  struct big_struct x;
  x = return_on_stack();
  struct big_struct y;
  y = unnamed_return_on_stack();
  printf("Value is %d ... %d\n", x.left, y.left);
  return 0;
}

/* {{{ Simply checking that HWASAN-caught failures behave sensibly when run
   under a multithreaded environment.
 */
// #include <pthread.h>
// #include <stdio.h>
// #include <string.h>
// #include <stdarg.h>
// #include <stdbool.h>
// #include <stdint.h>
//
// void *
// failing_thread_function (void *argument)
// {
//   void * other = (void *)((uint64_t)argument & 0xffffffffffffffULL);
//   int *num = argument;
//   printf ("(should succeed): first number = %d\n", num[0]);
//   printf ("(now should fail):");
//   fflush (stdout);
//
//   int *othernum = other;
//   printf (" second number = %d\n", othernum[0]);
//   return (void *)1;
// }
//
// void *
// failing_from_stack (void * argument)
// {
//   int internal_array[16] = {0};
//   printf ("(now should fail):");
//   fflush (stdout);
//   printf (" problem number is %d\n", internal_array[17]);
//   return (void *)1;
// }
//
// void *
// pthread_stack_is_cleared (void *argument)
// {
//   (void)argument;
//   int internal_array[16] = {0};
//   return (void*)internal_array;
// }
//
// void *
// successful_thread_function (void * argument)
// {
//   int *deref = (int *)argument;
//   printf ("(should be fine): sum of first two numbers is %d\n",
//           deref[0] + deref[1]);
//   return (void *)0;
// }
//
// int
// main (int argc, char **argv)
// {
//   int argument[100] = {0};
//   argument[1] = 10;
//   void *(*thread_function) (void *) = NULL;
//   bool access_variable = false;
//   if (argc < 2 || strcmp (argv[1], "success") == 0)
//     thread_function = successful_thread_function;
//   else if (strcmp (argv[1], "external-fail") == 0)
//     thread_function = failing_thread_function;
//   else if (strcmp (argv[1], "internal-fail") == 0)
//     thread_function = failing_from_stack;
//   else
//     {
//       thread_function = pthread_stack_is_cleared;
//       access_variable = true;
//     }
//
//   pthread_t thread_index;
//   pthread_create (&thread_index, NULL, thread_function, (void*)argument);
//
//   void *retval;
//   pthread_join (thread_index, &retval);
//
//   if (access_variable)
//     {
//       printf ("(should fail): ");
//       fflush (stdout);
//       printf ("value left in stack is: %d\n", ((int *)retval)[0]);
//     }
//
//   return (int)retval;
// }
/*
}}} */