On 11/12/19 4:21 pm, Daniel Axtens wrote:
> Hi Balbir,
> 
>>> +Discontiguous memory can occur when you have a machine with memory spread
>>> +across multiple nodes. For example, on a Talos II with 64GB of RAM:
>>> +
>>> + - 32GB runs from 0x0 to 0x0000_0008_0000_0000,
>>> + - then there's a gap,
>>> + - then the final 32GB runs from 0x0000_2000_0000_0000 to 0x0000_2008_0000_0000
>>> +
>>> +This can create _significant_ issues:
>>> +
>>> + - If we try to treat the machine as having 64GB of _contiguous_ RAM, we would
>>> +   assume that ran from 0x0 to 0x0000_0010_0000_0000. We'd then reserve the
>>> +   last 1/8th - 0x0000_000e_0000_0000 to 0x0000_0010_0000_0000 as the shadow
>>> +   region. But when we try to access any of that, we'll try to access pages
>>> +   that are not physically present.
>>> +
>>
>> If we reserved memory for KASAN from each node (discontig region), we might
>> survive this, no? Maybe we need NUMA-aware KASAN? That might be a generic
>> change, just thinking out loud.
> 
> The challenge is that - AIUI - in inline instrumentation, the compiler
> doesn't generate calls to things like __asan_loadN and
> __asan_storeN. Instead it uses -fasan-shadow-offset to compute the
> checks, and only calls the __asan_report* family of functions if it
> detects an issue. This also matches what I can observe with objdump
> across outline and inline instrumentation settings.
> 
> This means that for this sort of thing to work we would need to either
> drop back to out-of-line calls, or teach the compiler how to use a
> nonlinear, NUMA aware mem-to-shadow mapping.

Yes, out of line is expensive, but seems to work well for all use cases.
BTW, the current set of patches just hangs if I try to make out of line
the default mode.
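
For reference, a minimal userspace sketch of the kind of check that inline
instrumentation is described as emitting above: a fixed, linear
mem-to-shadow computation followed by a comparison against the shadow byte.
The offset value, the toy shadow array and the function names are
assumptions made for illustration, not what the compiler or this series
actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3	/* one shadow byte covers 8 bytes of memory */
/* Arbitrary offset, for illustration only. */
#define SHADOW_OFFSET UINT64_C(0x100000000000)

/* The linear mapping: there is no hook in which to special-case a node. */
static inline uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Toy shadow for a 64-byte "memory"; 0 means all 8 bytes are accessible. */
#define FAKE_MEM_SIZE 64
static int8_t shadow[FAKE_MEM_SIZE >> SHADOW_SCALE_SHIFT];

/*
 * Roughly what an inlined check decides for an access of 1 to 8 bytes that
 * does not cross an 8-byte granule. A real build would branch to an
 * __asan_report* function when this returns true.
 */
static bool access_is_bad(size_t offset, size_t size)
{
	assert(size >= 1 && (offset & 7) + size <= 8);
	assert(offset + size <= FAKE_MEM_SIZE);

	int8_t s = shadow[offset >> SHADOW_SCALE_SHIFT];

	if (s == 0)
		return false;
	/* s in 1..7: only the first s bytes are valid; s < 0: poisoned. */
	return (int8_t)((offset & 7) + size - 1) >= s;
}
```

The shadow byte is loaded before anything is reported, which is why a
shadow address with no backing page faults before any report function can
run.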


> 
> I'll document this a bit better in the next spin.
> 
>>> +   if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
>>> +           kasan_memory_size =
>>> +                   ((phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN << 20);
>>> +
>>> +           if (top_phys_addr < kasan_memory_size) {
>>> +                   /*
>>> +                    * We are doomed. Attempts to call e.g. panic() are
>>> +                    * likely to fail because they call out into
>>> +                    * instrumented code, which will almost certainly
>>> +                    * access memory beyond the end of physical
>>> +                    * memory. Hang here so that at least the NIP points
>>> +                    * somewhere that will help you debug it if you look at
>>> +                    * it in qemu.
>>> +                    */
>>> +                   while (true)
>>> +                           ;
>>
>> Again with the right hooks in check_memory_region_inline() these are
>> recoverable, or so I think
> 
> So unless I misunderstand the circumstances in which
> check_memory_region_inline is used, this isn't going to help with inline
> instrumentation.
> 

Yes, I understand. Same as above?
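
To make the numbers concrete, here is a small sketch that redoes the
arithmetic from the Talos II example quoted earlier: convert a size in MB
to bytes (the same << 20 as in the hunk above), take the last 1/8th as the
shadow region, and test whether that range is backed by real memory. The
bank table is the layout from the documentation text; the struct and
function names are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mem_bank {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* The 64GB Talos II layout quoted earlier: two 32GB banks with a gap. */
static const struct mem_bank banks[] = {
	{ UINT64_C(0x0000000000000000), UINT64_C(0x0000000800000000) },
	{ UINT64_C(0x0000200000000000), UINT64_C(0x0000200800000000) },
};

/* Size in MB to bytes, as in the kasan_memory_size computation. */
static uint64_t mb_to_bytes(uint64_t mb)
{
	return mb << 20;
}

/* Start of the last 1/8th of [0, size), i.e. the assumed shadow region. */
static uint64_t shadow_start(uint64_t size)
{
	return size - size / 8;
}

/* True if [start, end) lies entirely inside a single bank. */
static bool range_is_backed(uint64_t start, uint64_t end)
{
	assert(start < end);
	for (size_t i = 0; i < sizeof(banks) / sizeof(banks[0]); i++) {
		if (start >= banks[i].start && end <= banks[i].end)
			return true;
	}
	return false;
}
```

With 65536 MB, shadow_start() gives 0x0000_000e_0000_0000 and
range_is_backed() is false for the range up to 0x0000_0010_0000_0000, which
matches the failure described in the documentation. With 32768 MB the
shadow region is 0x0000_0007_0000_0000 to 0x0000_0008_0000_0000 and sits
inside the first bank.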


>>> +void __init kasan_init(void)
>>> +{
>>> +   int i;
>>> +   void *k_start = kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START);
>>> +   void *k_end = kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END);
>>> +
>>> +   pte_t pte = __pte(__pa(kasan_early_shadow_page) |
>>> +                     pgprot_val(PAGE_KERNEL) | _PAGE_PTE);
>>> +
>>> +   if (!early_radix_enabled())
>>> +           panic("KASAN requires radix!");
>>> +
>>
>> I think this is avoidable, we could use a static key for disabling kasan in
>> the generic code. I wonder what happens if someone tries to boot this
>> image on a Power8 box and keeps panic'ing with no easy way of recovering.
> 
> Again, assuming I understand correctly that the compiler generates raw
> IR->asm for these checks rather than calling out to a function, then I
> don't think we get a way to intercept those checks. It's too late to do
> anything at the __asan report stage because that will already have
> accessed memory that's not set up properly.
> 
> If you try to boot this on a Power8 box it will panic and you'll have to
> boot into another kernel from the bootloader. I don't think it's
> avoidable without disabling inline instrumentation, but I'd love to be
> proven wrong.
> 
>>
>> NOTE: I can't test any of these, well maybe with qemu, let me see if I can
>> spin the series and provide more feedback
> 
> It's actually super easy to do simple boot tests with qemu; it works fine in
> TCG. Michael's wiki page at
> https://github.com/linuxppc/wiki/wiki/Booting-with-Qemu is very helpful.
> 
> I did this a lot in development.
> 
> My full commandline, fwiw, is:
> 
> qemu-system-ppc64 -m 8G -M pseries -cpu power9 \
>   -kernel ../out-3s-radix/vmlinux -nographic \
>   -chardev stdio,id=charserial0,mux=on \
>   -device spapr-vty,chardev=charserial0,reg=0x30000000 \
>   -initrd ./rootfs-le.cpio.xz \
>   -mon chardev=charserial0,mode=readline -nodefaults -smp 4

qemu has been crashing with KASAN enabled, with both the inline and
out-of-line options. I am running linux-next plus the 4 patches you've
posted. In one case I get a panic, and in the other a hang. I can confirm
that when I disable KASAN, the issue disappears.

Balbir Singh.

> 
> Regards,
> Daniel
> 
