It would be great to make this work. The key issue is that x86 synchronization is different from ARM and Alpha: the latter rely on load-linked/store-conditional (LL/SC), while x86 relies on the ability to perform locked read-modify-write (RMW) transactions that are guaranteed to be atomic. This requirement is signaled to the cache using the LOCKED flag (see src/mem/request.hh, especially the comment there on the LOCKED bit).
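For concreteness, here is a tiny standalone illustration (not the actual gem5 classes -- MiniRequest and its members are made up for this sketch) of that point: both the load that begins a locked RMW and the store that ends it carry the LOCKED bit, which is what lets the memory system bracket the critical window.

#include <cassert>
#include <cstdint>

struct MiniRequest
{
    static constexpr uint32_t LOCKED = 0x1;  // locked-RMW marker

    uint64_t paddr;   // physical address of the access
    uint32_t flags;   // request flags, loosely analogous to Request::Flags
    bool isStore;

    bool isLocked() const { return flags & LOCKED; }
};

int main()
{
    // The CPU/TLB issues both halves of one RMW to the same address,
    // with LOCKED set on each, so the cache can tell where the critical
    // window starts (the load) and ends (the store).
    MiniRequest lockedLoad  {0x1000, MiniRequest::LOCKED, false};
    MiniRequest lockedStore {0x1000, MiniRequest::LOCKED, true};

    assert(lockedLoad.isLocked() && lockedStore.isLocked());
    assert(lockedLoad.paddr == lockedStore.paddr);
    return 0;
}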
In the SimpleAtomic model, the CPU guarantees atomicity itself: when it sees a load with the LOCKED flag set, it keeps processing instructions (within the same atomic tick() method) until it sees the corresponding store that ends the RMW operation. With timing-mode accesses, this approach obviously doesn't work.

What the Ruby cache does is keep a variable recording the address (if any) on which a locked RMW is being performed, and it defers processing any incoming message that targets that address. This works reasonably well for Ruby, since it has a built-in notion of deferring messages and retrying them later. In retrospect, it's not clear that this approach is fully adequate: on x86 it is possible to issue locked requests to unaligned locations that cross cache-line boundaries, so Ruby probably should be able to lock two adjacent cache lines. It's not surprising that locking only one line works OK in practice, though, since such split locked accesses are rare, especially in modern code.

In classic mode, I would suggest adding a bit to each cache block that marks it as locked (set when a LOCKED load is issued, cleared when the corresponding LOCKED store is seen); then in recvTimingSnoopReq() or handleSnoop() we should check that bit and hold off on doing the snoop until the locked RMW access is complete. This is similar to what happens when we receive a snoop for a block that we know we will be receiving (because the snoop on our request has completed) but for which the data has not arrived yet. The differences are that in this case (1) we won't already have an MSHR allocated, so we'll probably need to allocate one, and (2) we won't have the data arrival to trigger the processing of the deferred snoop, so instead we'll need to process it when the LOCKED store comes in to end the RMW access. (A rough standalone sketch of this idea is appended at the bottom of this message, below the quoted thread.)

Hope that helps... it would be very much appreciated if you could make this work and contribute your patch back. Let me know if you have any questions.

Steve

On Sun, Jun 29, 2014 at 12:18 PM, Ivan Stalev <ids...@psu.edu> wrote:

> To correct myself, I ran a quick test and actually single-core does work with this configuration, but the kernel crash happens with 2+ cores. I prefer not to use Ruby due to the much slower simulation speed. I believe this configuration (FS, multi-core, O3, classic memory) works with both ARM and ALPHA; would it take significant effort to make it work for x86 as well?
>
> Thanks,
>
> Ivan
>
> On Sun, Jun 29, 2014 at 1:47 AM, Steve Reinhardt <ste...@gmail.com> wrote:
>
>> x86 multi-core with O3 and the classic memory system doesn't work, as the classic caches don't have support for x86 locked accesses. In contrast, x86 multi-core works with O3 and Ruby, since Ruby does support locked accesses; and it also works with the AtomicSimple CPU model and classic memory, since in that case the locked accesses are handled by the AtomicSimple CPU model.
>>
>> x86 single-core typically does work with the classic memory system. As I'm writing this, I'm wondering whether it's possible to have a problem with taking an interrupt in the middle of a locked RMW on a single-core system... I would think not, since the interrupt should only be taken at a macro-instruction boundary, but there could be something subtle like that.
>>
>> Or maybe it's not related to locked accesses at all; I just bring that up since it's one thing that's known not to work in your configuration.
>>
>> Steve
>>
>> On Sat, Jun 28, 2014 at 8:07 PM, Ivan Stalev <ids...@psu.edu> wrote:
>>
>>> I am simulating a multi-core system, but the issue also occurs with single-core as well. The error message comes from the kernel. Here is one of them below:
>>>
>>> Thanks,
>>>
>>> Ivan
>>>
>>> Bad page state in process 'spec.astar_base'
>>> page:ffffffff807205e8 flags:0x0000000000000000 mapping:000000baffffffff mapcount:1 count:0
>>> Trying to fix it up, but a reboot is needed
>>> Backtrace:
>>>
>>> Call Trace:
>>> [<ffffffff8025d36d>] bad_page+0x5d/0x90
>>> [<ffffffff8025dddb>] get_page_from_freelist+0x42b/0x440
>>> [<ffffffff8025de81>] __alloc_pages+0x91/0x350
>>> [<ffffffff80270a24>] anon_vma_prepare+0x24/0x110
>>> [<ffffffff802681a3>] __handle_mm_fault+0x9d3/0xcd0
>>> [<ffffffff8026c4ce>] vma_adjust+0x13e/0x500
>>> [<ffffffff805be88f>] do_page_fault+0x1af/0x900
>>> [<ffffffff8026cf04>] vma_merge+0x1c4/0x2a0
>>> [<ffffffff805b9c84>] schedule+0x134/0x35a
>>> [<ffffffff8026d32a>] do_brk+0x1aa/0x380
>>> [<ffffffff805bcbbd>] error_exit+0x0/0x84
>>>
>>> Unable to handle kernel paging request at 49485c48fc01b000 RIP:
>>> [<ffffffff8037b1d2>] clear_page+0x12/0x40
>>> PGD 0
>>> Oops: 0002 [1] SMP
>>> CPU 2
>>> Modules linked in:
>>> Pid: 849, comm: spec.astar_base Tainted: G B 2.6.22 #1
>>> RIP: 0010:[<ffffffff8037b1d2>] [<ffffffff8037b1d2>] clear_page+0x12/0x40
>>> RSP: 0000:ffff81013edc3c50 EFLAGS: 0000023c
>>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000003f
>>> RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 49485c48fc01b000
>>> RBP: 0000000000000001 R08: 0000000000000005 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff807205e8
>>> R13: ffffffff807205e8 R14: ffff810000000000 R15: 6db6db6db6db6db7
>>> FS: 00000000006b5860(0063) GS:ffff81013fc65cc0(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> CR2: 49485c48fc01b000 CR3: 000000013ed27000 CR4: 00000000000006e0
>>> Process spec.astar_base (pid: 849, threadinfo ffff81013edc2000, task ffff81013f294ae0)
>>> Stack: ffffffff8025dc26 ffffffff80720d58 0000000000000000 000000000000026c
>>> 0000000000000002 ffffffff807dd820 00000000000280d2 0000000000000001
>>> ffffffff80720a00 ffffffff80720d58 0000000100000000 0000000000000044
>>> Call Trace:
>>> [<ffffffff8025dc26>] get_page_from_freelist+0x276/0x440
>>> [<ffffffff8025de81>] __alloc_pages+0x91/0x350
>>> [<ffffffff80270a24>] anon_vma_prepare+0x24/0x110
>>> [<ffffffff802681a3>] __handle_mm_fault+0x9d3/0xcd0
>>> [<ffffffff8026c4ce>] vma_adjust+0x13e/0x500
>>> [<ffffffff805be88f>] do_page_fault+0x1af/0x900
>>> [<ffffffff8026cf04>] vma_merge+0x1c4/0x2a0
>>> [<ffffffff805b9c84>] schedule+0x134/0x35a
>>> [<ffffffff8026d32a>] do_brk+0x1aa/0x380
>>> [<ffffffff805bcbbd>] error_exit+0x0/0x84
>>>
>>> Code: 48 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48
>>> RIP [<ffffffff8037b1d2>] clear_page+0x12/0x40
>>> RSP <ffff81013edc3c50>
>>> CR2: 49485c48fc01b000
>>> note: spec.astar_base[849] exited with preempt_count 1
>>>
>>> On Sat, Jun 28, 2014 at 10:34 PM, Steve Reinhardt <ste...@gmail.com> wrote:
>>>
>>>> Sorry, we primarily use SE mode, so we don't have this problem. Is this for a single-core system? Is the error message you see from the kernel or from gem5?
>>>>
>>>> Steve
>>>>
>>>> On Sat, Jun 28, 2014 at 6:51 PM, Ivan Stalev via gem5-users <gem5-users@gem5.org> wrote:
>>>>
>>>>> Is anyone successfully running SPEC2006 benchmarks using x86 full-system with detailed CPUs and the classic memory model? I am able to run simple C programs, but when I run any SPEC benchmark, I get a "Bad page state in process" error for each SPEC benchmark and then the kernel crashes. I've tried a few different kernels and disk images, including the ones on the GEM5 website and building my own. I've also tried different GEM5 versions. Everything runs fine with atomic CPUs. A few of my labmates are also experiencing the same issue.
>>>>>
>>>>> Can you please share if you have been successful with this setup?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ivan
>>>>>
>>>>> _______________________________________________
>>>>> gem5-users mailing list
>>>>> gem5-users@gem5.org
>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
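Here is the rough standalone sketch mentioned above of the classic-mode idea: a per-block locked bit, set by the LOCKED load and cleared by the LOCKED store, with snoops that hit the block deferred in between. This is not a gem5 patch; the names (MiniCache, lockedRMW, deferredSnoops) are hypothetical, and a real implementation would hook into the Cache/CacheBlk code and probably track the deferred snoop with an MSHR as described above.

#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

struct Snoop { uint64_t blkAddr; };

struct CacheBlock
{
    bool lockedRMW = false;             // set by LOCKED load, cleared by LOCKED store
    std::vector<Snoop> deferredSnoops;  // snoops held off during the RMW window
};

class MiniCache
{
    std::unordered_map<uint64_t, CacheBlock> blocks;

  public:
    void handleLockedLoad(uint64_t addr)
    {
        // Begin the critical window: no snoop may observe or steal the
        // block until the matching LOCKED store completes.
        blocks[addr].lockedRMW = true;
    }

    void recvSnoop(const Snoop &s)
    {
        auto it = blocks.find(s.blkAddr);
        if (it != blocks.end() && it->second.lockedRMW) {
            // Block is mid locked-RMW: hold the snoop instead of
            // responding now (the real cache would need an MSHR-like
            // entry here to remember it).
            it->second.deferredSnoops.push_back(s);
            return;
        }
        std::cout << "snoop serviced immediately for 0x"
                  << std::hex << s.blkAddr << std::dec << "\n";
    }

    void handleLockedStore(uint64_t addr)
    {
        auto it = blocks.find(addr);
        if (it == blocks.end())
            return;
        // End of the RMW: clear the lock bit and replay anything deferred,
        // since there is no data arrival to trigger the replay otherwise.
        it->second.lockedRMW = false;
        for (const Snoop &s : it->second.deferredSnoops)
            std::cout << "deferred snoop replayed for 0x"
                      << std::hex << s.blkAddr << std::dec << "\n";
        it->second.deferredSnoops.clear();
    }
};

int main()
{
    MiniCache cache;
    cache.handleLockedLoad(0x1000);   // LOCKED load starts the RMW
    cache.recvSnoop({0x1000});        // snoop arrives mid-RMW -> deferred
    cache.handleLockedStore(0x1000);  // LOCKED store ends the RMW -> replay
    return 0;
}

The invariant is the same one Ruby enforces with its deferred messages: between the LOCKED load and the LOCKED store, no other requester can observe or steal the line.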