Re: [gem5-users] What is the latency of a page table walk in SimpleTimingCPU?

Ali Saidi Thu, 09 Feb 2012 09:45:10 -0800

 

It's not necessarily all roses there either, as you'll need to get
the CPU model to call translate 3 times, but that is probably more
contained, and you might be able to leverage the WholeTranslation code
(see src/cpu/translation.hh), which is normally used for requests that
cross a page boundary. If you end up taking a fault that the CPU needs
to handle (e.g. on a page that has been malloc'ed but not actually
allocated by the kernel), it's still going to be trouble. However, you
can probably work around this with your benchmark.


Ali 
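
To illustrate the bookkeeping involved in issuing three translations for
one instruction, here is a minimal Python sketch. It is a toy model, not
gem5 code: MultiTranslation and its methods are hypothetical names and do
not reflect the actual interface in src/cpu/translation.hh. The only idea
shown is that the instruction waits until every operand translation has
reported back, and that any fault is looked at once all have finished.

```python
# Toy model only. MultiTranslation is a hypothetical name, not a gem5 class.

class MultiTranslation:
    """Collects N operand translations and reports when all have finished."""

    def __init__(self, num_operands):
        self.results = [None] * num_operands   # physical address per operand
        self.faults = [None] * num_operands    # fault (if any) per operand
        self.outstanding = num_operands        # translations still pending

    def finish(self, index, paddr=None, fault=None):
        """Record one operand's outcome; return True if it was the last one."""
        self.results[index] = paddr
        self.faults[index] = fault
        self.outstanding -= 1
        return self.outstanding == 0

    def first_fault(self):
        """Return the first recorded fault in operand order, or None."""
        return next((f for f in self.faults if f is not None), None)


# Three operands whose translations complete out of order.
state = MultiTranslation(3)
done = [state.finish(i, paddr=0x1000 * (i + 1)) for i in (2, 0, 1)]
```

Only the last call to finish() returns True, which is the point at which a
CPU model could go ahead and issue the memory accesses.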

On 09.02.2012 11:27, Paul Rosenfeld wrote:

> Well that doesn't sound like fun. Perhaps I'll look at ARM as a
> potential target.
> 
> On Thu, Feb 9, 2012 at 11:39 AM, Ali Saidi <sa...@umich.edu [13]> wrote:
> 
>> It's possible with Alpha, but it would take some work. You'd need to
>> "take" a fault up to 3 times and, in between each time, fix up the
>> fault status registers to have consistent data. Keeping track of what
>> needs to be in the register at any one time sounds difficult,
>> especially as translation faults can nest (you take a fault on the
>> page table that you need to look at). An architecture with a hardware
>> table walker is probably a bit easier to deal with.
>> 
>> Ali 
>> 
>> On 09.02.2012 10:09, Paul Rosenfeld wrote:
>> 
>>> Thanks for the replies. I'm still trying to find my way around M5, and
>>> I thought the SE/TimingSimpleCPU would be a good way to see what's
>>> involved in modifying M5.
>>>
>>> One thing that I'm worried about is that for this work, I have
>>> multiple memory operands in registers that need to be translated to
>>> physical addresses. In SE mode, I've simply added a new fake Fault
>>> that translates all 3 addresses. However, I'm not sure if a similar
>>> approach would be possible in FS mode (I haven't looked at how any of
>>> the PAL stuff works). Do you think it would be more feasible to
>>> generate multiple single TLB faults (one per operand in the
>>> instruction), or to do something where they all get translated
>>> together using a new fault?
>>> 
>>> On Thu, Feb 9, 2012 at 10:16 AM, Ali Saidi <sa...@umich.edu [12]> wrote:
>>> 
>>>> Hi Paul,
>>>>
>>>> Yes, in SE mode, it's just faked as a pipeline flush (in the simple
>>>> CPU model, pretty much nothing happens). It should be reasonably easy
>>>> to change the model to delay some number of ns on a TLB miss, but
>>>> you'll get the best results by running in FS mode.
>>>>
>>>> Ali
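
The "delay some number of ns on a TLB miss" idea can be sketched in a few
lines of Python. This is a toy model, not gem5 code: ToyTLB, its
parameters, the 100 ns penalty, and the 8 KB page size are all assumptions
chosen for illustration. A hit returns immediately, while a miss fills the
entry and pushes the ready time out by a fixed penalty.

```python
# Toy model only. ToyTLB and every number below are illustrative assumptions.

class ToyTLB:
    def __init__(self, page_table, miss_penalty_ns, page_size=8192):
        self.page_table = page_table          # virtual page -> physical frame
        self.miss_penalty_ns = miss_penalty_ns
        self.page_size = page_size
        self.entries = {}                     # cached translations

    def translate(self, vaddr, now_ns):
        """Return (physical address, time at which the translation is ready)."""
        vpn, offset = divmod(vaddr, self.page_size)
        ready_ns = now_ns
        if vpn not in self.entries:
            # Miss: fill the entry and charge a fixed delay.
            # A KeyError here would correspond to an unmapped page.
            self.entries[vpn] = self.page_table[vpn]
            ready_ns += self.miss_penalty_ns
        return self.entries[vpn] * self.page_size + offset, ready_ns


tlb = ToyTLB(page_table={0: 7, 1: 3}, miss_penalty_ns=100)
first = tlb.translate(0x10, now_ns=0)     # miss on page 0, pays the penalty
second = tlb.translate(0x20, now_ns=200)  # hit on page 0, no extra delay
```

Here the first access is ready at 100 ns and the second at 200 ns, the
same time it was issued.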
>>>> 
>>>> On 09.02.2012 01:05, Paul Rosenfeld wrote: 
>>>> 
>>>>> So do you think that my reasoning that the TLB miss penalty is
>>>>> simply a single-cycle re-fetch penalty on the faulting instruction is
>>>>> correct for ALPHA_SE/TimingSimpleCPU?
>>>>> 
>>>>> On Thu, Feb 9, 2012 at 12:31 AM, Gabriel Michael Black <gbl...@eecs.umich.edu [9]> wrote:
>>>>> 
>>>>>> I believe that's correct.
>>>>>> 
>>>>>> Gabe
>>>>>> 
>>>>>> Quoting Paul Rosenfeld <dramnin...@gmail.com [6]>:
>>>>>> 
>>>>>>> I guess I forgot to mention in my original email that I was talking
>>>>>>> about Alpha... I think in FS it will vector into a PAL routine, but
>>>>>>> in SE it looks like it's all just faked ...
>>>>>>> 
>>>>>>> On Wed, Feb 8, 2012 at 11:08 PM, Gabriel Michael Black <gbl...@eecs.umich.edu [5]> wrote:
>>>>>>> 
>>>>>>>> There are two types of mechanisms to handle TLB misses: in hardware
>>>>>>>> or in software. If the ISA you're using does it in software,
>>>>>>>> there's a fault which makes the OS handle the miss. In that case it
>>>>>>>> will take however long it takes the OS to get things set up again.
>>>>>>>> If the miss is handled in hardware, then there's a TLB walker
>>>>>>>> component which does memory accesses to look up the entry in the
>>>>>>>> page tables, and the delay is determined by those accesses.
>>>>>>>>
>>>>>>>> Gabe
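
The two cases described above can be compared with a rough
back-of-the-envelope model in Python. Nothing here is gem5 code: the
function names and all latencies are made-up assumptions. A hardware walk
costs the sum of its dependent page-table reads, while a software-handled
miss costs the handler's instructions plus whatever memory accesses the
handler itself makes.

```python
# Toy model only. Function names and all numbers are illustrative assumptions.

def hardware_walk_latency_ns(access_latencies_ns):
    """One dependent memory access per page-table level, done by the walker."""
    return sum(access_latencies_ns)


def software_miss_latency_ns(handler_instructions, ns_per_instruction,
                             access_latencies_ns):
    """The OS/PAL handler runs instructions and does its own table reads."""
    return (handler_instructions * ns_per_instruction
            + sum(access_latencies_ns))


# A three-level walk where each level takes 50 ns to read.
walk = [50, 50, 50]
hw_ns = hardware_walk_latency_ns(walk)
sw_ns = software_miss_latency_ns(40, 1, walk)
print(hw_ns, sw_ns)
```

With these assumed numbers the hardware walk takes 150 ns and the software
handler 190 ns; in a real simulation both depend on where the page-table
reads hit in the memory hierarchy.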
>>>>>>>> 
>>>>>>>> Quoting Paul Rosenfeld <dramnin...@gmail.com [1]>:
>>>>>>>> 
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I'm trying to modify the TLB code for SimpleTimingCPU, but one thing
>>>>>>>>> I can't seem to find is what the latency of a DTLB miss is. I found
>>>>>>>>> the code in NDtbMissFault->invoke() for reading the page table
>>>>>>>>> mapping, but I can't seem to figure out if there's any mechanism for
>>>>>>>>> stalling the CPU to handle the fault.
>>>>>>>>>
>>>>>>>>> Reading the wiki for the SimpleTimingCPU, it sounds like it isn't
>>>>>>>>> meant to model this kind of detail. So is it just a one-cycle fetch
>>>>>>>>> penalty for handling a TLB miss?
>>>>>>>>>
>>>>>>>>> If this is the case, what's the simplest CPU model that will
>>>>>>>>> actually stall for TLB misses?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Paul
>>>>>>>> _______________________________________________
>>>>>>>> gem5-users mailing list
>>>>>>>> gem5-users@gem5.org [2]
>>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users [3]<http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users [4]>




Links:
------
[1] mailto:dramnin...@gmail.com
[2] mailto:gem5-users@gem5.org
[3] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[4] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[5] mailto:gbl...@eecs.umich.edu
[6] mailto:dramnin...@gmail.com
[7] mailto:gem5-users@gem5.org
[8] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[9] mailto:gbl...@eecs.umich.edu
[10] mailto:gem5-users@gem5.org
[11] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[12] mailto:sa...@umich.edu
[13] mailto:sa...@umich.edu

