Thanks.

I will apply patches 1 and 2.

For 3, I need to change the file src/arch/x86/isa/microops/specop.isa:66
from
    setFlags | (ULL(1) << StaticInst::IsNonSpeculative),
to
    setFlags | (ULL(1) << StaticInst::IsNonSpeculative) |
        (ULL(1) << StaticInst::IsQuiesce),

Is this the right way to tag the "MicroHalt" instruction as "IsQuiesce"?
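
To make the intent of that change concrete, here is a minimal standalone
sketch of the bit-mask idiom it relies on. It is not gem5 code: the enum,
its bit positions, and the helper names (flagBit, hasFlag, haltFlags,
checkHaltFlags) are all hypothetical stand-ins, and the real
StaticInst flag values are different. It only illustrates that OR-ing in a
second shifted bit keeps the existing flags and adds IsQuiesce.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the StaticInst flag enumerators. The real
// enumerators and their numeric values come from gem5 and will differ.
enum Flags : unsigned {
    IsNonSpeculative = 3,  // illustrative bit position only
    IsQuiesce = 7,         // illustrative bit position only
};

// Plays the role of the "ULL(1) << flag" expression in the .isa snippet.
constexpr std::uint64_t flagBit(Flags f)
{
    return std::uint64_t{1} << f;
}

// True if the given flag's bit is set in the mask.
constexpr bool hasFlag(std::uint64_t mask, Flags f)
{
    return (mask & flagBit(f)) != 0;
}

// Mirrors the modified line: keep whatever was already in setFlags and
// OR in both the IsNonSpeculative and IsQuiesce bits.
constexpr std::uint64_t haltFlags(std::uint64_t setFlags)
{
    return setFlags | flagBit(IsNonSpeculative) | flagBit(IsQuiesce);
}

// Compile-time check that the combined mask carries the new flag.
static_assert(hasFlag(haltFlags(0), IsQuiesce),
              "combined mask should include IsQuiesce");

// Runtime version of the same checks, callable from any test driver.
inline void checkHaltFlags()
{
    const std::uint64_t existing = std::uint64_t{1};  // some earlier flag bit
    const std::uint64_t mask = haltFlags(existing);
    assert(hasFlag(mask, IsNonSpeculative));
    assert(hasFlag(mask, IsQuiesce));
    assert((mask & existing) == existing);  // earlier bits are preserved
}
```

The point of the sketch is only that the extra "| (ULL(1) << ...)" term is
additive; whether IsQuiesce is the right flag for MicroHalt is the actual
question above.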

BTW, to boot Linux I installed Gentoo inside QEMU, booted it with
X86KvmCPU, took checkpoints, and then ran from those checkpoints.

I will report back on whether this works.

Thanks.

--
Best Regards
Yan Zi

On 27 Aug 2014, at 15:44, Mitch Hayenga wrote:

> There are probably three main patches that could help.  The fact you
> mention the timer interrupt makes me think Andreas is right and these might
> solve your issue.
>
> 1. http://reviews.gem5.org/r/2363/  - o3 is supposed to stop fetching
> instructions immediately once a quiesce instruction is encountered, but
> some managed to sneak by.  Quiesce is used for things like sleeping until
> an interrupt occurs, etc.  Without this patch, we experienced the case
> where o3 state would get corrupted and an instruction would sit at commit
> until the next timer interrupt happened, at which point taking the
> interrupt would clear the state and execution would continue (until this
> same bug happened again).
>
> 2. http://reviews.gem5.org/r/2367/  - If o3 was being drained while an
> interrupt occurred on x86, it could deadlock.
>
> 3. I believe this last patch will be posted in a day or two.  x86 currently
> does not tag any instruction that suspends() the CPU as a "quiesce".  This
> is required by o3 to properly operate, but not by the Atomic CPU.  This
> makes the issue in #1 far more likely to occur.  It's pretty amazing that
> x86 booted linux at all on o3 without this.  I believe this patch will be
> posted shortly, but otherwise you could just tag the "MicroHalt"
> instruction as "IsQuiesce" yourself.
>
> So a combination of those things (mainly the last one) could lead to what
> you are seeing.
>
>
> On Wed, Aug 27, 2014 at 12:59 PM, Zi Yan via gem5-users
> <gem5-users@gem5.org> wrote:
>
>> OK. Could you please tell me which patches are there? In the
>> review board there are quite a lot of new patches waiting
>> for review.
>>
>> I can apply those patches myself and do a quick test.
>>
>> Thanks.
>>
>> --
>> Best Regards
>> Yan Zi
>>
>> On 27 Aug 2014, at 13:56, Andreas Hansson wrote:
>>
>>> Hi Yan,
>>>
>>> I would suspect this is due to a bug in the X86 O3 CPU. There have been
>>> quite a few fixes posted on the review board for similar issues. I hope
>>> to have these committed in the next week or so.
>>>
>>> Andreas
>>>
>>>
>>> On 27/08/2014 18:02, "Zi Yan via gem5-users" <gem5-users@gem5.org>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am running kmeans via hadoop in gem5 X86 FS mode. I am using
>>>> linux kernel 3.2.60 with configuration file linux-2.6.28.4 from
>>>> gem5.org.
>>>>
>>>> I take a checkpoint before a map task and put an "m5 exit" after the
>>>> map task.
>>>> I am using *X86KvmCPU* to take checkpoints.
>>>>
>>>> When I restore from the same checkpoint, the atomic CPU and the O3 CPU
>>>> execute very different numbers of instructions:
>>>> 1) The atomic CPU executes about 350 million instructions, reaches
>>>> "m5 exit", then stops the simulation.
>>>> 2) The O3 CPU executes more than 12 billion instructions and still has
>>>> not reached "m5 exit" to stop the simulation.
>>>>
>>>> I dumped the committed PCs from the atomic CPU and the O3 CPU and found
>>>> that after about 500,000 instructions the two systems diverge: the
>>>> atomic CPU is still executing user code, but the O3 CPU switches to
>>>> apic_timer_interrupt (a kernel function; it also appears in the atomic
>>>> CPU execution, but somewhere else).
>>>>
>>>> Could anyone please give some advice about why this happens?
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> Best Regards
>>>> Yan Zi
>>>
>>>
>>
>> _______________________________________________
>> gem5-users mailing list
>> gem5-users@gem5.org
>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>
