Thanks. I will apply patches 1 and 2.
For 3, I need to change src/arch/x86/isa/microops/specop.isa:66 from

    setFlags | (ULL(1) << StaticInst::IsNonSpeculative),

to

    setFlags | (ULL(1) << StaticInst::IsNonSpeculative) | (ULL(1) << StaticInst::IsQuiesce),

Am I doing the right thing to tag the "MicroHalt" instruction as "IsQuiesce"?

BTW, what I did to boot Linux was to install Gentoo inside QEMU, boot it
with X86KvmCPU, then take checkpoints and run from those checkpoints.

I will report whether this works or not.

Thanks.

--
Best Regards
Yan Zi

On 27 Aug 2014, at 15:44, Mitch Hayenga wrote:

> There are probably three main patches that could help. The fact you
> mention the timer interrupt makes me think Andreas is right and these might
> solve your issue.
>
> 1. http://reviews.gem5.org/r/2363/ - o3 is supposed to stop fetching
> instructions immediately once a quiesce instruction is encountered, some
> managed to sneak by. Quiesce is used for things like sleeping until an
> interrupt occurs, etc. Without this patch, we experienced the case where
> o3 state would get corrupted and an instruction would sit at commit until
> the next timer interrupt happened. At which point taking the interrupt
> would clear the state and execution would continue (until this same bug
> happened again).
>
> 2. http://reviews.gem5.org/r/2367/ - If o3 was being drained while an
> interrupt occurred on x86, it could deadlock.
>
> 3. I believe this last patch will be posted in a day or two. x86 currently
> does not tag any instruction that suspends() the CPU as a "quiesce". This
> is required by o3 to properly operate, but not by the Atomic CPU. This
> makes the issue in #1 far more likely to occur. It's pretty amazing that
> x86 booted linux at all on o3 without this. I believe this patch will be
> posted shortly, but otherwise you could just tag the "MicroHalt"
> instruction as "IsQuiesce" yourself.
>
> So a combination of those things (mainly the last one) could lead to what
> you are seeing.
>
> On Wed, Aug 27, 2014 at 12:59 PM, Zi Yan via gem5-users
> <gem5-users@gem5.org> wrote:
>
>> OK. Could you please tell me which patches are there? In the
>> review board there are quite a lot of new patches waiting
>> for review.
>>
>> I can apply those patches myself and do a quick test.
>>
>> Thanks.
>>
>> --
>> Best Regards
>> Yan Zi
>>
>> On 27 Aug 2014, at 13:56, Andreas Hansson wrote:
>>
>>> Hi Yan,
>>>
>>> I would suspect this is due to a bug in the X86 O3 CPU. There have been
>>> quite a few fixes posted on the review board for similar issues. I hope
>>> to have these committed in the next week or so.
>>>
>>> Andreas
>>>
>>>
>>> On 27/08/2014 18:02, "Zi Yan via gem5-users" <gem5-users@gem5.org>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am running kmeans via hadoop in gem5 X86 FS mode. I am using
>>>> linux kernel 3.2.60 with configuration file linux-2.6.28.4 from
>>>> gem5.org.
>>>>
>>>> I take a checkpoint before a map task and put a "m5 exit" after the map
>>>> task.
>>>> I am using *X86kvmCPU* to take checkpoints.
>>>>
>>>> When I restore from the same checkpoint, atomic CPU and O3 CPU give me
>>>> quite different executed instructions:
>>>> 1) atomic CPU executes about 350 million instructions, reaches "m5
>>>> exit", then stops simulation.
>>>> 2) O3 CPU executes more than 12 billion instructions, and still not
>>>> reaches "m5 exit" to stop the simulation.
>>>>
>>>> I dump out committed PCs from atomic CPU and O3 CPU, finding out that
>>>> after about 500,000 instructions, the systems behave differently,
>>>> where atomic CPU is still executing user code, but O3 CPU switch to
>>>> apic_timer_interrupt(a kernel function, it also appears in atomic CPU
>>>> execution, but somewhere else).
>>>>
>>>> Could anyone please give some advice about why this happen?
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> Best Regards
>>>> Yan Zi
>>>
>>>
>>> -- IMPORTANT NOTICE: The contents of this email and any attachments are
>>> confidential and may also be privileged. If you are not the intended
>>> recipient, please notify the sender immediately and do not disclose the
>>> contents to any other person, use it for any purpose, or store or copy
>>> the information in any medium. Thank you.
>>>
>>> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
>>> Registered in England & Wales, Company No: 2557590
>>> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1
>>> 9NJ, Registered in England & Wales, Company No: 2548782
>>
>> _______________________________________________
>> gem5-users mailing list
>> gem5-users@gem5.org
>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
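For readers following the specop.isa change discussed in this thread, here is a minimal, standalone C++ sketch of what OR-ing one more flag bit into the mask does. It is not gem5 code: the enum `HypotheticalFlag`, its numeric values, and the helpers `flagBit`, `hasFlag`, `haltFlagsBefore` and `haltFlagsAfter` are made up for illustration, and it assumes that gem5's `ULL(1)` is simply an unsigned 64-bit literal 1. The real `StaticInst` flag positions are defined inside gem5 and will differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the StaticInst flag enum. The names mirror the
// thread, but the bit positions here are invented for illustration only.
enum HypotheticalFlag : unsigned {
    IsNonSpeculative = 3,
    IsQuiesce = 7,
};

// One bit per flag, mirroring the (ULL(1) << StaticInst::Flag) idiom.
constexpr std::uint64_t flagBit(HypotheticalFlag f) {
    return std::uint64_t{1} << f;
}

// True if the given flag's bit is set in the mask.
constexpr bool hasFlag(std::uint64_t flags, HypotheticalFlag f) {
    return (flags & flagBit(f)) != 0;
}

// Mask as written before the proposed change: only IsNonSpeculative is added.
constexpr std::uint64_t haltFlagsBefore(std::uint64_t setFlags) {
    return setFlags | flagBit(IsNonSpeculative);
}

// Mask after the proposed change: IsQuiesce is OR-ed in as well.
constexpr std::uint64_t haltFlagsAfter(std::uint64_t setFlags) {
    return setFlags | flagBit(IsNonSpeculative) | flagBit(IsQuiesce);
}

// Runtime self-check: the change adds exactly one bit and keeps the rest.
inline void selfCheck() {
    const std::uint64_t base = 0;
    assert(!hasFlag(haltFlagsBefore(base), IsQuiesce));
    assert(hasFlag(haltFlagsAfter(base), IsQuiesce));
    assert(haltFlagsAfter(base) ==
           (haltFlagsBefore(base) | flagBit(IsQuiesce)));
}
```

Because the flags are combined with bitwise OR, the edit only adds the quiesce bit and leaves every bit already present in `setFlags` untouched. Whether the CPU model then acts on that bit is up to gem5 itself and is not modelled here.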