Hi Mahshid,
  One thing you could do is use the bash shell to get the same checkpoint-restore functionality that you would get with the hack_back_ckpt.rcS script. For instance, if you execute the following at the terminal, the checkpoint that you collect should function similarly to one taken with the hack_back_ckpt.rcS script:
% /sbin/m5 checkpoint; echo "Loading new script..."; /sbin/m5 readfile > /tmp/runscript; chmod 755 /tmp/runscript; if [ -s /tmp/runscript ]; then exec /tmp/runscript; else echo "Script not specified. Dropping back to shell..."; fi

  Also, I suspect the trouble you're seeing with the large-scale CMP checkpointing may have to do with a recently identified draining bug in the SimpleMemory. I'll cc you on that thread.

  Joel

On Wed, May 29, 2013 at 3:07 PM, Mahshid Sedghi <mahshid.sed...@gmail.com> wrote:

> Thanks a lot for your detailed and helpful answer Joel. Now everything is
> clear to me. I was able to use this script successfully for small systems.
> However, whenever I take a checkpoint for a 64-core system using this script,
> the restoration fails since the draining cannot complete. This is strange,
> since I can successfully restore from a checkpoint (for a similar system)
> which was taken by attaching a terminal. These two different ways are
> essentially doing the same, but for some reason the outcome is different.
> I have tried so many times, and it is never able to restore. I have to
> debug it further to figure out what the problem is.
>
> Thanks Joel,
> Mahshid
>
>
> On Wed, May 29, 2013 at 2:31 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>
>> Hi Mahshid,
>> The hack_back_ckpt.rcS script was specifically designed to give the
>> functionality of being able to take a checkpoint after Linux boot and then
>> restore with a different script. It should work with any ISA and either
>> the classic or Ruby memory models. In order to get "hacky" functionality,
>> you need to take a checkpoint using this script. If you'd like more detail
>> on how this works, I'd recommend that you read the comments in
>> hack_back_ckpt.rcS.
>>
>> Regardless of whether you are using hack_back_ckpt.rcS, when you take a
>> checkpoint and restore from it, the simulated system returns exactly to
>> whatever it was doing when you took the checkpoint.
>> Hence, if you take a
>> checkpoint by attaching a terminal and calling the m5 binary to take a
>> checkpoint, when you restore from that checkpoint, the system will finish
>> running the m5 binary and drop back to the terminal, as you would expect if
>> it hadn't taken a checkpoint. If you annotate a benchmark to take a
>> checkpoint during simulation, when you restore from the checkpoint, the
>> benchmark will continue executing where it left off to take the checkpoint.
>>
>> To your specific question:
>>
>>> I mean if I take checkpoint without
>>> hack_back_ckpt.rcs<http://grok.gem5.org/xref/gem5/configs/boot/hack_back_ckpt.rcS>,
>>> then while restoration "--script" does not work, and the simulated systems
>>> boots to shell and won't run the benchmark. Although in ruby_fs, script is
>>> passed to system.readfile, it is not running the script. Can you tell me
>>> how to get this to work?
>>>
>>
>> You're right that the "--script" command "does not work." This is
>> because the simulated system must be prompted to read the script that you
>> specify before executing it. Again, you can see this in the
>> hack_back_ckpt.rcS. Specifically, the following terminal commands use some
>> simulator magic to read the script, make it executable, and then run it:
>>
>> /sbin/m5 readfile > /tmp/runscript
>> chmod 755 /tmp/runscript
>> exec /tmp/runscript
>>
>> Hope this helps,
>> Joel
>>
>>
>>> On Sat, May 25, 2013 at 4:04 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>>>
>>>> Hi Mahshid,
>>>> You can simply specify the script that you would like to run to the
>>>> --script= parameter of the fs.py or ruby_fs.py config file when you restore
>>>> from the checkpoint.
>>>>
>>>> Joel
>>>>
>>>>
>>>> On Thu, May 23, 2013 at 8:26 PM, Mahshid Sedghi <mahshid.sed...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to restore from checkpoint and run my
>>>>> benchmark immediately without entering the commands in the simulated
>>>>> system myself.
>>>>> I came across this thread and realized that
>>>>> hack_back_ckpt.rcs<http://grok.gem5.org/xref/gem5/configs/boot/hack_back_ckpt.rcS>
>>>>> is supposedly doing what I need. I was able to take checkpoints automatically
>>>>> using this script, but I'm not sure how to direct it to run my benchmark
>>>>> afterwards. Can anybody give me a clue on that?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> On Sat, Sep 29, 2012 at 2:24 PM, Jun Pang <pang...@cs.duke.edu> wrote:
>>>>>
>>>>>> Hi Joel,
>>>>>>
>>>>>> Thank you very much. I really appreciate your help. I need to think
>>>>>> about this.
>>>>>>
>>>>>> Best,
>>>>>> Jun
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 28, 2012 at 8:22 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Jun,
>>>>>>>
>>>>>>>> 1. Recompiled kernel x86_64-vmlinux-2.6.28.4.smp with SMP enabled
>>>>>>>> and CONFIG_NR_CPUS=512.
>>>>>>>>
>>>>>>>
>>>>>>> Ok. If my memory serves me correctly, you will need to also change
>>>>>>> the CONFIG_NR_CPUS range in the kconfigs for kernel 2.6.28.4. I don't
>>>>>>> recall the details on where to find these files, but the kconfigs specify
>>>>>>> which variables a config file can contain and the appropriate ranges within
>>>>>>> which the variables can be set. I believe the upper limit in the kconfigs
>>>>>>> for 2.6.28.4 was CONFIG_NR_CPUS=32. If this is the case the system may
>>>>>>> look like it is still booting more than 32 cores, but Linux will only show
>>>>>>> 32 of them as available after you've booted the system.
>>>>>>>
>>>>>>>
>>>>>>>> 2.Made a checkpoint right before "Loading script..." and I guess
>>>>>>>> this is similar to what /configs/boot/hack_back_ckpt.rcS does. I modified
>>>>>>>> the file system file: etc/init.d/rsC to create the checkpoint.
>>>>>>>>
>>>>>>>
>>>>>>> Cool, that works.
>>>>>>> The /configs/boot/hack_back_ckpt.rcS script will
>>>>>>> also work with any disk image and any ISA for which gem5 has checkpointing
>>>>>>> capability. In the future, it might be easier to use than modifying the
>>>>>>> disk image to collect checkpoints.
>>>>>>>
>>>>>>>
>>>>>>>> 3.The interrupt devices do give me problems.
>>>>>>>> - Local and IO APIC have only 8 bits which means they will
>>>>>>>> support up to 256 cores, but the IO_APIC.id is set to be the numCPUs. This
>>>>>>>> means I cannot really use 256, because it is beyond 8 bits range. I tried
>>>>>>>> to make IO_APIC.id =numCPU-1 and boot 256 cores, but then I got error:
>>>>>>>> "panic: Legacy mode interrupts with error codes aren't implementde" when
>>>>>>>> the system boots the 256th core. However, it works correctly with 1 or 2
>>>>>>>> cores.
>>>>>>>>
>>>>>>>
>>>>>>> You're correct that the IO APIC reserves a bit in the ID mask so
>>>>>>> that only 7 bits are available (i.e. up to 128 cores). I'm not sure what
>>>>>>> it would take to remove this restriction, though I believe we investigated
>>>>>>> using the extra bit and decided it was going to be difficult to make it
>>>>>>> work. If you really need to use 256 cores, I would recommend starting by
>>>>>>> reading up on the Intel i8042, i82094aa, i8237, i8254, and i8259 devices,
>>>>>>> and also trying to identify the new devices that have replaced these (which
>>>>>>> may be capable of booting more than 128 cores in a single system).
>>>>>>>
>>>>>>>
>>>>>>>> -My ideal GPU clock rate is 5GHz, but I will get an error with
>>>>>>>> that frequency:"MP-BIOS bug: 8254 timer not connected to IO-APIC".
>>>>>>>> I have searched around and it seems to be an known issue with some kernels.
>>>>>>>> The easiest way to fix it is to disable APIC which is not allowed in SMP I
>>>>>>>> guess. So I reduced the frequency to be 4GHz....
>>>>>>>>
>>>>>>>
>>>>>>> I haven't run into this problem before, so I'm afraid I can't help here.
>>>>>>>
>>>>>>>
>>>>>>>> Please let me know if what I have done so far makes sense or not.
>>>>>>>> Any suggestion will be greatly appreciated. I am still reading the code and
>>>>>>>> trying to understand the APIC part better.
>>>>>>>>
>>>>>>>
>>>>>>> Sounds good so far. Let me know if there's anything else I can help with,
>>>>>>>
>>>>>>> Joel
>>>>>>>
>>>>>>>
>>>>>>>> On Thu, Sep 27, 2012 at 3:14 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Jun,
>>>>>>>>> Can you let us know which kernel version you're working with or
>>>>>>>>> where you got the kernel that you're trying to boot?
>>>>>>>>>
>>>>>>>>> There are instructions here (http://gem5.org/Linux_kernel) for
>>>>>>>>> building Linux for x86 and gem5. Currently, there aren't any known ways to
>>>>>>>>> speed up Linux boot, since the kernel waits for interrupts to calibrate
>>>>>>>>> delay loops already.
>>>>>>>>>
>>>>>>>>> Note also that there are probably some limitations to building
>>>>>>>>> kernels to support more than 64 or 256 cores. Sometimes the kernel
>>>>>>>>> kconfigs set default ranges for things like core count that you cannot
>>>>>>>>> tweak just by changing the config files. Further, the interrupt devices
>>>>>>>>> currently in gem5 (./src/dev/x86/*) probably have a core count limitation,
>>>>>>>>> since they are implemented to function the same as devices used in current
>>>>>>>>> real hardware. You might want to check into these, because if you're
>>>>>>>>> running into a core count limitation, something may be hung up in the
>>>>>>>>> system you're booting with 255 cores.
>>>>>>>>>
>>>>>>>>> Overall, 3-5 hours to boot 64 cores doesn't sound unreasonable
>>>>>>>>> with the atomic or timing CPUs.
>>>>>>>>> The out-of-order core will take multiple
>>>>>>>>> times longer to boot Linux because of the added core detail, so we do not
>>>>>>>>> suggest that you try this, but instead you take checkpoints after Linux
>>>>>>>>> boot and restore from the checkpoint into the benchmarks you'd like to run.
>>>>>>>>> (Check out the runscript ./configs/boot/hack_back_ckpt.rcS, which makes
>>>>>>>>> this checkpointing easy)
>>>>>>>>>
>>>>>>>>> Hope this helps,
>>>>>>>>> Joel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Sep 27, 2012 at 12:42 PM, Jun Pang <pang...@cs.duke.edu> wrote:
>>>>>>>>>
>>>>>>>>>> It's been over one day already, and my 255 cores system is still
>>>>>>>>>> booting. Does anyone know the answer to speed up the booting?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Jun
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 25, 2012 at 11:01 PM, Jun Pang <pang...@cs.duke.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi guys,
>>>>>>>>>>>
>>>>>>>>>>> I want to boot more than 256 cores in X86 FS with ruby, however,
>>>>>>>>>>> it's slow. More than 3 hours for ~60 cores so far without ruby...
>>>>>>>>>>>
>>>>>>>>>>> I read this from X86 session of
>>>>>>>>>>> http://gem5.org/Architecture_Support :
>>>>>>>>>>>
>>>>>>>>>>> "patches are available for speeding up boot"
>>>>>>>>>>>
>>>>>>>>>>> I have searched around, but couldn't find any patches so far.
>>>>>>>>>>> Does this refer to checkpoint and fast-forwarding? or there are some real
>>>>>>>>>>> patches I haven't found yet. If so, I really appreciate it if someone could
>>>>>>>>>>> point me to the right place to download the patches.
>>>>>>>>>>>
>>>>>>>>>>> Or are there any other techniques I can make the boot faster?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Jun
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> gem5-users mailing list
>>>>>>>>>> gem5-users@gem5.org
>>>>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Joel Hestness
>>>>>>>>> PhD Student, Computer Architecture
>>>>>>>>> Dept. of Computer Science, University of Wisconsin - Madison
>>>>>>>>> http://www.cs.utexas.edu/~hestness

--
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/
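
For readability, the one-liner at the top of this message could also be written as a small shell function. This is only a sketch of the same sequence of steps, not something taken from the gem5 tree: the names M5_BIN, RUNSCRIPT, and checkpoint_then_run are made up for illustration, and the defaults assume the m5 binary lives at /sbin/m5 on the disk image, as in the commands above.

```shell
#!/bin/sh
# Hypothetical names (not part of gem5): M5_BIN, RUNSCRIPT, checkpoint_then_run.
# Defaults mirror the paths used in the one-liner from this thread.
M5_BIN="${M5_BIN:-/sbin/m5}"
RUNSCRIPT="${RUNSCRIPT:-/tmp/runscript}"

checkpoint_then_run() {
    # Take the checkpoint. On restore, the simulated system resumes here.
    "$M5_BIN" checkpoint

    echo "Loading new script..."

    # Ask the simulator for the script file and save it inside the guest.
    "$M5_BIN" readfile > "$RUNSCRIPT"
    chmod 755 "$RUNSCRIPT"

    # -s is true only if the file exists and is not empty.
    if [ -s "$RUNSCRIPT" ]; then
        # Replace the current shell with the script that was read.
        exec "$RUNSCRIPT"
    else
        echo "Script not specified. Dropping back to shell..."
    fi
}
```

After defining the function in the simulated terminal, calling checkpoint_then_run should take the checkpoint. On restore, execution should pick up right after the checkpoint call, so whatever script is given to --script= at restore time is the one that gets read and run.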