Hi Mahshid,
  Something you could do is to leverage the bash shell to get the same
checkpoint restore functionality as you would with the hack_back_ckpt.rcS
script.  For instance, if you execute the following at the terminal, the
checkpoint that you collect should function similarly to one taken with the
hack_back_ckpt.rcS script:

 % /sbin/m5 checkpoint; echo "Loading new script..."; \
   /sbin/m5 readfile > /tmp/runscript; chmod 755 /tmp/runscript; \
   if [ -s /tmp/runscript ]; then exec /tmp/runscript; \
   else echo "Script not specified. Dropping back to shell..."; fi
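
The one-liner above can also be written out as a shell function, which may be easier to read and adapt. This is only a sketch: the function name checkpoint_then_load and the M5 and RUNSCRIPT override variables are hypothetical names introduced here for illustration, and their defaults are the paths used in the command above.

```shell
#!/bin/sh
# Sketch of the checkpoint-then-load-script pattern from the one-liner.
# M5 and RUNSCRIPT are hypothetical override variables (not part of gem5);
# their defaults are the paths used in the thread.
M5="${M5:-/sbin/m5}"
RUNSCRIPT="${RUNSCRIPT:-/tmp/runscript}"

checkpoint_then_load() {
    # Take the checkpoint. After a restore, the simulated system resumes
    # here and continues with the commands below.
    "$M5" checkpoint
    echo "Loading new script..."

    # Ask the simulator for the file supplied on the host side
    # (for example through --script=) and store it in the guest.
    "$M5" readfile > "$RUNSCRIPT"
    chmod 755 "$RUNSCRIPT"

    if [ -s "$RUNSCRIPT" ]; then
        # Non-empty file: replace the current shell with the new script.
        exec "$RUNSCRIPT"
    else
        echo "Script not specified. Dropping back to shell..."
    fi
}
```

Calling checkpoint_then_load at the simulated terminal should behave like the one-liner. Because the checkpoint is taken before the readfile step, each restore can be given a different script.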

  Also, I suspect the trouble you're seeing with the large-scale CMP
checkpointing may have to do with a recently identified draining bug in the
SimpleMemory.  I'll cc you on that thread.

  Joel



On Wed, May 29, 2013 at 3:07 PM, Mahshid Sedghi <mahshid.sed...@gmail.com> wrote:

> Thanks a lot for your detailed and helpful answer, Joel. Now everything is
> clear to me. I was able to use this script successfully for small systems.
> However, whenever I take a checkpoint for a 64-core system using this script,
> the restoration fails because draining cannot complete. This is strange,
> since I can successfully restore from a checkpoint (for a similar system)
> that was taken by attaching a terminal. These two approaches are
> essentially doing the same thing, but for some reason the outcome is different.
> I have tried many times, and it is never able to restore. I will have to
> debug it further to figure out what the problem is.
>
> Thanks Joel,
> Mahshid
>
>
> On Wed, May 29, 2013 at 2:31 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>
>> Hi Mahshid,
>>   The hack_back_ckpt.rcS script was specifically designed to give the
>> functionality of being able to take a checkpoint after Linux boot and then
>> restore with a different script.  It should work with any ISA and either
>> the classic or Ruby memory models.  In order to get "hacky" functionality,
>> you need to take a checkpoint using this script.  If you'd like more detail
>> on how this works, I'd recommend that you read the comments in
>> hack_back_ckpt.rcS.
>>
>>   Regardless of whether you are using hack_back_ckpt.rcS, when you take a
>> checkpoint and restore from it, the simulated system returns exactly to
>> whatever it was doing when you took the checkpoint.  Hence, if you take a
>> checkpoint by attaching a terminal and calling the m5 binary to take a
>> checkpoint, when you restore from that checkpoint, the system will finish
>> running the m5 binary and drop back to the terminal, as you would expect if
>> it hadn't taken a checkpoint.  If you annotate a benchmark to take a
>> checkpoint during simulation, when you restore from the checkpoint, the
>> benchmark will continue executing where it left off to take the checkpoint.
>>
>>   To your specific question:
>>
>>> I mean if I take a checkpoint without hack_back_ckpt.rcS
>>> <http://grok.gem5.org/xref/gem5/configs/boot/hack_back_ckpt.rcS>,
>>> then during restoration "--script" does not work, and the simulated system
>>> boots to shell and won't run the benchmark. Although in ruby_fs the script is
>>> passed to system.readfile, it is not running the script. Can you tell me
>>> how to get this to work?
>>>
>>
>>   You're right that the "--script" command "does not work."  This is
>> because the simulated system must be prompted to read the script that you
>> specify before executing it.  Again, you can see this in the
>> hack_back_ckpt.rcS.  Specifically, the following terminal commands use some
>> simulator magic to read the script, make it executable, and then run it:
>>
>> /sbin/m5 readfile > /tmp/runscript
>> chmod 755 /tmp/runscript
>> exec /tmp/runscript
>>
>>   Hope this helps,
>>   Joel
>>
>>
>>
>>
>>> On Sat, May 25, 2013 at 4:04 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>>>
>>>> Hi Mahshid,
>>>>   You can simply specify the script that you would like to run to the
>>>> --script= parameter of the fs.py or ruby_fs.py config file when you restore
>>>> from the checkpoint.
>>>>
>>>>   Joel
>>>>
>>>>
>>>> On Thu, May 23, 2013 at 8:26 PM, Mahshid Sedghi <
>>>> mahshid.sed...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to restore from a checkpoint and run my benchmark
>>>>> immediately without entering the commands in the simulated system
>>>>> myself. I came across this thread and realized that hack_back_ckpt.rcS
>>>>> <http://grok.gem5.org/xref/gem5/configs/boot/hack_back_ckpt.rcS>
>>>>> is supposedly doing what I need. I was able to take checkpoints automatically
>>>>> using this script, but I'm not sure how to direct it to run my benchmark
>>>>> afterwards. Can anybody give me a clue on that?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> On Sat, Sep 29, 2012 at 2:24 PM, Jun Pang <pang...@cs.duke.edu> wrote:
>>>>>
>>>>>> Hi Joel,
>>>>>>
>>>>>> Thank you very much. I really appreciate your help. I need to think
>>>>>> about this.
>>>>>>
>>>>>> Best,
>>>>>> Jun
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 28, 2012 at 8:22 PM, Joel Hestness <jthestn...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Jun,
>>>>>>>
>>>>>>>> 1. Recompiled kernel x86_64-vmlinux-2.6.28.4.smp with SMP enabled
>>>>>>>> and CONFIG_NR_CPUS=512.
>>>>>>>>
>>>>>>>
>>>>>>> Ok.  If my memory serves me correctly, you will also need to change
>>>>>>> the CONFIG_NR_CPUS range in the kconfigs for kernel 2.6.28.4.  I don't
>>>>>>> recall the details on where to find these files, but the kconfigs
>>>>>>> specify which variables a config file can contain and the appropriate
>>>>>>> ranges within which the variables can be set.  I believe the upper
>>>>>>> limit in the kconfigs for 2.6.28.4 was CONFIG_NR_CPUS=32.  If this is
>>>>>>> the case, the system may look like it is still booting more than 32
>>>>>>> cores, but Linux will only show 32 of them as available after you've
>>>>>>> booted the system.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> 2. Made a checkpoint right before "Loading script..." and I guess
>>>>>>>> this is similar to what /configs/boot/hack_back_ckpt.rcS does. I
>>>>>>>> modified the file system file etc/init.d/rcS to create the checkpoint.
>>>>>>>>
>>>>>>>
>>>>>>> Cool, that works.  The /configs/boot/hack_back_ckpt.rcS script will
>>>>>>> also work with any disk image and any ISA for which gem5 has
>>>>>>> checkpointing capability.  In the future, it might be easier to use
>>>>>>> than modifying the disk image to collect checkpoints.
>>>>>>>
>>>>>>>
>>>>>>>> 3. The interrupt devices do give me problems.
>>>>>>>>     - Local and IO APIC have only 8 bits, which means they will
>>>>>>>> support up to 256 cores, but the IO_APIC.id is set to be the numCPUs.
>>>>>>>> This means I cannot really use 256, because it is beyond the 8-bit
>>>>>>>> range. I tried to make IO_APIC.id = numCPU-1 and boot 256 cores, but
>>>>>>>> then I got the error: "panic: Legacy mode interrupts with error codes
>>>>>>>> aren't implementde" when the system boots the 256th core. However, it
>>>>>>>> works correctly with 1 or 2 cores.
>>>>>>>>
>>>>>>>
>>>>>>> You're correct that the IO APIC reserves a bit in the ID mask so
>>>>>>> that only 7 bits are available (i.e. up to 128 cores).  I'm not sure
>>>>>>> what it would take to remove this restriction, though I believe we
>>>>>>> investigated using the extra bit and decided it was going to be
>>>>>>> difficult to make it work.  If you really need to use 256 cores, I
>>>>>>> would recommend starting by reading up on the Intel i8042, i82094aa,
>>>>>>> i8237, i8254, and i8259 devices, and also trying to identify the new
>>>>>>> devices that have replaced these (which may be capable of booting more
>>>>>>> than 128 cores in a single system).
>>>>>>>
>>>>>>>
>>>>>>>>     - My ideal GPU clock rate is 5GHz, but I get an error with
>>>>>>>> that frequency: "MP-BIOS bug: 8254 timer not connected to IO-APIC".
>>>>>>>> I have searched around and it seems to be a known issue with some
>>>>>>>> kernels. The easiest way to fix it is to disable the APIC, which is
>>>>>>>> not allowed in SMP I guess. So I reduced the frequency to 4GHz....
>>>>>>>>
>>>>>>>
>>>>>>> I haven't run into this problem before, so I'm afraid I can't help
>>>>>>> here.
>>>>>>>
>>>>>>>
>>>>>>>> Please let me know if what I have done so far makes sense or not.
>>>>>>>> Any suggestion will be greatly appreciated. I am still reading the
>>>>>>>> code and trying to understand the APIC part better.
>>>>>>>>
>>>>>>>
>>>>>>> Sounds good so far.  Let me know if there's anything else I can help
>>>>>>> with,
>>>>>>>
>>>>>>>   Joel
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Thu, Sep 27, 2012 at 3:14 PM, Joel Hestness <
>>>>>>>> jthestn...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Jun,
>>>>>>>>>   Can you let us know which kernel version you're working with or
>>>>>>>>> where you got the kernel that you're trying to boot?
>>>>>>>>>
>>>>>>>>>   There are instructions here (http://gem5.org/Linux_kernel) for
>>>>>>>>> building Linux for x86 and gem5.  Currently, there aren't any known
>>>>>>>>> ways to speed up Linux boot, since the kernel waits for interrupts to
>>>>>>>>> calibrate delay loops already.
>>>>>>>>>
>>>>>>>>>   Note also that there are probably some limitations to building
>>>>>>>>> kernels to support more than 64 or 256 cores.  Sometimes the kernel
>>>>>>>>> kconfigs set default ranges for things like core count that you cannot
>>>>>>>>> tweak just by changing the config files.  Further, the interrupt
>>>>>>>>> devices currently in gem5 (./src/dev/x86/*) probably have a core count
>>>>>>>>> limitation, since they are implemented to function the same as devices
>>>>>>>>> used in current real hardware.  You might want to check into these,
>>>>>>>>> because if you're running into a core count limitation, something may
>>>>>>>>> be hung up in the system you're booting with 255 cores.
>>>>>>>>>
>>>>>>>>>   Overall, 3-5 hours to boot 64 cores doesn't sound unreasonable
>>>>>>>>> with the atomic or timing CPUs.  The out-of-order core will take
>>>>>>>>> multiple times longer to boot Linux because of the added core detail,
>>>>>>>>> so we do not suggest that you try this; instead, take checkpoints
>>>>>>>>> after Linux boot and restore from the checkpoint into the benchmarks
>>>>>>>>> you'd like to run.  (Check out the runscript
>>>>>>>>> ./configs/boot/hack_back_ckpt.rcS, which makes this checkpointing
>>>>>>>>> easy.)
>>>>>>>>>
>>>>>>>>>   Hope this helps,
>>>>>>>>>   Joel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Sep 27, 2012 at 12:42 PM, Jun Pang <pang...@cs.duke.edu> wrote:
>>>>>>>>>
>>>>>>>>>> It's been over one day already, and my 255 cores system is still
>>>>>>>>>> booting. Does anyone know the answer to speed up the booting?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Jun
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 25, 2012 at 11:01 PM, Jun Pang <pang...@cs.duke.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi guys,
>>>>>>>>>>>
>>>>>>>>>>> I want to boot more than 256 cores in X86 FS with ruby, however,
>>>>>>>>>>> it's slow. More than 3 hours for ~60 cores so far without ruby...
>>>>>>>>>>>
>>>>>>>>>>> I read this from the X86 section of
>>>>>>>>>>> http://gem5.org/Architecture_Support :
>>>>>>>>>>>
>>>>>>>>>>> "patches are available for speeding up boot"
>>>>>>>>>>>
>>>>>>>>>>> I have searched around, but couldn't find any patches so far.
>>>>>>>>>>> Does this refer to checkpointing and fast-forwarding, or are there
>>>>>>>>>>> some real patches I haven't found yet? If so, I would really
>>>>>>>>>>> appreciate it if someone could point me to the right place to
>>>>>>>>>>> download the patches.
>>>>>>>>>>>
>>>>>>>>>>> Or are there any other techniques I can use to make the boot faster?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Jun
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> gem5-users mailing list
>>>>>>>>>> gem5-users@gem5.org
>>>>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>   Joel Hestness
>>>>>>>>>   PhD Student, Computer Architecture
>>>>>>>>>   Dept. of Computer Science, University of Wisconsin - Madison
>>>>>>>>>   http://www.cs.utexas.edu/~hestness
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>   Joel Hestness
>>>>   PhD Student, Computer Architecture
>>>>   Dept. of Computer Science, University of Wisconsin - Madison
>>>>   http://pages.cs.wisc.edu/~hestness/
>>>>
>>
>>
>
>



-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
