Hi Sam,

Sorry for the frustration. Writing better documentation is always #2 on the
priority list :(.

I always tell people not to trust any of the "options" to fs.py and se.py.
Those scripts have gotten so far beyond "out of hand" at this point that
they are almost useless. They are trying to be everything to everyone, and
they end up just being a mess of spaghetti code and confusion.

To take a checkpoint, you can add the following code to a python runscript:

m5.simulate(10000)
m5.checkpoint(<name of directory>)
m5.simulate(20000)
m5.checkpoint(<name of directory>)

I tested the above code by adding it to the
configs/learning_gem5/part1/two_level.py file.
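
One thing to watch for: as far as I can tell, m5.simulate() takes a number
of ticks to run from the current tick, not an absolute tick, so the snippet
above would checkpoint at ticks 10000 and 30000. Below is a minimal sketch of
a helper that takes absolute ticks instead. The name take_checkpoints_at, the
`sim` parameter (pass the m5 module itself), and the cpt.<tick> directory
naming are my own choices for illustration, not something gem5 provides.

```python
import os


def take_checkpoints_at(sim, absolute_ticks, outdir):
    """Checkpoint at each absolute tick and return the directories written.

    `sim` is expected to be the m5 module (or a stand-in that offers the
    same simulate/checkpoint/curTick functions).
    """
    written = []
    for tick in sorted(absolute_ticks):
        remaining = tick - sim.curTick()
        if remaining <= 0:
            continue  # already at or past this tick; skip it
        sim.simulate(remaining)
        if sim.curTick() != tick:
            break  # simulation exited early for some other reason; stop
        ckpt_dir = os.path.join(outdir, "cpt.%d" % tick)
        sim.checkpoint(ckpt_dir)
        written.append(ckpt_dir)
    return written
```

In a runscript this would be called after m5.instantiate(), for example as
take_checkpoints_at(m5, [10000, 20000], m5.options.outdir).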

*Maybe* this is what --take-checkpoints is doing. It's certainly what it
was *supposed* to do, but again, since this code has gotten so out of hand,
who knows if it's actually doing what it advertises.

If you want to use the m5ops to checkpoint, the code would look
something like the following (this isn't tested and it's off the top of my
head).

num = 0
while True:
  exit_event = m5.simulate()
  if exit_event.getCause() == 'checkpoint':
    # Write each checkpoint to its own numbered directory.
    m5.checkpoint(m5.options.outdir + '/' + str(num))
    num += 1
  else:
    break

To restore from a checkpoint, pass the checkpoint directory as the ckpt_dir
parameter: m5.instantiate(ckpt_dir=<checkpoint directory>).
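
If it helps, here is a rough sketch of a wrapper that checks the directory
before restoring. It assumes a checkpoint directory contains a file named
m5.cpt (that is what I'd expect m5.checkpoint() to write, but check your own
output directory). The names checkpoint_looks_valid and instantiate_from are
made up for this example, and `sim` is meant to be the m5 module.

```python
import os


def checkpoint_looks_valid(ckpt_dir):
    """Return True if ckpt_dir contains an m5.cpt file (assumed file name)."""
    return os.path.isfile(os.path.join(ckpt_dir, "m5.cpt"))


def instantiate_from(sim, ckpt_dir=None):
    """Call sim.instantiate(), restoring from ckpt_dir when one is given.

    Returns True if a checkpoint was restored and False for a fresh start.
    """
    if ckpt_dir is None:
        sim.instantiate()
        return False
    if not checkpoint_looks_valid(ckpt_dir):
        raise FileNotFoundError("no m5.cpt found in %s" % ckpt_dir)
    sim.instantiate(ckpt_dir=ckpt_dir)
    return True
```

Failing early like this makes a mistyped path obvious before the simulation
starts.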

Hope this helps! If you're still experiencing a hang in this case, it's
probably a bug in the drain() code somewhere. You can try to use one of the
drain debug flags (I don't know exactly what these are... check gem5
--debug-help for a list of debug flags). Making the python runscript do
exactly what you expect will also help with debugging. When you control the
script, adding prints is easy, too!
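
As an example of the kind of print that helps, here is a small sketch that
reports why and when each m5.simulate() call returned. The function name
simulate_and_report is hypothetical, and `sim` stands in for the m5 module;
it only relies on simulate(), curTick(), and the exit event's getCause().

```python
def simulate_and_report(sim, ticks=None):
    """Run the simulation, print the exit cause and tick, return the cause."""
    if ticks is None:
        exit_event = sim.simulate()  # run until something stops the simulation
    else:
        exit_event = sim.simulate(ticks)  # run for at most `ticks` more ticks
    cause = exit_event.getCause()
    print("Exited @ tick %d because %s" % (sim.curTick(), cause))
    return cause
```

If the output stops after one of these prints, you at least know which
simulate() or checkpoint() call the hang follows.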

Finally, the file src/python/m5/simulate.py may be helpful to figure out
what's going on when instantiating, simulating, checkpointing, etc.

Cheers,
Jason

On Wed, Sep 8, 2021 at 6:14 AM Thomas, Samuel via gem5-users <
gem5-users@gem5.org> wrote:

> Hi all,
>
> Just to follow up, because I can see that there have been some issues with
> not including all of the requisite information in other threads, here is
> the full output from what I described above.
>
> gem5 Simulator System.  http://gem5.org
> gem5 is copyrighted software; use the --copyright option for details.
>
> gem5 version 21.1.0.0
> gem5 compiled Sep  7 2021 19:28:16
> gem5 started Sep  8 2021 09:09:11
> gem5 executing on sam-Precision-Tower-5810, pid 445665
> command line: build/X86/gem5.opt -d $CURR_DIR/debug
> $CURR_DIR/configs/example/fs.py --caches --l2cache --mem-type DDR3_1600_8x8
> --mem-size 2GB --meta-size 512kB --num-cpus 4 --disk-image $DISK_PATH
> --kernel $KERNEL_PATH --cpu-type $CPU_TYPE --script=$SCRIPT_PATH
> --l2_size=1MB --take-checkpoints=10000,20000
>
> warn: iobus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: bridge.master is deprecated. `master` is now called `mem_side_port`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: bridge.slave is deprecated. `slave` is now called `cpu_side_port`
> warn: iobus.master is deprecated. `master` is now called `mem_side_ports`
> warn: apicbridge.slave is deprecated. `slave` is now called `cpu_side_port`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: apicbridge.master is deprecated. `master` is now called
> `mem_side_port`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: iobus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
> Global frequency set at 1000000000000 ticks per second
> warn: system.workload.acpi_description_table_pointer.rsdt adopting orphan
> SimObject param 'entries'
> [Detaching after fork from child process 445724]
> [Detaching after fork from child process 445725]
> build/X86/mem/mem_interface.cc:792: warn: DRAM device capacity (8192
> Mbytes) does not match the address range assigned (2048 Mbytes)
> build/X86/sim/kernel_workload.cc:46: info: kernel located at:
> /home/sam/Desktop/clean-gem5/gem5/_dist/binaries/x86_64-vmlinux-2.6.22.9
>       0: system.pc.south_bridge.cmos.rtc: Real-time clock set to Sun Jan
>  1 00:00:00 2012
> system.pc.com_1.device: Listening for connections on port 3464
> 0: system.remote_gdb: listening for remote gdb on port 7008
> build/X86/dev/intel_8254_timer.cc:125: warn: Reading current count from
> inactive timer.
> **** REAL SIMULATION ****
> build/X86/sim/simulate.cc:107: info: Entering event queue @ 0.  Starting
> simulation...
> Exiting @ tick 10000 because simulate() limit reached
> build/X86/sim/simulate.cc:107: info: Entering event queue @ 10000.
> Starting simulation...
>
>
> At this point, the program hangs, and occupies the ports until I manually
> reset it even after killing the terminal process. Does this sound like
> something anyone has seen before or can replicate? I feel like I'm going
> crazy, and am not even sure how to debug this...
>
> Best,
> Sam
>
> On Tue, Sep 7, 2021 at 9:56 AM Samuel Thomas <samuel_tho...@brown.edu>
> wrote:
>
>> Hi all,
>>
>> This is a very basic and perhaps silly question. I’m trying to take
>> checkpoints in a gem5 program so that I can debug a particular segment of
>> the execution more efficiently, but the flag seems to pause the
>> execution of the environment and not actually take any checkpoints.
>>
>> I’m currently working from commit
>> 87c121fd954ea5a6e6b0760d693a2e744c2200de (i.e., v21.1.0.0)
>>
>> And am running the following command line:
>>
>> build/X86/gem5.opt -d $CURR_DIR/debug $CURR_DIR/configs/example/fs.py
>> --caches --l2cache --mem-type DDR3_1600_8x8 --mem-size 2GB --meta-size
>> 512kB --num-cpus 4 --disk-image $DISK_PATH --kernel $KERNEL_PATH --cpu-type
>> $CPU_TYPE --script=$SCRIPT_PATH --l2_size=1MB --take-checkpoints=10000,20000
>>
>> I assumed that --take-checkpoints was the proper way to do this, but it
>> seems that the execution pauses at this point and no checkpoint files are
>> produced in the out directory. Is there something that I’m doing wrong or a
>> better way to go about doing this?
>>
>> Thanks for your help!
>>
>> Best,
>> Sam
>
> _______________________________________________
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org