Hi Jason,

Thanks for your help. I think I've homed in on the source of the problem --
namely, the number of CPUs. Is there a reason why having multiple CPUs in a
particular configuration would limit the simulator's ability to write a
checkpoint?

Again, thank you for your help!

Best,
Sam

On Wed, Sep 8, 2021 at 11:12 AM Jason Lowe-Power <ja...@lowepower.com>
wrote:

> Hi Sam,
>
> Sorry for the frustration. Writing better documentation is always #2 on
> the priority list :(.
>
> I always tell people not to trust any of the "options" to fs.py and se.py.
> Those scripts have gotten so far beyond "out of hand" at this point that
> they are almost useless. They are trying to be everything to everyone, and
> they end up just being a mess of spaghetti code and confusion.
>
> To take a checkpoint, you can add the following code to a python runscript:
>
> m5.simulate(10000)
> m5.checkpoint(<name of directory>)
> m5.simulate(20000)
> m5.checkpoint(<name of directory>)
>
> I tested the above code by adding it to the
> configs/learning_gem5/part1/two_level.py file.
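
The simulate-then-checkpoint pattern above can be wrapped in a small helper. This is only a sketch: `checkpoint_after_each`, `RecordingSim`, and the `cpt.<n>` directory names are hypothetical, and `RecordingSim` is a stand-in for the `m5` module so the code runs outside gem5. It assumes that the argument to `simulate()` is a tick count relative to the current tick, in which case the two calls above would checkpoint at ticks 10000 and 30000.

```python
import os


class RecordingSim:
    """Hypothetical stand-in for the m5 module so the sketch runs outside gem5."""

    def __init__(self):
        self.tick = 0
        self.checkpoints = []

    def simulate(self, ticks):
        # Assumption: the argument is relative to the current tick.
        self.tick += ticks

    def checkpoint(self, directory):
        # Record the tick and target directory instead of writing files.
        self.checkpoints.append((self.tick, directory))


def checkpoint_after_each(sim, intervals, base_dir):
    """Simulate each interval in turn and checkpoint after every one."""
    written = []
    for index, ticks in enumerate(intervals):
        sim.simulate(ticks)
        ckpt_dir = os.path.join(base_dir, "cpt.%d" % index)
        sim.checkpoint(ckpt_dir)
        written.append(ckpt_dir)
    return written


if __name__ == "__main__":
    sim = RecordingSim()
    print(checkpoint_after_each(sim, [10000, 20000], "m5out"))
    print(sim.checkpoints)
```

Inside a real runscript, the `m5` module itself would presumably be passed as `sim`, since the helper only relies on the `simulate()` and `checkpoint()` calls shown above.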
>
> *Maybe* this is what --take-checkpoints is doing. It's certainly what it
> was *supposed* to do, but again, since this code has gotten so out of hand,
> who knows if it's actually doing what it advertises.
>
> If you want to use the m5ops to checkpoint, the code would look
> something like the following (this isn't tested and it's off the top of my
> head).
>
> num = 0  # index used to name each checkpoint directory
> while True:
>   exit_event = m5.simulate()
>   if exit_event.getCause() == 'checkpoint':
>     m5.checkpoint(m5.outdir + '/' + str(num))
>     num += 1
>   else:
>     break
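
A slightly more defensive variant of the loop above, as a sketch only. `run_until_done`, `ScriptedSim`, and the `max_checkpoints` guard are hypothetical names introduced for illustration; `ScriptedSim` replays a fixed list of exit causes so the loop can be exercised without gem5. Printing each exit cause is meant to show where a run stops if it appears to hang.

```python
import os


class _ExitEvent:
    """Minimal exit-event stand-in exposing getCause()."""

    def __init__(self, cause):
        self._cause = cause

    def getCause(self):
        return self._cause


class ScriptedSim:
    """Hypothetical stand-in for m5 that replays a fixed list of exit causes."""

    def __init__(self, causes):
        self._causes = list(causes)
        self.checkpoints = []

    def simulate(self):
        return _ExitEvent(self._causes.pop(0))

    def checkpoint(self, directory):
        self.checkpoints.append(directory)


def run_until_done(sim, out_dir, max_checkpoints=100):
    """Checkpoint on every 'checkpoint' exit; stop on any other cause."""
    count = 0
    while count < max_checkpoints:
        cause = sim.simulate().getCause()
        print("simulate() returned: %s" % cause)
        if cause != "checkpoint":
            return count, cause
        sim.checkpoint(os.path.join(out_dir, "cpt.%d" % count))
        count += 1
    return count, "checkpoint limit reached"


if __name__ == "__main__":
    demo = ScriptedSim(["checkpoint", "checkpoint", "done"])
    print(run_until_done(demo, "m5out"))
```

The cap on the number of checkpoints is a design choice for the sketch: it keeps a guest that requests checkpoints repeatedly from filling the disk.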
>
> To restore from a checkpoint, pass the checkpoint directory as the only
> parameter to m5.instantiate(ckpt_dir=<checkpoint directory>).
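
When several checkpoints have been written, the directory to restore from has to be chosen somehow. The sketch below picks the highest-numbered one. It assumes a hypothetical `cpt.<n>` naming scheme (not something the scripts are guaranteed to use), and `latest_checkpoint` is an illustrative helper, not part of gem5.

```python
import os
import re

# Hypothetical naming scheme: checkpoint directories are called cpt.<number>.
CKPT_PATTERN = re.compile(r"^cpt\.(\d+)$")


def latest_checkpoint(out_dir):
    """Return the highest-numbered cpt.<n> directory in out_dir, or None."""
    best_num, best_path = -1, None
    for name in os.listdir(out_dir):
        match = CKPT_PATTERN.match(name)
        path = os.path.join(out_dir, name)
        # Skip non-matching names and plain files that happen to match.
        if match and os.path.isdir(path) and int(match.group(1)) > best_num:
            best_num, best_path = int(match.group(1)), path
    return best_path
```

The returned path would then be the value passed as `ckpt_dir` to `m5.instantiate()`, as described above.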
>
> Hope this helps! If you're still experiencing a hang in this case, it's
> probably a bug in the drain() code somewhere. You can try to use one of the
> drain debug flags (I don't know exactly what these are... check gem5
> --debug-help for a list of debug flags). Making the python runscript do
> exactly what you expect will also help with debugging. When you control the
> script, adding prints is easy, too!
>
> Finally, the file src/python/m5/simulate.py may be helpful to figure out
> what's going on when instantiating, simulating, checkpointing, etc.
>
> Cheers,
> Jason
>
> On Wed, Sep 8, 2021 at 6:14 AM Thomas, Samuel via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Hi all,
>>
>> Just to follow up, because I can see that other threads have had issues
>> with posters not including all of the requisite information, here is
>> the full output from what I described above.
>>
>> gem5 Simulator System.  http://gem5.org
>> gem5 is copyrighted software; use the --copyright option for details.
>>
>> gem5 version 21.1.0.0
>> gem5 compiled Sep  7 2021 19:28:16
>> gem5 started Sep  8 2021 09:09:11
>> gem5 executing on sam-Precision-Tower-5810, pid 445665
>> command line: build/X86/gem5.opt -d $CURR_DIR/debug
>> $CURR_DIR/configs/example/fs.py --caches --l2cache --mem-type DDR3_1600_8x8
>> --mem-size 2GB --meta-size 512kB --num-cpus 4 --disk-image $DISK_PATH
>> --kernel $KERNEL_PATH --cpu-type $CPU_TYPE --script=$SCRIPT_PATH
>> --l2_size=1MB --take-checkpoints=10000,20000
>>
>> warn: iobus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: bridge.master is deprecated. `master` is now called `mem_side_port`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: bridge.slave is deprecated. `slave` is now called `cpu_side_port`
>> warn: iobus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: apicbridge.slave is deprecated. `slave` is now called
>> `cpu_side_port`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: apicbridge.master is deprecated. `master` is now called
>> `mem_side_port`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: iobus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.master is deprecated. `master` is now called
>> `mem_side_ports`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: tol2bus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.master is deprecated. `master` is now called `mem_side_ports`
>> warn: membus.slave is deprecated. `slave` is now called `cpu_side_ports`
>> Global frequency set at 1000000000000 ticks per second
>> warn: system.workload.acpi_description_table_pointer.rsdt adopting orphan
>> SimObject param 'entries'
>> [Detaching after fork from child process 445724]
>> [Detaching after fork from child process 445725]
>> build/X86/mem/mem_interface.cc:792: warn: DRAM device capacity (8192
>> Mbytes) does not match the address range assigned (2048 Mbytes)
>> build/X86/sim/kernel_workload.cc:46: info: kernel located at:
>> /home/sam/Desktop/clean-gem5/gem5/_dist/binaries/x86_64-vmlinux-2.6.22.9
>>       0: system.pc.south_bridge.cmos.rtc: Real-time clock set to Sun Jan
>>  1 00:00:00 2012
>> system.pc.com_1.device: Listening for connections on port 3464
>> 0: system.remote_gdb: listening for remote gdb on port 7008
>> build/X86/dev/intel_8254_timer.cc:125: warn: Reading current count from
>> inactive timer.
>> **** REAL SIMULATION ****
>> build/X86/sim/simulate.cc:107: info: Entering event queue @ 0.  Starting
>> simulation...
>> Exiting @ tick 10000 because simulate() limit reached
>> build/X86/sim/simulate.cc:107: info: Entering event queue @ 10000.
>> Starting simulation...
>>
>>
>> At this point, the program hangs, and occupies the ports until I manually
>> reset it even after killing the terminal process. Does this sound like
>> something anyone has seen before or can replicate? I feel like I'm going
>> crazy, and am not even sure how to debug this...
>>
>> Best,
>> Sam
>>
>> On Tue, Sep 7, 2021 at 9:56 AM Samuel Thomas <samuel_tho...@brown.edu>
>> wrote:
>>
>>> Hi all,
>>>
>>> This is a very basic and perhaps silly question. I’m trying to take
>>> checkpoints in a gem5 program so that I can debug a particular segment of
>>> the execution more efficiently, but the flag seems to pause
>>> the execution of the environment and not actually take any checkpoints.
>>>
>>> I’m currently working from commit
>>> 87c121fd954ea5a6e6b0760d693a2e744c2200de (i.e., v21.1.0.0)
>>>
>>> And am running the following command line:
>>>
>>> build/X86/gem5.opt -d $CURR_DIR/debug $CURR_DIR/configs/example/fs.py
>>> --caches --l2cache --mem-type DDR3_1600_8x8 --mem-size 2GB --meta-size
>>> 512kB --num-cpus 4 --disk-image $DISK_PATH --kernel $KERNEL_PATH --cpu-type
>>> $CPU_TYPE --script=$SCRIPT_PATH --l2_size=1MB --take-checkpoints=10000,20000
>>>
>>> I assumed that --take-checkpoints was the proper way to do this, but it
>>> seems that the execution pauses at this point and no checkpoint files are
>>> produced in the out directory. Is there something that I’m doing wrong or a
>>> better way to go about doing this?
>>>
>>> Thanks for your help!
>>>
>>> Best,
>>> Sam
>>
>> _______________________________________________
>> gem5-users mailing list -- gem5-users@gem5.org
>> To unsubscribe send an email to gem5-users-le...@gem5.org
>> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
>
>