Thanks, I have successfully run bwd_activation without error.
---Original---
From: "gem5 users mailing list"<gem5-users@gem5.org>
Date: Sun, Feb 13, 2022, 6:30
To: "gem5 users mailing list"<gem5-users@gem5.org>;
Cc: "1575883782"<1575883...@qq.com>;"Kyle Roarty"<kroa...@wisc.edu>;"Matt
Sinclair"<mattdsinclair.w...@gmail.com>;
Subject: [gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok,
but others are not)
Thanks, this is helpful. Kyle and I went through the error and we haven't
run on a machine with enough memory to run batch size 100 (which is what
bwd_activation assumes by default). However, we have gotten it to run
with up to batch size 50.
We think the failure you were seeing was essentially happening because we
weren't testing bwd_activation in the nightly/weekly regressions, and thus
missed that the file we use to generate the MIOpen cachefiles for the DNNMark
kernels did not have the appropriate kernel for bwd_activation. Kyle
created a patch to fix this problem:
https://gem5-review.googlesource.com/c/public/gem5-resources/+/56789.
You will need to pull this patch and rerun generate_cachefiles before trying to
run again. Moreover, since we only know it works up to batch size 50, you
may consider changing the batch size here:
https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/DNNMark/config_example/activation_config.dnnmark#6,
to something <= 50, since N represents the batch size. Alternatively, if
you need a batch size > 50, you can try running again on the larger machine
you mentioned before, but since we haven't run it on such a large machine yet,
we don't know exactly what will happen.
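As a minimal sketch of the batch-size change described above: this assumes
the config stores the batch size as a plain `n=<integer>` line. The key name
and the key=value layout are assumptions, so check line 6 of your
activation_config.dnnmark first. The helper name set_batch_size is
hypothetical and is not part of DNNMark or gem5.

```python
import re

# Assumption: the batch size appears on its own line as "n=<integer>".
_BATCH_LINE = re.compile(r"^([ \t]*[nN][ \t]*=[ \t]*)\d+[ \t]*$", re.MULTILINE)


def set_batch_size(config_text: str, batch_size: int) -> str:
    """Return config_text with every "n=<integer>" line set to batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Keep the original "n=" prefix and replace only the number.
    return _BATCH_LINE.sub(lambda m: f"{m.group(1)}{batch_size}", config_text)


if __name__ == "__main__":
    # Hypothetical excerpt; the real file may contain other keys.
    sample = "[Activation]\nname=relu1\nn=100\nc=32\n"
    print(set_batch_size(sample, 50))
```

To apply it to the real file, read activation_config.dnnmark, pass its
contents through set_batch_size, and write the result back before rerunning.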
Hope this helps,
Matt
On Fri, Feb 11, 2022 at 12:11 PM 1575883782 via gem5-users
<gem5-users@gem5.org> wrote:
Yes, I am running DNNMark inside docker, and the version is v21-2. I run the
commands through the remote-container plugin of VS Code.
---Original---
From: "Matt Sinclair via gem5-users"<gem5-users@gem5.org>
Date: Sat, Feb 12, 2022 01:41 AM
To: "gem5 users mailing list"<gem5-users@gem5.org>;
Cc: "1575883782"<1575883...@qq.com>;"Kyle Roarty"<kroa...@wisc.edu>;"Matt
Sinclair"<mattdsinclair.w...@gmail.com>;
Subject: [gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok,
but others are not)
One more question for you, original poster: are you running DNNMark inside the
docker resources we provided: http://resources.gem5.org/resources/dnn-mark?
Or are you trying to get this running on your machine directly?
Matt
On Fri, Feb 11, 2022 at 11:37 AM Matt Sinclair
<mattdsinclair.w...@gmail.com> wrote:
Kyle, can you please help with this? I don't recall when we last tested
bwd_act.
Matt
On Fri, Feb 11, 2022 at 2:18 AM 1575883782 via gem5-users
<gem5-users@gem5.org> wrote:
Hi, I was trying to run the DNNMark benchmark with the GCN3 GPU model, following
the instructions at http://resources.gem5.org/resources/dnn-mark. I succeeded in
running fwd_softmax, but when I ran other layers, such as "bwd_activation", I
ran into problems.
I tried to run the gem5 DNNMark bwd_activation benchmark on 2 computers.
The first computer has 32 GB of memory. gem5 could run fwd_softmax successfully,
but was always killed while running bwd_activation. The only error message was
"Killed" plus the process ID. I guess this computer does not have enough memory
to run it.
The second computer has 256 GB of memory. gem5 could run fwd_softmax
successfully, but some problems occurred while running bwd_activation. I solved
some of them, but not all. The error messages are:
> I0909 01:46:50.680040 100 dnn_wrapper.h:341] enter dnnmarkActivationBackward func
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/arch/x86/faults.cc:170: panic: Tried to read unmapped address 0.
> PC: 0x7fffeef84b80, Instr: FMUL2_M : ldfp87 %ufp1, DS:[rdx]
> Memory Usage: 46436124 KBytes
> Program aborted at tick 10680071080500
Sometimes the errors are:
> panic: Tried to write unmapped address 0x2b95d881.
or
> panic: Tried to write unmapped address 0x3.
According to my log, the problem happened in the "dnnmarkActivationBackward"
function:
> LOG(INFO) << "enter dnnmarkActivationBackward func";
> #ifdef AMD_MIOPEN
> MIOPEN_CALL(miopenActivationBackward(
>     mode == COMPOSED ?
>     handle.GetMIOpen(idx) : handle.GetMIOpen(),
>     activation_desc.Get(),
>     alpha,
>     top_desc.Get(), y,
>     top_desc.Get(), dy,
>     bottom_desc.Get(), x,
>     beta,
>     bottom_desc.Get(), dx));
> #endif
> LOG(INFO) << "exit dnnmarkActivationBackward func";
It seems to be a MIOpen interface function. I don't know how to solve it.
Could someone help me?
PS: My gem5 version is v21-2, and the docker image is v21-2. My run command is:
build/GCN3_X86/gem5.opt --outdir=$outdir configs/example/apu_se.py -n 10 \
    --mem-size=8GB --benchmark-root=$BenchmarkRoot/test_bwd_activation \
    -c dnnmark_test_bwd_activation \
    --options="-config "$ConfigRoot"/activation_config.dnnmark -mmap "$MMAPFile" -debuginfo 1"
Both computers have no AMD GPU.
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org