Thanks. I will try `--reg-alloc-policy=dynamic` (I didn't specify a policy,
so I was using the default one). I will also read through the trace.
As for the branch, I am using stable. The commit is:
```
commit 39f85b7a3be1ee0ff6e375c9791dd62d23eb8a3e (HEAD -> stable, tag: v22.0.0.1, origin/stable, origin/master, origin/HEAD)
Author: Bobby R. Bruce <bbr...@ucdavis.edu>
Date:   Sat Jun 18 04:59:02 2022 -0700

    misc: Update version info to v22.0.0.1
```
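A minimal sketch of how that rerun could be assembled, assuming the 16-CU command line quoted below is the starting point. `--debug-flags=ProtocolTrace`, `--debug-file`, and `--debug-start` are gem5 options that are assumed to apply here and have not been verified on this setup; the helper name `build_rerun_command`, the trace file name, and the start tick are illustrative only. The start tick is picked slightly before the oldest stuck request in the dump (issue time 1732620214000) and will likely shift once the register allocation policy changes.

```python
import shlex


def build_rerun_command(benchmark_root, graph_path, num_cus=16,
                        debug_start_tick=None,
                        trace_file="protocol_trace.out"):
    """Return the argv list for a PageRank rerun with a protocol trace.

    Hypothetical helper: it only assembles strings and does not launch gem5.
    """
    cmd = ["build/GCN3_X86/gem5.opt"]
    # gem5's own options go before the config script.
    cmd.append("--debug-flags=ProtocolTrace")
    cmd.append(f"--debug-file={trace_file}")
    if debug_start_tick is not None:
        # Limit the trace size by starting output close to the hang.
        cmd.append(f"--debug-start={debug_start_tick}")
    # Options after the script are taken from the 16-CU run in this thread,
    # plus the register allocation policy suggested in the reply.
    cmd += [
        "configs/example/apu_se.py",
        "-n", "3",
        "--num-compute-units", str(num_cus),
        "--reg-alloc-policy=dynamic",
        "--mem-size=8GB",
        f"--benchmark-root={benchmark_root}",
        "-c", "pagerank/bin/pagerank_spmv",
        f"--options={graph_path} 1",
    ]
    return cmd


if __name__ == "__main__":
    root = "/home/ubuntu/lmy/gem5-gcn3/gem5-resources/src/gpu/pannotia"
    graph = root + "/pagerank/coAuthorsDBLP.graph"
    # The tick below is an assumption, chosen just before issue time
    # 1732620214000 reported in the deadlock dump.
    print(shlex.join(build_rerun_command(root, graph,
                                         debug_start_tick=1732600000000)))
```

Printing the command instead of running it makes it easy to check the quoting of the `--options` argument before starting a long simulation.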
------------------ Original ------------------
From: "The gem5 Users mailing list" <gem5-users@gem5.org>;
Date: Sun, Nov 6, 2022 02:55 AM
To: "The gem5 Users mailing list" <gem5-users@gem5.org>;
Cc: "1575883782" <1575883...@qq.com>; "Matt Sinclair" <sincl...@cs.wisc.edu>;
Subject: [gem5-users] Re: Gem5 GCN3 (GPUCoalescer detected deadlock when
running pagerank.)

Hi,

Ultimately, this message is telling you that there is a deadlock in the cache
coherence protocol when running PageRank with the configuration you specified.
To fix it, you would need to get a trace
(https://www.gem5.org/documentation/learning_gem5/part3/MSIdebugging/) and
look through it to see what the problem is. If you do this and find a fix,
we definitely welcome any patches that help with this!

Having said that, I've been trying to replicate your problem. However,
the input size you are running means that gem5 will be running for a while, so
it will take a while before I can say something more definitive. We do
test PageRank as part of the weekly tests, but not specifically for 16
CUs. What branch (stable vs. develop) are you using? Also, I
recommend using `--reg-alloc-policy=dynamic`, as this is a more realistic
register allocation policy than the simple one (I can't tell whether you are
using it or not). In the meantime, if you can answer the above questions,
that may help us debug.

Thanks,
Matt
From: 1575883782 via gem5-users <gem5-users@gem5.org>
Sent: Saturday, November 5, 2022 3:58 AM
To: gem5-users <gem5-users@gem5.org>
Cc: 1575883782 <1575883...@qq.com>
Subject: [gem5-users] Gem5 GCN3 (GPUCoalescer detected deadlock when running
pagerank.)

Hi,

I was trying to run the PageRank benchmark with the GCN3 GPU model. I
succeeded in running PageRank with 4 CUs, but when I run it with 16 CUs, I
hit a problem. The key error message is
"build/GCN3_X86/mem/ruby/system/GPUCoalescer.cc:292: warn: GPUCoalescer 10
Possible deadlock detected!" Am I missing something? I don't know how to
solve it. Could someone help me?

4 CUs command line (the default CU number is 4):
```
command line: build/GCN3_X86/gem5.opt -n 3 --mem-size=8GB
--benchmark-root=/home/ubuntu/lmy/gem5-gcn3/gem5-resources/src/gpu/pannotia -c
pagerank/bin/pagerank_spmv
'--options=/home/ubuntu/lmy/gem5-gcn3/gem5-resources/src/gpu/pannotia/pagerank/coAuthorsDBLP.graph
1'
```
16 CUs command line:
```
command line: build/GCN3_X86/gem5.opt
configs/example/apu_se.py -n 3 --num-compute-units 16 --mem-size=8GB
--benchmark-root=/home/ubuntu/lmy/gem5-gcn3/gem5-resources/src/gpu/pannotia -c
pagerank/bin/pagerank_spmv
'--options=/home/ubuntu/lmy/gem5-resources/src/gpu/pannotia/pagerank/coAuthorsDBLP.graph
1'
```
gem5 version:
```
gem5 version 22.0.0.1
gem5 compiled Jun 29 2022 10:34:02
gem5 started Nov 3 2022 14:32:39
gem5 executing on 1bcbbec61aaf, pid 1287240
```
Error message:
```
build/GCN3_X86/mem/ruby/system/GPUCoalescer.cc:292: warn: GPUCoalescer 10 Possible deadlock detected!
Printing out 763 outstanding requests in the coalesced table
Addr: [0x3b8b1c0, line 0x3b8b1c0]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 2
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b300, line 0x3b8b300]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 3
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b380, line 0x3b8b380]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 1
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b3c0, line 0x3b8b3c0]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 3
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b440, line 0x3b8b440]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 1
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b480, line 0x3b8b480]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 2
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b4c0, line 0x3b8b4c0]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 1
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b540, line 0x3b8b540]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 1
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b5c0, line 0x3b8b5c0]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 2
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b680, line 0x3b8b680]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 1
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b740, line 0x3b8b740]
    Instruction sequence number: 16871
    Type: LD
    Number of associated packets: 3
    Issue time: 1732620214000
    Difference from current tick: 280298000
Addr: [0x3b8b7c0, line 0x3b8b7c0]
...................................
    Difference from current tick: 17915000
Addr: [0x4c60b40, line 0x4c60b40]
    Instruction sequence number: 16552
    Type: LD
    Number of associated packets: 1
    Issue time: 1732882652000
    Difference from current tick: 17860000
Listing pending packets from 0 instructions
build/GCN3_X86/mem/ruby/system/GPUCoalescer.cc:294: panic: Aborting due to deadlock!
Memory Usage: 19939216 KBytes
Program aborted at tick 1732900512000
--- BEGIN LIBC BACKTRACE ---
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x4fb330)[0x55f2ea122330]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x5297ee)[0x55f2ea1507ee]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x143c0)[0x7fe799cb63c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fe798e5e03b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fe798e3d859]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x512b15)[0x55f2ea139b15]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0xffa194)[0x55f2eac21194]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x515ed2)[0x55f2ea13ced2]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x553944)[0x55f2ea17a944]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x55469e)[0x55f2ea17b69e]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x1c5b422)[0x55f2eb882422]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x4a3e27)[0x55f2ea0cae27]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x2a8738)[0x7fe799f6f738]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x8dd8)[0x7fe799d44f48]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7fe799e91e3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7fe799f6f114]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7fe799d3bd6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7d86)[0x7fe799d43ef6]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7fe799e91e3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCodeEx+0x42)[0x7fe799e921c2]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCode+0x1f)[0x7fe799e925af]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x1cfbf1)[0x7fe799e96bf1]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x25f537)[0x7fe799f26537]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7fe799d3bd6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x12fd)[0x7fe799d3d46d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x8006b)[0x7fe799d4706b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyVectorcall_Call+0x60)[0x7fe799f6f830]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x52b704)[0x55f2ea152704]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x423666)[0x55f2ea04a666]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fe798e3f0b3]
/home/ubuntu/lmy/gem5-gcn3/gem5/build/GCN3_X86/gem5.opt(+0x492f0e)[0x55f2ea0b9f0e]
--- END LIBC BACKTRACE ---
```
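
To get an overview of a dump like the one above before digging into a protocol trace, the outstanding requests can be tallied with a short script. This is only a sketch: the regular expression assumes the field names and order shown in the error message above, and `parse_entries` and `summarize` are hypothetical helper names, not part of gem5.

```python
import re
from collections import Counter

# One outstanding request, as printed in the deadlock dump. Whitespace
# between fields is matched loosely so wrapped or flattened logs still parse.
ENTRY_RE = re.compile(
    r"Addr:\s+\[(0x[0-9a-fA-F]+),\s+line\s+(0x[0-9a-fA-F]+)\]\s+"
    r"Instruction sequence number:\s+(\d+)\s+"
    r"Type:\s+(\w+)\s+"
    r"Number of associated packets:\s+(\d+)\s+"
    r"Issue time:\s+(\d+)\s+"
    r"Difference from current tick:\s+(\d+)"
)


def parse_entries(dump_text):
    """Return one dict per complete request entry found in dump_text."""
    entries = []
    for match in ENTRY_RE.finditer(dump_text):
        addr, line, seq, kind, packets, issue, age = match.groups()
        entries.append({
            "addr": int(addr, 16),
            "line": int(line, 16),
            "seq_num": int(seq),
            "type": kind,
            "packets": int(packets),
            "issue_tick": int(issue),
            "age_ticks": int(age),
        })
    return entries


def summarize(entries):
    """Count requests per instruction and find the longest-waiting one."""
    per_instruction = Counter(e["seq_num"] for e in entries)
    oldest = max(entries, key=lambda e: e["age_ticks"]) if entries else None
    return per_instruction, oldest


if __name__ == "__main__":
    import sys
    # Usage (file name is an example): python summarize_dump.py simerr.txt
    with open(sys.argv[1]) as f:
        counts, oldest = summarize(parse_entries(f.read()))
    for seq_num, count in counts.most_common():
        print(f"seq {seq_num}: {count} outstanding lines")
    if oldest is not None:
        print(f"oldest: addr {oldest['addr']:#x}, "
              f"waiting {oldest['age_ticks']} ticks")
```

Grouping by sequence number shows whether a few instructions account for most of the stuck cache lines; in the excerpt above, the visible entries belong to sequence numbers 16871 and 16552.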