Hi Imad,


Yes, the Docker build seems to have broken in the past few days.

Regarding the benchmark not completing, please change your command to use 3 
CPUs:


docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
                -w /gem5 gcr.io/gem5-test/gcn-gpu \
                build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 \
                --benchmark-root=/gem5-resources/src/gpu/square/bin \
                -c square

ROCm 4.0 now requires 3 CPUs to run. I thought we had updated the README.md
and the website before the gem5 21.1 release to reflect this, but it looks
like they are not up to date.


-Matt

From: Imad Al Assir via gem5-users <gem5-users@gem5.org>
Sent: Wednesday, September 22, 2021 9:31 AM
To: Matt Sinclair <sincl...@cs.wisc.edu>
Cc: gem5 users mailing list <gem5-users@gem5.org>; Kyle Roarty 
<kroa...@wisc.edu>; Imad Al Assir <imad.al.as...@upc.edu>
Subject: [gem5-users] Re: gem5 GCN GPU docker error

Hello,
Thank you for your reply. I was simply following the documentation on the gem5
website:
https://www.gem5.org/documentation/general_docs/gpu_models/GCN3
In other words, to build the image, I used:
 docker build -t gcn-gpu .

This command didn't complete and was interrupted by the error I pasted in the 
previous mail.

I was also using the command in the documentation to compile square:
docker run --rm -v $PWD/gem5-resources:$PWD/gem5-resources \
                -w $PWD/gem5-resources/src/gpu/square gcr.io/gem5-test/gcn-gpu \
                make square

Note that I ran "make square", NOT "make gfx8-apu" as written in the
documentation: the latter failed with "no rule to make target 'gfx8-apu'", so
I assumed it was a typo.

To run it, I also used the command in the doc:
docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
                -w /gem5 gcr.io/gem5-test/gcn-gpu \
                build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 \
                --benchmark-root=/gem5-resources/src/gpu/square/bin \
                -c square

Note that in these commands, I modified the path of square to 
'gem5-resources/src/gpu/square' instead of 'gem5-resources/src/square', because 
that's where I found the code for it.
Also note that I tried downloading the pre-built binary of square (from the
gem5-resources website: http://resources.gem5.org/README), but the result was
the same: application running indefinitely.

Thanks again for your help,
Imad

PS: If it helps, here are the last things printed when running square in gem5 
in the pre-built docker image:

[...] just warnings

gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version 21.1.0.1
gem5 compiled Sep 21 2021 14:52:55
gem5 started Sep 22 2021 15:26:26
gem5 executing on 8d532399b09e, pid 1
command line: build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 
--benchmark-root=/gem5-resources/src/gpu/square/bin -c square

info: Standard input is not a terminal, disabling listeners.
Num SQC =  1 Num scalar caches =  1 Num CU =  4
coalescer.slave is deprecated. `slave` is now called `in_ports`
warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
warn: coalescer.slave is deprecated. `slave` is now called `in_ports`

[...] same warning as the one right above this line, repeated multiple times

warn: system.ruby.network adopting orphan SimObject param 'ext_links'
warn: system.ruby.network adopting orphan SimObject param 'int_links'
build/GCN3_X86/sim/simulate.cc:107: info: Entering event queue @ 0.  Starting 
simulation...
build/GCN3_X86/mem/ruby/system/Sequencer.cc:573: warn: Replacement policy 
updates recently became the responsibility of SLICC state machines. Make sure 
to setMRU() near callbacks in .sm files!
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall access(...)
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)

[...] same warning as above repeated multiple times

build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall 
set_robust_list(...)
build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall rt_sigaction(...)
      (further warnings will be suppressed)
build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall 
rt_sigprocmask(...)
      (further warnings will be suppressed)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall get_mempolicy(...)
build/GCN3_X86/arch/generic/debugfaults.hh:144: warn: MOVNTDQ: Ignoring 
non-temporal hint, modeling as cacheable!
build/GCN3_X86/arch/x86/generated/exec-ns.cc.inc:27: warn: instruction 
'frndint' unimplemented
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:699: warn: unimplemented 
ioctl: AMDKFD_IOC_ACQUIRE_VM
build/GCN3_X86/sim/syscall_emul.hh:1676: warn: mmap: writing to shared mmap 
region is currently unsupported. The write succeeds on the target, but it will 
not be propagated to the host or shared mappings
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:450: warn: Signal events are 
only supported currently
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/power_state.cc:105: warn: PowerState: Already in the 
requested power state, request ignored
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall 
set_robust_list(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:594: warn: unimplemented 
ioctl: AMDKFD_IOC_SET_SCRATCH_BACKING_VA
build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:604: warn: unimplemented 
ioctl: AMDKFD_IOC_SET_TRAP_HANDLER
info: running on device
info: architecture on AMD GPU device is: 801
info: allocate host and device mem (  7.63 MB)
info: launch 'vector_square' kernel
build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall sched_yield(...)
      (further warnings will be suppressed)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)

On Sep 22 2021, at 5:17 pm, Matt Sinclair <sincl...@cs.wisc.edu> wrote:
Hi Imad,

I just built the docker earlier this week and did not have any problems (e.g., 
I ran square and it completed in < 2 hours).  How are you trying to build it?  
And how are you running the applications you mentioned?

Thanks,
Matt

On Wed, Sep 22, 2021 at 12:31 AM Imad Al Assir via gem5-users
<gem5-users@gem5.org> wrote:
Hello,
Is there a problem with the most recent gcn-gpu docker file?
I tried building it several times on Ubuntu 20.04 and 18.04 but it kept giving 
me this error:

[...]
Unpacking rocblas (2.32.0-cc18d25f) ...
dpkg: dependency problems prevent configuration of rocblas:
 rocblas depends on rocm-core; however:
  Package rocm-core is not installed.

dpkg: error processing package rocblas (--install):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of rocblas-dev:
 rocblas-dev depends on rocblas (>= 2.32.0); however:
  Package rocblas is not configured yet.

dpkg: error processing package rocblas-dev (--install):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 rocblas
 rocblas-dev
+ check_exit_code 1
+ ((  1 != 0  ))
+ exit 1
The command '/bin/sh -c ./install.sh -d -a all -i' returned a non-zero code: 1

I also tried downloading the pre-built docker image (gcr.io/gem5-test/gcn-gpu)
and built gem5, apparently with no errors (but with a warning about deprecated
namespaces not being supported by the compiler). Then, when I tried running the
'square' sample application and other ones from
gem5-resources/src/gpu/hip-samples (e.g. MatrixTranspose, dynamic_shared,
inline_asm, etc.), they just kept running indefinitely (> 2 hours), and I had
to kill them.

Could you please try building the latest version of the gcn-gpu dockerfile
and/or running a sample application on the pre-built docker image, and let us
know whether it works and, if not, how to fix the problem?

Thanks in advance,
Imad Al Assir
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org