Just jumping in here: I can confirm that I can no longer build the image either. I had assumed this was just a problem on my end before reading these emails. However, the image hosted at http://gcr.io/gem5-test/gcn-gpu should be the most up-to-date version of this Docker image from before the build error was introduced. It should work.
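For anyone following along, here is a minimal sketch of using the hosted image instead of building the Dockerfile locally. The image path and the gem5 arguments are the ones quoted further down in this thread (with -n3, per Matt's reply). The `run_or_print` helper and the `GEM5_SKETCH_EXEC` variable are hypothetical names made up for this sketch; they are not docker or gem5 options. By default the script only prints the commands, so it can be checked without Docker installed.

```shell
#!/bin/sh
# Image path as quoted in this thread.
IMAGE="gcr.io/gem5-test/gcn-gpu"

# Print a command; execute it only when GEM5_SKETCH_EXEC=1.
# (Hypothetical switch for this sketch, not a docker or gem5 option.)
run_or_print() {
    echo "$*"
    if [ "${GEM5_SKETCH_EXEC:-0}" = "1" ]; then
        "$@"
    fi
}

# Fetch the hosted image rather than building it locally.
run_or_print docker pull "$IMAGE"

# Run square with 3 CPUs (-n3), as recommended in the quoted reply.
# Assumes gem5 and gem5-resources are checked out under the current directory.
run_or_print docker run --rm \
    -v "$PWD/gem5:/gem5" -v "$PWD/gem5-resources:/gem5-resources" \
    -w /gem5 "$IMAGE" \
    build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 \
    --benchmark-root=/gem5-resources/src/gpu/square/bin \
    -c square
```

Setting GEM5_SKETCH_EXEC=1 in the environment makes the same script actually invoke docker.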
I've updated the website script here: https://gem5-review.googlesource.com/c/public/gem5-website/+/50807. Apologies, our documentation could definitely do with some tidying up :).

--
Dr. Bobby R. Bruce
Room 3050, Kemper Hall, UC Davis
Davis, CA, 95616
web: https://www.bobbybruce.net

On Wed, Sep 22, 2021 at 10:02 AM Imad Al Assir via gem5-users <gem5-users@gem5.org> wrote:

> Dear Matt,
>
> Many thanks for catching this error! It did indeed solve the problem; I
> was able to successfully run square and other applications from hip-samples
> on both, the manually built dockerfile with everything related to rocBLAS
> and MIOpen commented, and the pre-built docker image which I believe has
> rocBLAS and MIOpen installed (based on its size).
>
> Many thanks again,
> Imad
>
> On Sep 22 2021, at 6:48 pm, Poremba, Matthew <matthew.pore...@amd.com> wrote:
>
> > [AMD Official Use Only]
> >
> > Hi Imad,
> >
> > Yes, the docker seems to have broken in the past few days.
> >
> > Regarding the benchmark not completing, please change your command to use 3 CPUs:
> >
> > docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
> >     -w /gem5 gcr.io/gem5-test/gcn-gpu \
> >     build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 \
> >     --benchmark-root=/gem5-resources/src/gpu/square/bin \
> >     -c square
> >
> > ROCm 4.0 requires 3 CPUs to run now. I thought we had updated the README.md and website before gem5 21.1 release to reflect this but looks like they are not up to date.
> >
> > -Matt
> >
> > *From:* Imad Al Assir via gem5-users <gem5-users@gem5.org>
> > *Sent:* Wednesday, September 22, 2021 9:31 AM
> > *To:* Matt Sinclair <sincl...@cs.wisc.edu>
> > *Cc:* gem5 users mailing list <gem5-users@gem5.org>; Kyle Roarty <kroa...@wisc.edu>; Imad Al Assir <imad.al.as...@upc.edu>
> > *Subject:* [gem5-users] Re: gem5 GCN GPU docker error
> >
> > [CAUTION: External Email]
> >
> > Hello,
> > Thank you for your reply.
> > I was simply following the documentation on the gem5 website:
> > https://www.gem5.org/documentation/general_docs/gpu_models/GCN3
> > In other words, to build the image, I used:
> > docker build -t gcn-gpu .
> >
> > This command didn't complete and was interrupted by the error I pasted in the previous mail.
> >
> > I was also using the command in the documentation to compile square:
> > docker run --rm -v $PWD/gem5-resources:$PWD/gem5-resources -w $PWD/gem5-resources/src/gpu/square gcr.io/gem5-test/gcn-gpu make square
> >
> > NOT "make gfx8-apu", as written in the documentation, which caused an error: "no rule to make target 'gfx8-apu' ", and I assumed was a typo.
> >
> > To run it, I also used the command in the doc:
> > docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
> >     -w /gem5 gcr.io/gem5-test/gcn-gpu \
> >     build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 \
> >     --benchmark-root=/gem5-resources/src/gpu/square/bin \
> >     -c square
> >
> > Note that in these commands, I modified the path of square to 'gem5-resources/src/gpu/square' instead of 'gem5-resources/src/square', because that's where I found the code for it.
> > Also note that I tried downloading the pre-built binary of square (from the gem5-resources website: http://resources.gem5.org/README), but the result was the same: application running indefinitely.
> >
> > Thanks again for your help,
> > Imad
> >
> > PS: If it helps, here are the last things printed when running square in gem5 in the pre-built docker image:
> >
> > [...] just warnings
> >
> > gem5 Simulator System. http://gem5.org
> > gem5 is copyrighted software; use the --copyright option for details.
> >
> > gem5 version 21.1.0.1
> > gem5 compiled Sep 21 2021 14:52:55
> > gem5 started Sep 22 2021 15:26:26
> > gem5 executing on 8d532399b09e, pid 1
> > command line: build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 --benchmark-root=/gem5-resources/src/gpu/square/bin -c square
> >
> > info: Standard input is not a terminal, disabling listeners.
> > Num SQC = 1 Num scalar caches = 1 Num CU = 4
> > coalescer.slave is deprecated. `slave` is now called `in_ports`
> > warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
> > warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
> >
> > [...] same warning as the one right above this line, repeated multiple times
> >
> > warn: system.ruby.network adopting orphan SimObject param 'ext_links'
> > warn: system.ruby.network adopting orphan SimObject param 'int_links'
> > build/GCN3_X86/sim/simulate.cc:107: info: Entering event queue @ 0. Starting simulation...
> > build/GCN3_X86/mem/ruby/system/Sequencer.cc:573: warn: Replacement policy updates recently became the responsibility of SLICC state machines.
> > Make sure to setMRU() near callbacks in .sm files!
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall access(...)
> > build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> >
> > [...] same warning as above repeated multiple times
> >
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall set_robust_list(...)
> > build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall rt_sigaction(...)
> > (further warnings will be suppressed)
> > build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall rt_sigprocmask(...)
> > (further warnings will be suppressed)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall get_mempolicy(...)
> > build/GCN3_X86/arch/generic/debugfaults.hh:144: warn: MOVNTDQ: Ignoring non-temporal hint, modeling as cacheable!
> > build/GCN3_X86/arch/x86/generated/exec-ns.cc.inc:27: warn: instruction 'frndint' unimplemented
> > build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
> > build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:699: warn: unimplemented ioctl: AMDKFD_IOC_ACQUIRE_VM
> > build/GCN3_X86/sim/syscall_emul.hh:1676: warn: mmap: writing to shared mmap region is currently unsupported. The write succeeds on the target, but it will not be propagated to the host or shared mappings
> > build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
> > build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:450: warn: Signal events are only supported currently
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/sim/power_state.cc:105: warn: PowerState: Already in the requested power state, request ignored
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall set_robust_list(...)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:594: warn: unimplemented ioctl: AMDKFD_IOC_SET_SCRATCH_BACKING_VA
> > build/GCN3_X86/gpu-compute/gpu_compute_driver.cc:604: warn: unimplemented ioctl: AMDKFD_IOC_SET_TRAP_HANDLER
> > info: running on device
> > info: architecture on AMD GPU device is: 801
> > info: allocate host and device mem ( 7.63 MB)
> > info: launch 'vector_square' kernel
> > build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall sched_yield(...)
> > (further warnings will be suppressed)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> > build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
> >
> > On Sep 22 2021, at 5:17 pm, Matt Sinclair <sincl...@cs.wisc.edu> wrote:
> >
> > Hi Imad,
> >
> > I just built the docker earlier this week and did not have any problems (e.g., I ran square and it completed in < 2 hours). How are you trying to build it? And how are you running the applications you mentioned?
> >
> > Thanks,
> > Matt
> >
> > On Wed, Sep 22, 2021 at 12:31 AM Imad Al Assir via gem5-users <gem5-users@gem5.org> wrote:
> >
> > Hello,
> > Is there a problem with the most recent gcn-gpu docker file?
> > I tried building it several times on Ubuntu 20.04 and 18.04 but it kept giving me this error:
> >
> > [...]
> > Unpacking rocblas (2.32.0-cc18d25f) ...
> > dpkg: dependency problems prevent configuration of rocblas:
> > rocblas depends on rocm-core; however:
> > Package rocm-core is not installed.
> >
> > dpkg: error processing package rocblas (--install):
> > dependency problems - leaving unconfigured
> > dpkg: dependency problems prevent configuration of rocblas-dev:
> > rocblas-dev depends on rocblas (>= 2.32.0); however:
> > Package rocblas is not configured yet.
> >
> > dpkg: error processing package rocblas-dev (--install):
> > dependency problems - leaving unconfigured
> > Errors were encountered while processing:
> > rocblas
> > rocblas-dev
> > + check_exit_code 1
> > + (( 1 != 0 ))
> > + exit 1
> > The command '/bin/sh -c ./install.sh -d -a all -i' returned a non-zero code: 1
> >
> > I also tried downloading the pre-built docker image (gcr.io/gem5-test/gcn-gpu) and built gem5 supposedly with no errors (but with a warning about deprecated namespaces not being supported by the compiler). Then when I tried running the 'square' sample application and other ones from gem5-resources/src/gpu/hip-samples (e.g. MatrixTranspose, dynamic_shared, inline_asm, etc.), they just kept running indefinitely (> 2 hours), and I had to kill them to stop them.
> >
> > May you please try building the latest version of the gcn-gpu dockerfile and/or running a sample application on the pre-built docker image, and inform us if it works, and if not, how to fix the problem?
> >
> > Thanks in advance,
> > Imad Al Assir
> > _______________________________________________
> > gem5-users mailing list -- gem5-users@gem5.org
> > To unsubscribe send an email to gem5-users-le...@gem5.org