Thanks, this is helpful. Looking through those old email chains, I don't see any specific resolution to them, unfortunately. I do not have a ton of time to dig into this (end of the semester is keeping me busy) but if you can keep digging I may be able to provide some ideas.
First, are you still seeing this error after making the above change to include the MIOpen cache?

MIOpen(HIP): Warning [ParseAndLoadDb] File is unreadable: /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt

Second, what layer size are you trying to run the conv on?

The failure shortly after is here: https://github.com/ROCmSoftwarePlatform/MIOpen/blob/rocm-4.0.1-release/src/ocl/convolutionocl.cpp#L149. It seems to imply that the channel counts of X (the input) and W (the filter) do not match, but we'd need to dig to figure out whether this is because of the config file being passed in or because something in MIOpen/gem5 is breaking it. Given that the failing function call is trying to create a tensor (https://github.com/shidong-ai/DNNMark/blob/develop/core/include/dnn_utility.h#L106), my guess is that it's something with the config, because something so basic probably (hopefully?) doesn't fail in MIOpen...

As for whether it should work or not, I don't see that we included it in prior papers (e.g., https://www.gem5.org/assets/files/papers/enabling2021ispass.pdf), but I don't know if that was because we didn't try or because there was a deeper, more fundamental reason.

Matt

On Sat, May 13, 2023 at 10:34 AM 429442672 <429442...@qq.com> wrote:

> Thank you so much for the advice.
>
> Actually, I have made the cachefiles as shown in the figure.
>
> Besides, I have successfully run several benchmarks such as pool, activations, and softmax, so I think the kernels are set up.
> The reason for the differences in commands is that I run the command directly in the docker container.
>
> There might be a common problem when running a network with conv; some other emails, such as
> https://www.mail-archive.com/gem5-users@gem5.org/msg20468.html
> https://www.mail-archive.com/gem5-users@gem5.org/msg20456.html
> also hit this problem.
>
> I have also tried to use the latest docker image and the original command, like this:
>
> docker run --rm -v ${PWD}:${PWD} -v ${PWD}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0 -w ${PWD} gcr.io/gem5-test/gcn-gpu:v22-1 gem5/build/GCN3_X86/gem5.opt gem5/configs/example/apu_se.py -n3 --benchmark-root=gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_conv -cdnnmark_test_fwd_conv --options="-config gem5-resources/src/gpu/DNNMark/config_example/conv_config.dnnmark -mmap gem5-resources/src/gpu/DNNMark/mmap.bin"
>
> but I still got the same error.
>
> Does that mean conv layers are currently not available for gem5-gcn?
>
> May I ask whether anyone else has met this problem before?
>
> Thank you!
>
>
> ------------------ Original Message ------------------
> *From:* "Matt Sinclair" <mattdsinclair.w...@gmail.com>;
> *Sent:* Friday, May 12, 2023, 11:42 AM
> *To:* "The gem5 Developer List"<gem5-dev@gem5.org>;
> *Cc:* "gem5-users"<gem5-us...@gem5.org>;"429442672"<429442...@qq.com>;
> *Subject:* Re: [gem5-dev] GEM5-GCN-DNNMark get Invalid filter channel number when running: dnnmark_test_VGG, dnnmark_test_alexnet, dnnmark_test_fwd_conv
>
> I have not tried running these specific benchmarks in gem5 personally, so I cannot say for certain what the error is or even if they are expected to run to completion in gem5. But normally the error you're seeing happens because you have not created the appropriate "cache" files for the GPU kernel(s) the program is trying to run. MIOpen first checks to see if the desired kernel has been run on your machine before, and if not it tries to do online compilation of that kernel. Unfortunately, online compilation of kernels in gem5 is a) very slow and b), because it is very slow, not supported in gem5 (basically, it is so slow as to not be worth supporting in many cases, in my opinion). So, instead, the expectation is that we build the kernels we want ahead of time, before running the program.
>
> You may have seen in the examples we provide for DNNMark (https://resources.gem5.org/resources/dnn-mark) that we have this "generate cachefiles" script -- that is exactly what the purpose of that script is. Moreover, on the same webpage, you may have noticed we include the path to that cache directory in our docker commands (emphasis mine):
>
> docker run --rm -v ${PWD}:${PWD} -v *${PWD}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0* -w ${PWD} gcr.io/gem5-test/gcn-gpu:v22-1 gem5/build/GCN3_X86/gem5.opt gem5/configs/example/apu_se.py -n3 --benchmark-root=gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax -cdnnmark_test_fwd_softmax --options="-config gem5-resources/src/gpu/DNNMark/config_example/softmax_config.dnnmark -mmap gem5-resources/src/gpu/DNNMark/mmap.bin"
>
> From looking at your commands, I don't see you including this. Thus, while I do not know if that script by default produces the kernels needed for, say, AlexNet, I strongly suspect you should start by running that script and updating your docker commands to include the cache stuff ... then see what happens from there.
>
> Sidenote: normally when I see this:
>
> MIOpen(HIP): Warning [ParseAndLoadDb] File is unreadable: /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt
>
> it means that the files MIOpen is expecting are not set up properly. Normally I just symlink these extra files -- e.g., symlink gfx801_4... from gfx801_32 ... (this is not the best-performing option, because the resources are different, but it provides a basic setup step to avoid problems like this).
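The symlink workaround described in the sidenote above could be scripted roughly as follows. This is only a sketch: `link_miopen_db` is a hypothetical helper name, not something shipped with MIOpen or gem5, and the source path is left as a placeholder because the thread does not spell out the full gfx801_32 file name.

```shell
#!/bin/sh
# Hypothetical helper (not part of MIOpen or gem5): create DST as a
# symlink to SRC, but only if SRC exists and DST is not already present.
link_miopen_db() {
    src="$1"
    dst="$2"
    if [ ! -e "$src" ]; then
        echo "source database not found: $src" >&2
        return 1
    fi
    # -L also catches a dangling symlink, which -e alone would miss.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "leaving existing file in place: $dst" >&2
        return 0
    fi
    ln -s "$src" "$dst"
}

# Example (SRC_DB is a placeholder you must set to a database file that
# really exists in your image, e.g. one of the gfx801_32 files):
#   link_miopen_db "$SRC_DB" \
#       /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt
```

The helper refuses to overwrite an existing file, so running it repeatedly inside the container should be harmless.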
> Matt
>
> On Thu, May 11, 2023 at 10:02 PM 429442672 via gem5-dev <gem5-dev@gem5.org> wrote:
>
>> Dear Matt,
>> When I run the benchmarks dnnmark_test_VGG, dnnmark_test_alexnet, and dnnmark_test_fwd_conv, I get the same error each time (Invalid filter channel number):
>>
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> MIOpen(HIP): Warning [ParseAndLoadDb] File is unreadable: /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> MIOpen Error: /root/driver/MLOpen/src/ocl/convolutionocl.cpp:150: Invalid filter channel number
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> MIOpen Error: /root/driver/MLOpen/src/ocl/convolutionocl.cpp:150: Invalid filter channel number
>> MIOpen Error: 3 at /home/tang/gem5-resources/src/gpu/DNNMark/core/include/dnn_utility.h1057Ticks: 327510244000
>> Exiting because exiting with last active thread context
>>
>>
>> My command line is:
>>
>> gem5/build/GCN3_X86/gem5.opt gem5/configs/example/apu_se.py -n 8 --mem-size=12GB --benchmark-root=gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_conv -c dnnmark_test_fwd_conv --options="-config gem5-resources/src/gpu/DNNMark/config_example/conv_config.dnnmark -mmap gem5-resources/src/gpu/DNNMark/mmap.bin"
>>
>> and I didn't change the default setup of conv_config.dnnmark or gfx801.
>>
>> May I ask, did I do something wrong here?
>> Have you ever run those benchmarks without error, and could you please show me some of your configurations?
>>
>> Thank you so much!
>>
>>
>>
>> _______________________________________________
>> gem5-dev mailing list -- gem5-dev@gem5.org
>> To unsubscribe send an email to gem5-dev-le...@gem5.org
>>
>
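Following up on the cachefiles advice earlier in this thread: since a missing or empty cache directory makes MIOpen fall back to online compilation, it may be worth checking the directory before launching docker. The sketch below assumes nothing beyond the cachefiles path quoted in the commands above; `cache_dir_ready` is a hypothetical name, not part of gem5-resources.

```shell
#!/bin/sh
# Hypothetical helper (not part of gem5-resources): succeed only if DIR
# exists and contains at least one entry.
cache_dir_ready() {
    dir="$1"
    [ -d "$dir" ] || return 1
    # ls -A lists every entry except "." and ".."; empty output means
    # the directory is empty.
    [ -n "$(ls -A "$dir")" ]
}

# Example: refuse to start the simulation if the cache was never generated.
#   CACHE="${PWD}/gem5-resources/src/gpu/DNNMark/cachefiles"
#   cache_dir_ready "$CACHE" || {
#       echo "cachefiles missing or empty: run the generate cachefiles script first" >&2
#       exit 1
#   }
```

Note that a non-empty directory does not prove the kernels for a particular layer (e.g., conv) are present; this only catches the case where the cache was never generated or the mount path is wrong.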