Thanks, this is helpful.  Looking through those old email chains, I don't
see any specific resolution to them, unfortunately.  I do not have a ton of
time to dig into this (end of the semester is keeping me busy) but if you
can keep digging I may be able to provide some ideas.

First, are you still seeing this error:

MIOpen(HIP): Warning [ParseAndLoadDb] File is unreadable:
/opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt

after making the above change to include the MIOpen cache?
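
If the warning is still there, one quick thing to check is whether that
file exists and is readable from inside the container.  A minimal sketch,
assuming it is run in the same docker container as gem5; check_db_file is
just a helper name I made up, and the path is the one from your log:

```shell
#!/bin/sh
# Hypothetical helper: report whether a MIOpen db file is present and readable.
check_db_file() {
    f="$1"
    if [ -r "$f" ]; then
        echo "readable: $f"
    elif [ -e "$f" ] || [ -L "$f" ]; then
        # Covers files without read permission and dangling symlinks.
        echo "exists but unreadable: $f"
    else
        echo "missing: $f"
    fi
}

# Path taken from the warning in the log.
check_db_file /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt
```

Note that when run as root (the usual case in a container) any existing
file counts as readable, so "missing" is the most informative outcome.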

Second, what layer size are you trying to run the convolution on?  The
failure shortly after is here:
https://github.com/ROCmSoftwarePlatform/MIOpen/blob/rocm-4.0.1-release/src/ocl/convolutionocl.cpp#L149.
It seems to imply that the channel counts of X and W do not match, but we'd
need to dig to
figure out if this is because of the config file being passed in, or
something in MIOpen/gem5 that is breaking it.  Given that the function call
that is failing is trying to create a tensor:
https://github.com/shidong-ai/DNNMark/blob/develop/core/include/dnn_utility.h#L106,
my guess is that it's something with the config, because something so basic
probably (hopefully?) doesn't fail in MIOpen...
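
To make the suspected mismatch concrete, here is a minimal sketch of the
kind of check I believe is failing, assuming a non-grouped convolution where
the filter's channel dimension must equal the input's.  check_filter_channels
is a made-up helper and the numbers are illustrative; neither is taken from
conv_config.dnnmark or from MIOpen's source:

```shell
#!/bin/sh
# Hypothetical helper: compare input channels with the filter's channel count.
check_filter_channels() {
    x_c="$1"   # channels in the input tensor (X)
    w_c="$2"   # channels each filter expects (W)
    if [ "$x_c" -eq "$w_c" ]; then
        echo "ok"
    else
        echo "mismatch: input has $x_c channels, filter expects $w_c"
        return 1
    fi
}

# Illustrative values only.
check_filter_channels 3 3
```

Plugging in the input and filter channel counts from the config you pass in
would show whether the config alone can explain the error.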

As for whether it should work, I don't see that we included it in
prior papers (e.g.,
https://www.gem5.org/assets/files/papers/enabling2021ispass.pdf), but I
don't know if that was because we didn't try or because there was a deeper,
more fundamental reason.

Matt

On Sat, May 13, 2023 at 10:34 AM 429442672 <429442...@qq.com> wrote:

> Thank you so much for the advice.
>
> Actually, I have made the cachefiles as shown in the figure.
>
> Besides, I have successfully run several benchmarks such as pool,
> activations, and softmax, so I think the kernels are set up.
> The commands differ because I ran the command directly inside the
> docker container.
>
> This might be a common problem when running a network with conv layers;
> other emails such as
> https://www.mail-archive.com/gem5-users@gem5.org/msg20468.html
> https://www.mail-archive.com/gem5-users@gem5.org/msg20456.html
> also report it.
>
> I have also tried to use the latest docker and the original command like
> this:
>
> docker run --rm -v ${PWD}:${PWD} -v 
> ${PWD}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0 -w 
> ${PWD} gcr.io/gem5-test/gcn-gpu:v22-1 gem5/build/GCN3_X86/gem5.opt 
> gem5/configs/example/apu_se.py -n3 
> --benchmark-root=gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_conv
>  -cdnnmark_test_fwd_conv --options="-config 
> gem5-resources/src/gpu/DNNMark/config_example/conv_config.dnnmark -mmap 
> gem5-resources/src/gpu/DNNMark/mmap.bin"
>
> but still got the same error.
>
> Does that mean conv layers are currently not available for gem5-gcn?
>
> May I ask whether anyone else has met this problem before?
>
> Thank you!
>
>
> ------------------ Original Message ------------------
> *From:* "Matt Sinclair" <mattdsinclair.w...@gmail.com>;
> *Sent:* Friday, May 12, 2023, 11:42 AM
> *To:* "The gem5 Developer List"<gem5-dev@gem5.org>;
> *Cc:* "gem5-users"<gem5-us...@gem5.org>;"429442672"<429442...@qq.com>;
> *Subject:* Re: [gem5-dev] GEM5-GCN-DNNMark get Invalid filter channel number
> when running: dnnmark_test_VGG, dnnmark_test_alexnet, dnnmark_test_fwd_conv
>
> I have not tried running these specific benchmarks in gem5 personally, so
> I cannot say for certain what the error is or even if they are expected to
> run to completion in gem5.  But, normally the error you're seeing happens
> because you have not created the appropriate "cache" files for the GPU
> kernel(s) the program is trying to run.  MIOpen first checks to see if the
> desired kernel has been run on your machine before, and if not it tries to
> do online compilation of that kernel.  Unfortunately, online compilation of
> kernels in gem5 is a) very slow and b) because it is very slow, not
> supported in gem5 (basically, it is so slow as to not be worth supporting
> in many cases, in my opinion).  So, instead, the expectation is that we
> build the kernels we want ahead of time, before running the program.  You
> may have seen in the examples we provide for DNNMark (
> https://resources.gem5.org/resources/dnn-mark) that we have this
> "generate cachefiles" script -- that is exactly what the purpose of that
> script is.  Moreover, on the same webpage, you may have noticed we include
> the path to that cache directory in our docker commands (emphasis mine):
>
> docker run --rm -v ${PWD}:${PWD} -v 
> *${PWD}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0* 
> -w ${PWD} gcr.io/gem5-test/gcn-gpu:v22-1 gem5/build/GCN3_X86/gem5.opt 
> gem5/configs/example/apu_se.py -n3 
> --benchmark-root=gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax
>  -cdnnmark_test_fwd_softmax --options="-config 
> gem5-resources/src/gpu/DNNMark/config_example/softmax_config.dnnmark -mmap 
> gem5-resources/src/gpu/DNNMark/mmap.bin"
>
> From looking at your commands, I don't see you including this.  Thus,
> while I do not know if that script by default produces the kernels needed
> for, say, AlexNet, I strongly suspect you should start by running that
> script and updating your docker commands to include the cache stuff ...
> then see what happens from there.
>
> Sidenote: normally when I see this:
>
> MIOpen(HIP): Warning [ParseAndLoadDb] File is unreadable:
> /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt
>
> It means that the files MIOpen is expecting are not set up properly.
> Normally I just symlink these extra files -- e.g., symlink gfx801_4... from
> gfx801_32 ... (this is not the best-performing option, because the
> resources are different, but it provides a basic setup step to avoid
> problems like this).
> Matt
>
> On Thu, May 11, 2023 at 10:02 PM 429442672 via gem5-dev <gem5-dev@gem5.org>
> wrote:
>
>> Dear Matt,
>>      When I run the benchmarks dnnmark_test_VGG, dnnmark_test_alexnet, and
>> dnnmark_test_fwd_conv, I get the same error each time (Invalid
>> filter channel number):
>>
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall
>> fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall
>> fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall
>> fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:74: warn: ignoring syscall
>> fdatasync(...)
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> MIOpen(HIP): Warning [ParseAndLoadDb] File is unreadable:
>> /opt/rocm-4.0.1/miopen/share/miopen/db/gfx801_4.HIP.fdb.txt
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> MIOpen Error: /root/driver/MLOpen/src/ocl/convolutionocl.cpp:150: Invalid
>> filter channel number
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> build/GCN3_X86/sim/syscall_emul.cc:683: warn: fcntl: unsupported command 6
>> MIOpen Error: /root/driver/MLOpen/src/ocl/convolutionocl.cpp:150: Invalid
>> filter channel number
>> MIOpen Error: 3 at
>> /home/tang/gem5-resources/src/gpu/DNNMark/core/include/dnn_utility.h1057Ticks:
>> 327510244000
>> Exiting because exiting with last active thread context
>>
>>
>> My command line is:
>>
>> gem5/build/GCN3_X86/gem5.opt gem5/configs/example/apu_se.py
>> -n 8 --mem-size=12GB
>> --benchmark-root=gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_conv
>> -c dnnmark_test_fwd_conv
>> --options="-config
>> gem5-resources/src/gpu/DNNMark/config_example/conv_config.dnnmark -mmap
>> gem5-resources/src/gpu/DNNMark/mmap.bin"
>>
>> I didn't change the default setup of conv_config.dnnmark
>> or gfx801.
>>
>> May I ask, did I do something wrong here?
>> Have you ever run those benchmarks without error, and could you please
>> show me some of your configurations?
>>
>> Thank you so much!
>>
>>
>>
>> _______________________________________________
>> gem5-dev mailing list -- gem5-dev@gem5.org
>> To unsubscribe send an email to gem5-dev-le...@gem5.org
>>
>