[gem5-users] Re: [EXT] Re: Success with GPU

2025-03-02 Thread Matt Sinclair via gem5-users
Whoops, thanks. I meant this one: https://github.com/gem5bootcamp/2024/tree/main/materials/04-GPU-model. Note in these commands we actually specified the full path to gem5-vega in them (we also didn't need the docker in this case because of how Jason set things up). I do not know though if that s

[gem5-users] Re: [EXT] Re: Success with GPU

2025-03-02 Thread Matt Sinclair via gem5-users
Hi Nick, In regards to gem5-vega-se, the ls you sent before shows it does not exist in /usr/local/bin, so it's not surprising that didn't work. In regards to docker more broadly though, setting up volumes is tricky. The mental model I always have is: with this volume I specify, is everything I ne
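The volume mental model in the message above can be sketched as a simple containment check: a host file is only reachable from inside the container if it sits under a host directory that was mounted as a volume. This is a minimal illustrative sketch, not something from the thread; the function name `visible_in_container` and the example mount path are hypothetical, and symlinks and files already baked into the image are ignored.

```python
from pathlib import PurePosixPath


def visible_in_container(host_path, mounted_host_dirs):
    """Return True if host_path lies inside one of the host directories mounted as volumes."""
    target = PurePosixPath(host_path)
    for mount in mounted_host_dirs:
        mount_path = PurePosixPath(mount)
        # The target is visible if it is the mount itself or sits somewhere below it.
        if target == mount_path or mount_path in target.parents:
            return True
    return False


# Hypothetical example: only the working tree is mounted, /usr/local/bin is not.
mounts = ["/workspaces/2024"]
print(visible_in_container("/workspaces/2024/gem5/build/VEGA_X86/gem5.opt", mounts))
print(visible_in_container("/usr/local/bin/gem5-vega", mounts))
```

Under these assumptions, a binary built inside the mounted working tree is found, while one that lives only in the host's /usr/local/bin is not, which matches the behaviour described in the thread.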

[gem5-users] Re: [EXT] Re: Success with GPU

2025-03-02 Thread Matt Sinclair via gem5-users
Hi Nick, Did you try setting up the docker volume to explicitly include /usr/local/bin then? The reason (likely) why it worked after you compiled gem5.opt explicitly is that the folder you built it in was explicitly included in the docker volume. Unless you are saying you built gem5.opt for VEGA

[gem5-users] Re: Success with GPU

2025-03-02 Thread Matt Sinclair via gem5-users
Hi Nick, I'm not sure why you believe gem5-vega should be in /usr/local/bin? It's been a few months since I last looked at this codespace, but looking at the instructions here: https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5

[gem5-users] Re: [EXT] Re: Problem running gpu example with codespace

2025-03-01 Thread Matt Sinclair via gem5-users
Sorry, hit send too soon. The other thing to note from the error you sent is that gem5-vega is not found in the docker. Typically this happens when the docker working directory and/or volume is not able to find the file. I am not sure what directory you are running your command from, but this wo

[gem5-users] Re: [EXT] Re: Problem running gpu example with codespace

2025-03-01 Thread Matt Sinclair via gem5-users
Jason will have to chime in on setting up the codespace, as I am not familiar with that. But regarding the docker, did you try running "docker pull ghcr.io/gem5/gcn-gpu:v24-0" (or "docker pull ghcr.io/gem5/gcn-gpu:latest") first? I don't recall if Jason had the docker pre-downloaded and thus this

[gem5-users] Re: Problem running gpu example with codespace

2025-02-28 Thread Matt Sinclair via gem5-users
Hi Nicholas, Sorry for my delayed response, but very cool to hear you are using these. Jason (CC'd) might need to help about codespace specific issues though. Before we get there though, there are some things to try. Ultimately, the problem is that it can't find the libamdhip library. I'll need

[gem5-users] Re: [EXT] Re: Does GarnetPt2Pt and GarnetMesh still work in Gem5?

2024-10-28 Thread Matt Sinclair via gem5-users
I just have them run the code on the Linux machines in my department's computing labs -- no docker, no GitHub codespace, etc. I have considered codespaces, but so far have not done so. I don't expect GitHub will reply to you -- everyone I know who has gone that route has set up their own codespac

[gem5-users] Re: [EXT] Re: Does GarnetPt2Pt and GarnetMesh still work in Gem5?

2024-10-28 Thread Matt Sinclair via gem5-users
Hi Nicholas, +1 to what Jason said. I just assigned the students in my class an assignment using many of the different replacement policies and none reported problems: https://pages.cs.wisc.edu/~sinclair/courses/cs752/fall2024/handouts/cs752-fall2024-hw5.pdf My guess is you are thinking of my st

[gem5-users] Re: Question about running GPU emulation in gem5

2024-10-05 Thread Matt Sinclair via gem5-users
Hi Nicholas, Really glad to hear these GPU tests are useful for your class! I am not in front of a terminal, so I can't confirm every single thing, but here is what I think is happening: - You mention there are 2 sets of stats. This is potentially because a recent commit (https://github.com/gem

[gem5-users] Re: Gem5 gpu

2024-09-12 Thread Matt Sinclair via gem5-users
Hi Ravikant, If I understand your request, you are trying to run multiple processes (one on CPU, one on CPU+GPU) simultaneously. I have never tried doing this, and thus do not know how to make it work or not. My guess though is that gem5 does not support running multiple concurrent processes. S

[gem5-users] Re: Gem5 gpu

2024-08-09 Thread Matt Sinclair via gem5-users
Hi Ravikant, From looking at the details below, it appears you are using the GPUSE gem5 support. In this version, I don’t believe we ever officially got AlexNet or VGG working. For fwd_conv there was a prior message on this mailing list about some of the issues with it, but I’m having a hard ti

[gem5-users] Re: RENAME: HELP Needed for Running Benchmarks in GPU Full System Simulation

2024-01-02 Thread Matt Sinclair via gem5-users
Just to add to this: to the best of my knowledge online compilation in OpenCL is not supported in gem5 outside of KVM (which does that compilation on the real CPU). I don't think it just increases simulation time -- I think it just throws an error. Matt On Tue, Jan 2, 2024 at 1:37 PM Poremba, Ma

[gem5-users] Re: Fail to run gpu-fs

2023-12-19 Thread Matt Sinclair via gem5-users
Hi Sandy, Can you please give us a bit more information about what you were running? It looks like you were just trying to run square from the README? Normally that works out of the box, so I'm wondering if you made any changes to your local setup. (I am not the primary developer for GPUFS, but

[gem5-users] Re: Error in an application running on gem5 GCN3 (with apu_se.py)

2023-10-19 Thread Matt Sinclair via gem5-users
Hi Anoop, 1. gfx902 warning: this is "intentionally" there on the ROCm compiler folks side. Essentially, they are trying to warn you that APUs are not 100% optimized for in ROCm. In particular, I believe libraries like MIOpen do not have APU support. But as long as your code does not use libra

[gem5-users] Re: Error in an application running on gem5 GCN3 (with apu_se.py)

2023-09-11 Thread Matt Sinclair via gem5-users
Yeah, I haven't tried CHAI but I believe gfx902 would work with it (if you need APUs). Matt S. On Mon, Sep 11, 2023 at 12:56 PM Poremba, Matthew wrote: > [Public] > > Hi Anoop, > > > > > > That instruction was recently added to gem5, but for Vega ISA only: > https://gem5-review.googlesource.com

[gem5-users] Re: Gem5 GCN3_X86

2023-08-23 Thread Matt Sinclair via gem5-users
Hi Kazi, Trying to answer your questions: 1. I am not aware of -d not working -- as of yesterday my students and I were able to use it (with head of develop, or something close to it). How are you attempting to use it on the command line? 2. I am not sure about the -mem-type flag (maybe Matt

[gem5-users] Re: Error in an application running on gem5 GCN3 (with apu_se.py)

2023-08-17 Thread Matt Sinclair via gem5-users
Hi Anoop, I'm glad that increasing -n helped. It's hard to say what exactly the problem is without digging in further, but often the ROCm stack will launch additional processes to do a variety of things (e.g., check which version of LLVM is being used). In gem5, each of these require a separate

[gem5-users] Re: Error in an application running on gem5 GCN3 (with apu_se.py)

2023-08-16 Thread Matt Sinclair via gem5-users
Hi Anoop, A few things here: - Regarding the original failure (at least the !FS part), this is normally happening either because of the GPU Target ISA (e.g., gfx900) you used in your Makefile (e.g., it is not supported) or because you didn't properly specify what GPU ISA you are using when runnin

[gem5-users] Re: gem5 VEGA_X86 simulation with GPU support

2023-07-23 Thread Matt Sinclair via gem5-users
Hi Lin, I don't see anything obviously wrong with your command, but this error seems to imply that something with the setup of the GPU device is wrong. If you didn't change anything though, then probably there is something wrong with our GPUFS instructions. Matt P (CC'd) knows the GPUFS code much

[gem5-users] Re: Exception when running libtorch simulation in SE mode

2023-07-18 Thread Matt Sinclair via gem5-users
For what it's worth, one of the students working with me (Marco, CC'd) is having the same failure right now for the head of develop (plus this fix: https://github.com/gem5/gem5/pull/99), except for a tiny GPU microbenchmark that definitely is not using PyTorch or any higher level library. We are w

[gem5-users] Re: Replacing CPU model in GPU-FS

2023-07-05 Thread Matt Sinclair via gem5-users
Answers: 1. Yes, I believe so. However, I have never personally tried using the O3 model with the GPU. Matt P has, I believe, so he may have better feedback there. 2. I have not followed the chain of events all the way through here, but I *believe* that the builtin you highlighted is used at

[gem5-users] Re: Replacing CPU model in GPU-FS

2023-06-30 Thread Matt Sinclair via gem5-users
Just to follow-up on 4 and 5: 4. The synchronization should happen at the directory-level here, since this is the first level of the memory system where both the CPU and GPU are connected. However, I have not tested if the programmer sets the GLC bit (which should perform the atomic at the GPU's

[gem5-users] Re: GPU-FS simulation progress

2023-06-23 Thread Matt Sinclair via gem5-users
Maybe I'm missing something, but where in that set of prints is the error? At the end I see this: Exiting @ tick 2581705103 because m5_exit instruction encountered Which is the normal thing to see when gem5 exits. Matt On Fri, Jun 23, 2023 at 4:06 AM Anoop Mysore via gem5-users < gem5-user

[gem5-users] Re: bad ioctl error in gpu_comput_driver.cc

2023-06-20 Thread Matt Sinclair via gem5-users
Right, the error you got with HeteroSync is because the generation of GPU the Makefile compiled for (gfxXXX) was not the same as the version the simulation supported. Since you were using GCN3 you would need to compile for gfx801 (APU) or gfx803 (dGPU) depending on whether you are trying to run a
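The target/model mismatch described above can be caught before launching a simulation with a small lookup. This is a minimal sketch that only encodes the two GCN3 targets named in this message; the function name `check_gcn3_target` is hypothetical, and targets for other GPU models are deliberately left out rather than guessed.

```python
# GCN3 targets named in the message above; nothing else is assumed to be supported.
GCN3_TARGETS = {"gfx801": "APU", "gfx803": "dGPU"}


def check_gcn3_target(gfx_target):
    """Return the device kind for a GCN3 target, or raise if the target does not match GCN3."""
    try:
        return GCN3_TARGETS[gfx_target]
    except KeyError:
        supported = ", ".join(sorted(GCN3_TARGETS))
        raise ValueError(
            f"{gfx_target} is not a GCN3 target; recompile for one of: {supported}"
        ) from None


print(check_gcn3_target("gfx801"))
```

Running the same check on the target taken from a benchmark's Makefile would flag the mismatch at build time instead of as a bad ioctl error during simulation.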

[gem5-users] Re: How to change voltage and frequency of individual ComputeUnit for GPU?

2023-04-03 Thread Matt Sinclair via gem5-users
DVFS is not my area of expertise, so I'm not sure I can offer much useful feedback here. My guess is you probably meant line 429 of what I see on develop: https://gem5.googlesource.com/public/gem5/+/refs/heads/develop/configs/example/apu_se.py#429? Regardless, without more information it's har

[gem5-users] Re: How to change voltage and frequency of individual ComputeUnit for GPU?

2023-04-03 Thread Matt Sinclair via gem5-users
Hi Kazi, Srikant (CC'd) previously added some support for things like this (https://gem5-review.googlesource.com/c/public/gem5/+/61589), but in the L1/L2 caches instead of the CUs specifically. From a cursory check I don't see this support directly integrated into the CUs, but since they are C

[gem5-users] Re: gem5-gcn(VEGA) related issues

2023-03-07 Thread Matt Sinclair via gem5-users
I have personally never tried gfx906 but in theory it should work. You would have to change the config files to allow gfx906 as a valid option ( https://gem5.googlesource.com/public/gem5/+/refs/heads/develop/configs/example/apu_se.py#941) and then see what happens. Regarding the assembly error, I

[gem5-users] Re: Error when running test_bwd_bn test with gem5 GCN3 GPU

2023-03-05 Thread Matt Sinclair via gem5-users
Can you please provide more information about what the problem is? The error message you posted is lacking context. Specifically what input size were you trying to use? And did you generate the appropriate cachefiles before running, as mentioned here: http://resources.gem5.org/resources/dnn-mark?

[gem5-users] Re: Unavailability of GPU_RfO and GPU_VIPER_Region protocol in gem5 v21

2023-02-08 Thread Matt Sinclair via gem5-users
Tl;dr: while you can copy the code, I suspect it will be very painful to get these to work. This is the response I got in 2020 when asking a similar question: "I took a look at the code and I apologize for the confusion. I now realize we did not make it clear that we deprecated those protocols

[gem5-users] Re: 回复:Re: 回复:Re: Gem5 GCN3 (GPUCoalescer detected deadlock when running pagerank.)

2022-11-07 Thread Matt Sinclair via gem5-users
Thanks Matt P, I hadn’t gotten a chance to try reverting that patch. I agree reverting it and running SE mode or using FS mode is the simplest solution in the meantime. In terms of the deadlock: I think it’s just that many ticks because the threshold for deadlocks is very long/big. I wasn’t r

[gem5-users] Re: 回复:Re: Gem5 GCN3 (GPUCoalescer detected deadlock when running pagerank.)

2022-11-06 Thread Matt Sinclair via gem5-users
Thanks, this is helpful. Regarding the trace: if this is the failure on develop, then I don’t think you need to get a trace, as the failure is different here. But yes, ProtocolTrace would be the flag to use for this. Regarding PageRank, I am running just the PageRank SPMV variant from the week

[gem5-users] Re: Gem5 GCN3 (GPUCoalescer detected deadlock when running pagerank.)

2022-11-05 Thread Matt Sinclair via gem5-users
Can you please try the develop branch as well? While it is good to know it doesn’t pass on stable, if develop already solves it, that is also good to know. Matt Sent from my iPhone On Nov 5, 2022, at 10:51 PM, 1575883782 via gem5-users wrote: Thanks. I will try to use `--reg-alloc-policy=dy

[gem5-users] Re: Gem5 GCN3 (GPUCoalescer detected deadlock when running pagerank.)

2022-11-05 Thread Matt Sinclair via gem5-users
Hi, Ultimately this message is telling you there is a deadlock in the cache coherence protocol when running PageRank with the specifications you did. To fix it, you would need to get a trace (https://www.gem5.org/documentation/learning_gem5/part3/MSIdebugging/) and look through to see what th

[gem5-users] Re: Error when running test_bwd_bn test

2022-04-12 Thread Matt Sinclair via gem5-users
In general, yes, MIOpen is less optimized for APUs. I do not recall seeing this before for bwd_bn though. @Kyle Roarty: have you seen this? I'm wondering if something is missing with how we set HIP_PLATFORM in the docker? I did some quick digging and it appears to be

[gem5-users] Re: Error when running test_bwd_bn test

2022-04-11 Thread Matt Sinclair via gem5-users
Hi David, My guess is you are using gfx801 for this? If so, does the application actually error out at this point, or just proceed beyond it? If it's the latter, my guess is MIOpen is just complaining that you're running with an APU, which is less well optimized for. If it's the former, then

[gem5-users] Re: cpu and gpu in gcn3_x86 execute different test programs

2022-04-10 Thread Matt Sinclair via gem5-users
There is a failure you are hitting: /HIP/rocclr/hip_global.cpp:69: guarantee(false && "Cannot find Symbol") ___ How are you compiling your code? Matt -Original Message- From: 17861509600--- via gem5-users Sent: Sunday, April 10, 2022 9:03 P

[gem5-users] Re: cpu and gpu in gcn3_x86 execute different test programs

2022-04-10 Thread Matt Sinclair via gem5-users
Hi, I personally have never tried running CPU and GPU workloads simultaneously in gem5, so I don't have great answers here. But what exactly is happening now? What is the last output you are seeing when you run your workload? Thanks, Matt -Original Message- From: 17861509600--- via g

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-16 Thread Matt Sinclair via gem5-users
Matt P or Srikant: can you please help David with the latency question? You know the answers better than I do here. Matt From: David Fong Sent: Wednesday, March 16, 2022 5:47 PM To: Matt Sinclair ; gem5 users mailing list Cc: Kyle Roarty ; Poremba, Matthew Subject: RE: gem5 : X86 + GCN3 (gf

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-15 Thread Matt Sinclair via gem5-users
Hi David, The dynamic register allocation policy allows the GPU to schedule as many wavefronts as there is register space on a CU. By default, the original register allocator released with this GPU model ("simple") only allowed 1 wavefront per CU at a time because the publicly available depend

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-14 Thread Matt Sinclair via gem5-users
Hi David, I have not seen this mmap error before, and my initial guess was the mmap error is happening because you are trying to allocate more memory than we created when mmap'ing the inputs for the applications (we do this to speed up SE mode, because otherwise initializing arrays can take sev

[gem5-users] Re: gem5 : X86 + GCN3 (gfx8001) + test_fwd_conv

2022-03-11 Thread Matt Sinclair via gem5-users
.com>>; Kyle Roarty mailto:kroa...@wisc.edu>>; Matthew Poremba mailto:matthew.pore...@amd.com>> Subject: Re: [gem5-users] Re: gem5 : X86 + GCN3 (gfx8001) + test_fwd_conv Just to be clear: --mem-size is an input arg for the apu_se.py script. Matt Sent from my iPhone On Mar 10, 2022, at

[gem5-users] Re: gem5 : X86 + GCN3 (gfx8001) + test_fwd_conv

2022-03-11 Thread Matt Sinclair via gem5-users
ject: Re: [gem5-users] Re: gem5 : X86 + GCN3 (gfx8001) + test_fwd_conv Just to be clear: --mem-size is an input arg for the apu_se.py script. Matt Sent from my iPhone On Mar 10, 2022, at 7:44 PM, Matt Sinclair via gem5-users mailto:gem5-users@gem5.org>> wrote: I am on my phone and thus

[gem5-users] Re: gem5 : X86 + GCN3 (gfx8001) + test_fwd_conv

2022-03-10 Thread Matt Sinclair via gem5-users
Just to be clear: --mem-size is an input arg for the apu_se.py script. Matt Sent from my iPhone On Mar 10, 2022, at 7:44 PM, Matt Sinclair via gem5-users wrote: I am on my phone and thus cannot easily look at the line that failed at the moment, but my first step would be to increase the

[gem5-users] Re: gem5 : X86 + GCN3 (gfx8001) + test_fwd_conv

2022-03-10 Thread Matt Sinclair via gem5-users
I am on my phone and thus cannot easily look at the line that failed at the moment, but my first step would be to increase the size of the memory gem5 is assuming: try --mem-size=8GB or 16GB and let us know if that solves the problem. Matt Sent from my iPhone On Mar 10, 2022, at 5:12 PM, David
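The suggestion above amounts to passing a larger memory size to the apu_se.py config script. Below is a minimal sketch of assembling such a command line in Python. The gem5 binary path is a hypothetical placeholder, the helper name `apu_se_command` is made up for illustration, and any workload arguments must be supplied by the caller because they are not given in this message.

```python
import shlex


def apu_se_command(gem5_binary, mem_size="8GB", extra_args=()):
    """Build an argument list that runs apu_se.py with an explicit simulated memory size."""
    return [
        gem5_binary,                  # hypothetical path to the gem5 binary you built
        "configs/example/apu_se.py",  # config script named in the thread
        f"--mem-size={mem_size}",     # simulated memory size, e.g. 8GB or 16GB
        *extra_args,                  # workload-specific arguments, supplied by the caller
    ]


# Assumed binary location, for illustration only.
cmd = apu_se_command("build/GCN3_X86/gem5.opt", mem_size="16GB")
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to hand to `subprocess.run` later and to rerun the same workload with 8GB and 16GB to see whether the failure goes away.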

[gem5-users] Re: gem5 : X86 + APU (gfx801) with CUs128 error with DNNMark test_fwd_softmax

2022-03-09 Thread Matt Sinclair via gem5-users
Thanks Kyle! Should we add a patch to address this then? Matt From: Kyle Roarty Sent: Wednesday, March 9, 2022 5:06 PM To: David Fong ; Matt Sinclair ; gem5 users mailing list ; Poremba, Matthew Subject: Re: gem5 : X86 + APU (gfx801) with CUs128 error with DNN

[gem5-users] Re: gem5 : X86 + APU (gfx801) with CUs128 error with DNNMark test_fwd_softmax

2022-03-09 Thread Matt Sinclair via gem5-users
@Kyle Roarty: I believe the only way to check that the number was substituted in is to watch the terminal when it's run, is that right? I am not aware of 128 CUs not being supported, but I also haven't tried that many before either. Matt From: David Fong Sent: Wednesd

[gem5-users] Re: gem5 : X86 + APU (gfx801) with CUs128 error with DNNMark test_fwd_softmax

2022-03-09 Thread Matt Sinclair via gem5-users
That error in #2 means MIOpen can't find the kernel again. Did you change the number of CUs to 128 (or whatever number of CUs you are using) when you generated the cachefiles? Matt From: David Fong via gem5-users Sent: Wednesday, March 9, 2022 12:50 PM To: Poremba, Matthew ; gem5 users mailin

[gem5-users] Re: gem5 : x86 + VEGA DGPU (gfx900) with test_fwd_conv error

2022-03-07 Thread Matt Sinclair via gem5-users
Kyle can you please take a look at this? Seems fwd_conv is broken with Vega from my reading of the output (which I was not aware of). But since we aren't testing Vega yet, it's perhaps not surprising something broke. David, in the meantime (if possible for your work) I would encourage you to u

[gem5-users] Re: gem5 + APU latency numbers

2022-03-07 Thread Matt Sinclair via gem5-users
I think Srikant's other reply addressed this? Matt From: David Fong Sent: Monday, March 7, 2022 11:12 AM To: Poremba, Matthew ; David Fong via gem5-users ; Bharadwaj, Srikant Cc: Bobby Bruce ; Matt Sinclair Subject: gem5 + APU latency numbers Hi Matt P., I

[gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok, but others are not)

2022-02-12 Thread Matt Sinclair via gem5-users
iner plugin of VsCode. > > ---Original--- > *From:* "Matt Sinclair via gem5-users" > *Date:* Sat, Feb 12, 2022 01:41 AM > *To:* "gem5 users mailing list"; > *Cc:* "1575883782"<1575883...@qq.com>;"Kyle Roarty";"Matt > Sinclair"

[gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok, but others are not)

2022-02-11 Thread Matt Sinclair via gem5-users
One more question for you, original poster: are you running DNNMark inside the docker resources we provided: http://resources.gem5.org/resources/dnn-mark? Or are you trying to get this running on your machine directly? Matt On Fri, Feb 11, 2022 at 11:37 AM Matt Sinclair wrote: > Kyle, can you

[gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok, but others are not)

2022-02-11 Thread Matt Sinclair via gem5-users
Kyle, can you please help with this? I don't recall when we last tested bwd_act. Matt On Fri, Feb 11, 2022 at 2:18 AM 1575883782 via gem5-users < gem5-users@gem5.org> wrote: > Hi, > > I was trying to run DNNMark benchmark with its GCN3 GPU model following the > instructions > on http://resourc

[gem5-users] Re: FW: Gem5GCN3

2021-12-30 Thread Matt Sinclair via gem5-users
(Resending since message to mailing list bounced) Hi Atiye, When you have questions about gem5, please email the mailing list, instead of emailing anyone (e.g., me) directly. I am not always available to reply, nor do I know everything about gem5 – emailing the mailing list makes it more likel

[gem5-users] Re: Unrecognized register class when using the "Exec" debug flag

2021-12-01 Thread Matt Sinclair via gem5-users
Thanks Gabe. Good catch about the actual value -- I just saw a negative number and assumed -1, whoops. Based on what Nirmit is seeing, it seems like HINT_NOP or MOV_R_I must be the instruction causing the fault, but yeah a backtrace will probably help confirm. Nirmit, can you please try running

[gem5-users] Re: Unrecognized register class when using the "Exec" debug flag

2021-12-01 Thread Matt Sinclair via gem5-users
Hi Gabe, I was trying to dig through the RegClass code earlier to figure out why the value is -1 for this instruction, and the only thing that I can think of is HINT_NOP needs a RegClass value set for it, but it isn't set for some reason (which is not 100% clear to me). You know this code much be

[gem5-users] Re: Duplicate MessageBuffer creation in GPU_VIPER.py

2021-11-03 Thread Matt Sinclair via gem5-users
This certainly seems like a bug, but my guess is it's benign and will essentially overwrite the existing one. Brad/Matt P (CC'd) may know better though. Matt On Wed, Nov 3, 2021 at 4:06 AM Sampad Mohapatra via gem5-users < gem5-users@gem5.org> wrote: > Hi All, > > The dir_cntrl.requestToMemory

[gem5-users] Re: MOESI_AMD_Base-CorePair.sm and MOESI_AMD_Base-dir.sm Correctness Check

2021-10-23 Thread Matt Sinclair via gem5-users
Yes, I understood this is what you meant. The point I was trying to make is I have not examined a trace to see what is actually happening (have you gotten a trace to examine what's happening with the DataBlk value for this request?). After digging in a little further, it appears that this line: h

[gem5-users] Re: MOESI_AMD_Base-CorePair.sm and MOESI_AMD_Base-dir.sm Correctness Check

2021-10-23 Thread Matt Sinclair via gem5-users
(Resending to mailing list) Hi Sampad, There are lines directly below the one I pointed to that do potentially overwrite the data there. But I am not 100% sure -- Brad and Matt P, CC'd may know better or see something I'm missing. Matt On Sat, Oct 23, 2021 at 1:37 PM Sampad Mohapatra wrote:

[gem5-users] Re: MOESI_AMD_Base-CorePair.sm and MOESI_AMD_Base-dir.sm Correctness Check

2021-10-23 Thread Matt Sinclair via gem5-users
I am not sure I understand completely what you're getting at, but it appears the allocation of the TBE entry does store the data: https://gem5.googlesource.com/public/gem5/+/refs/heads/develop/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm#878 Matt On Thu, Oct 21, 2021 at 11:08 PM Sampad Mohapatra v

[gem5-users] Re: Access to gem5 101 course

2021-10-14 Thread Matt Sinclair via gem5-users
Hi all, I believe Jason messaged some of you individually, but we are in the process of hosting the gem5 101 "assignments" on the gem5.org website now. Hopefully more news on this soon. In the meantime, you are welcome to look at my course website, but keep in mind the 2020 versions were updated

[gem5-users] Re: GCN3 - Polybench GPU - SPEC 17 - Errors

2021-10-09 Thread Matt Sinclair via gem5-users
> Should I go ahead and make the VIPER_TCC changes ? > > Also, I will definitely try to submit the benchmarks if they work out. > > Regards, > Sampad > > On Sat, Oct 9, 2021 at 12:34 PM Matt Sinclair via gem5-users < > gem5-users@gem5.org> wrote: > >> Hi

[gem5-users] Re: GCN3 - Polybench GPU - SPEC 17 - Errors

2021-10-09 Thread Matt Sinclair via gem5-users
Hi Sampad, I have not seen anyone attempt to run workloads in a way you are attempting, so I can't offer every solution, but here are a few things I noticed: - Why are you still using ROCm 1.6.x? And why did you build it from source? I strongly recommend using the built-in docker support (which

[gem5-users] Re: gem5 GCN GPU docker error

2021-09-23 Thread Matt Sinclair via gem5-users
Patch to further update GCN3 webpage posted: https://gem5-review.googlesource.com/c/public/gem5-website/+/50907 Matt On Wed, Sep 22, 2021 at 2:10 PM Matt Sinclair wrote: > Thanks Kyle! I agree we should probably just update the documentation > Imad found to point to the gem5-resources document

[gem5-users] Re: gem5 GCN GPU docker error

2021-09-22 Thread Matt Sinclair via gem5-users
Thanks Kyle! I agree we should probably just update the documentation Imad found to point to the gem5-resources documentation -- since that was what we updated already. This is part of my plan for later -- that way the documentation doesn't go away, but we also don't need to update two different

[gem5-users] Re: gem5 GCN GPU docker error

2021-09-22 Thread Matt Sinclair via gem5-users
Collating responses to emails since you all type faster than me - Imad: glad to hear things work with the updates Matt P proposed! - documentation: Matt P, yes we did update the documentation here: https://resources.gem5.org/ (e.g., https://resources.gem5.org/resources/square), but apparently did

[gem5-users] Re: gem5 GCN GPU docker error

2021-09-22 Thread Matt Sinclair via gem5-users
(Resending since bounced the first time) Hi Imad, I just built the docker earlier this week and did not have any problems (e.g., I ran square and it completed in < 2 hours). How are you trying to build it? And how are you running the applications you mentioned? Thanks, Matt On Wed, Sep 22, 20

[gem5-users] Re: Some problems about GCN3_X86

2021-09-13 Thread Matt Sinclair via gem5-users
(Resending since bounced) Matt On Mon, Sep 13, 2021 at 1:22 PM Matt Sinclair wrote: > Rodinia is currently not part of the publicly available gem5-resources: > http://resources.gem5.org/. You are welcome to add support for them > though. It would be fairly straightforward to add them -- you w

[gem5-users] Re: Some problems about GCN3_X86

2021-09-12 Thread Matt Sinclair via gem5-users
If you don’t use the docker, then you will need to install the ROCm stack, yes. The gcn3_x86 build includes both CPUs and a GPU as is. So that is not a problem. And I believe it has private L1s and a shared L2. It does not by default partition the way you requested, but you are welcome to add t

[gem5-users] Re: Some problems about GCN3_X86

2021-09-12 Thread Matt Sinclair via gem5-users
The branch you mentioned is at least 2 years out of date. I recommend you use the current stable branch instead: https://gem5.googlesource.com/public/gem5/+/refs/heads/stable, which has all of the GPU support from the master-gcn3-staging branch (and many more) integrated into it. Moreover, to use

[gem5-users] Re: Some problems about GCN3_X86

2021-09-12 Thread Matt Sinclair via gem5-users
Hi, Can you please supply some additional information. For example, how are you trying to compile the GCN3 GPU version? And what branch/commit are you using? And are you using the docker we released that installs the GPU driver stack correctly, or are you trying to build without the docker? The

[gem5-users] Re: gem5 GCN3 GPU model docker build issue

2021-03-11 Thread Matt Sinclair via gem5-users
Follow-up: Commit that updates documentation here: https://gem5-review.googlesource.com/c/public/gem5-website/+/42803 Matt On Thu, Mar 11, 2021 at 11:01 AM Matt Sinclair wrote: > Thanks for pointing this out, we will update the documentation to be more > explicit. > > Matt > > On Wed, Mar 10,

[gem5-users] Re: gem5 GCN3 GPU model docker build issue

2021-03-11 Thread Matt Sinclair via gem5-users
Thanks for pointing this out, we will update the documentation to be more explicit. Matt On Wed, Mar 10, 2021 at 11:06 PM xpf via gem5-users wrote: > Hi, > > I didn't see the instructions say to use stable branch. I follow the > instructions on > http://www.gem5.org/documentation/general_docs/g

[gem5-users] Re: gem5 GCN3 GPU model docker build issue

2021-03-09 Thread Matt Sinclair via gem5-users
Right, like Matt said you should be using develop, not stable, for now. Did you see the instructions say to use stable somewhere? If so we can update that. Matt On Tue, Mar 9, 2021 at 9:42 AM Poremba, Matthew via gem5-users < gem5-users@gem5.org> wrote: > [AMD Public Use] > > Hi, > > > Develo

[gem5-users] Re: gem5 GCN3 GPU model docker build issue

2021-03-08 Thread Matt Sinclair via gem5-users
Hi, Can you tell us a bit more about your environment? For example, what branch are you using (stable or develop)? And what OS are you using? Thanks, Matt On Sun, Mar 7, 2021 at 11:11 PM xpf via gem5-users wrote: > Hi all, > > I follow the instructions on > http://www.gem5.org/documentation/

[gem5-users] Re: run caffe in Gem5

2021-01-21 Thread Matt Sinclair via gem5-users
Hi Javad, tl;dr: I don't believe anyone has publicly announced Caffe running end-to-end in gem5, but I have gotten hipCaffe to run for parts of applications in the past. A couple years ago I started working on getting hipCaffe (the HIP version of Caffe -- HIP is the current GPU programming langua

[gem5-users] Re: Magic instructions with GCN3 Model/hipcc return 0

2020-11-09 Thread Matt Sinclair via gem5-users
n value. There should be no reason to >> change the code which is calling the pseudo instruction to explicitly set >> RAX, especially if you're using the address based calling mechanism which >> doesn't go through that path at all. >> >> Gabe >> >>

[gem5-users] Re: Magic instructions with GCN3 Model/hipcc return 0

2020-11-09 Thread Matt Sinclair via gem5-users
Hi Dan, My comment was just a general comment on the m5ops -- I thought you were using the "old" format for building m5ops and that might have been the problem. Sounds like it wasn't. I think pushing a fix to develop and tagging Gabe and Jason as reviewers is probably the right strategy. Thanks

[gem5-users] Re: Magic instructions with GCN3 Model/hipcc return 0

2020-11-09 Thread Matt Sinclair via gem5-users
Hi Dan, In recent weeks, Gabe (if I recall correctly) updated how the m5ops are created. I had created a homework assignment for my course about it: https://pages.cs.wisc.edu/~sinclair/courses/cs752/fall2020/handouts/hw3.html (see #2), but this is now already out of date as the location of some f

[gem5-users] Re: gem5 GCN3 GPU model running issues

2020-11-06 Thread Matt Sinclair via gem5-users
Thanks Kyle! Another good reason for us to get the GCN3 tests up and running as part of kokoro soon :) Matt On Fri, Nov 6, 2020 at 6:35 PM Kyle Roarty wrote: > Hi all, > > Found the root cause. > https://gem5-review.googlesource.com/c/public/gem5/+/34160 moved all the > syscall tables to their

[gem5-users] Re: gem5 GCN3 GPU model running issues

2020-11-06 Thread Matt Sinclair via gem5-users
Ok, we’re using the same, but haven’t gotten the second error ... strange. Are you using different apps? Matt On Fri, Nov 6, 2020 at 4:45 PM Daniel Gerzhoy wrote: > I'm using the gcn3 docker, so Ubuntu 16.04 I believe > > On Fri, Nov 6, 2020 at 5:44 PM Matt Sinclair > wrote: > >> Hi Daniel & Y

[gem5-users] Re: gem5 GCN3 GPU model running issues

2020-11-06 Thread Matt Sinclair via gem5-users
Hi Daniel & Yichen, What OS are you using? We have not encountered either of these problems thus far ... something must be different about your setup and ours. Thanks, Matt On Fri, Nov 6, 2020 at 4:35 PM Daniel Gerzhoy via gem5-users < gem5-users@gem5.org> wrote: > For some reason that syscall

[gem5-users] Re: track the write syscall in the kernel

2020-10-24 Thread Matt Sinclair via gem5-users
Assuming you are asking about SE mode, I think this is what you are looking for: https://gem5.googlesource.com/public/gem5/+/refs/heads/develop/src/sim/syscall_emul.hh#2412 ? Matt On Sat, Oct 24, 2020 at 4:00 PM ABD ALRHMAN ABO ALKHEEL via gem5-users < gem5-users@gem5.org> wrote: > Hi All; > > I

[gem5-users] Re: Out of Memory while running GPU Benchmark

2020-09-14 Thread Matt Sinclair via gem5-users
I believe this error is happening because your simulated memory space (i.e., in the simulator) is not big enough for the application you are running. You mentioned that you were passing in 2GB. My guess is that you want mem_size here: https://gem5.googlesource.com/amd/gem5/+/refs/heads/agutierr/m

[gem5-users] Re: AMD GCN3 - X86KvmCPU usage - Segfault encountered

2020-09-07 Thread Matt Sinclair via gem5-users
Matt P (CC'd) will likely know better than me, but I don't believe KVM/fast-forwarding works with GCN3 yet. Matt On Mon, Sep 7, 2020 at 9:36 AM Sampad Mohapatra via gem5-users < gem5-users@gem5.org> wrote: > Hi All, > > I am using the staging branch GCN3. While using the KvmCPU to fast forward >

[gem5-users] Re: GCN3 docker file missing

2020-09-01 Thread Matt Sinclair via gem5-users
Hi Samaksh, The warnings you mentioned can be ignored. They highlight that the sched_yield syscall is not implemented in SE mode, which is fine because it is not needed to correctly simulate this program on the GPU. I'm not quite sure what the other issue is, though. It sounds like you are sa

[gem5-users] Re: GCN3 docker file missing

2020-09-01 Thread Matt Sinclair via gem5-users
This appears to be the same error Dan mentioned previously, where you are using gcc instead of hipcc. Did you try applying the fix he suggested there? Having said that, the first few lines appear to be making square already, so I'm not sure why you are trying to make it again? Matt On Tue, Sep

[gem5-users] Re: GCN3 docker file missing

2020-08-31 Thread Matt Sinclair via gem5-users
Hi Samaksh, Is this stuff you tried before or after the message Bobby sent? Thanks, Matt On Mon, Aug 31, 2020 at 2:09 PM Samaksh Sethi via gem5-users < gem5-users@gem5.org> wrote: > Ok so, doing this, on the first run, square.o is not created properly, but > that error goes away on the 2nd run,

[gem5-users] Re: GCN3 docker file missing

2020-08-31 Thread Matt Sinclair via gem5-users
Kyle, can you please fix this (the Makefile) for square? Or update the instructions in the way Dan described above? Matt On Mon, Aug 31, 2020 at 11:51 AM Daniel Gerzhoy via gem5-users < gem5-users@gem5.org> wrote: > Looks like that command needs to be updated, or the makefile. > > Try the comma

[gem5-users] Re: AMD GCN3 - Virtual network type correctness in MOESI_AMD_Base-dir.sm

2020-08-30 Thread Matt Sinclair via gem5-users
Hi Sampad, If possible, can you please submit a patch for this? That way Srikant and the others who are experts with Garnet can review and validate. Thanks, Matt On Sun, Aug 30, 2020 at 10:37 PM Sampad Mohapatra via gem5-users < gem5-users@gem5.org> wrote: > Hi Srikant, > > It is used to send

[gem5-users] Re: GCN3 docker file missing

2020-08-30 Thread Matt Sinclair via gem5-users
Dan or Kyle can confirm, but yes I believe that is what others are doing. If you look through the posted text from running square, you have a fatal error because of it not being able to access gem5-resources. Kyle, have you seen this before? Matt On Sun, Aug 30, 2020 at 5:43 PM Samaksh Sethi via ge

[gem5-users] Re: GCN3 docker file missing

2020-08-30 Thread Matt Sinclair via gem5-users
Ok, we can try to make it clear that you should be looking at the develop branch. I thought the docker was pointing to the GCN3 staging branch, despite being on the develop branch, but that error is likely what this pending patch is fixing: https://gem5-review.googlesource.com/c/public/gem5/+/3365

[gem5-users] Re: GCN3 docker file missing

2020-08-30 Thread Matt Sinclair via gem5-users
Can you please provide us with some additional information about how you are attempting to run it? For example, what branch are you using? Looks like there is an extra 'o' in that link, thanks -- Kyle can you please fix this? To the best of our knowledge, the Docker is working, so I suspect ther

[gem5-users] Re: GCN3 - SLICC - GPU_VIPER-TCC.sm and GPU_TCP-TCP.sm Correctness

2020-08-25 Thread Matt Sinclair via gem5-users
Hi Sampad, I believe this relates to the fact that the coherence protocol is not actually sending out the data, but instead gem5 uses the backing store to functionally read/write data. Essentially, in a real system, yes we would need to send the data, which is why the message size accounts for th

[gem5-users] Re: GCN3/hip constant memory

2020-08-18 Thread Matt Sinclair via gem5-users
ually what I already have, and I wanted to keep the "constant" > data from getting evicted by other global memory data. > > How does the SQC work in terms of data rather than instructions? Could I > have data go in the SQC? > > On that note, where does "Shared" memor

[gem5-users] Re: Missing L1 and L2 Hit stats/actions in MOESI AMD Base - CorePair.sm

2020-08-18 Thread Matt Sinclair via gem5-users
_BASE-CorePair.sm if you >>>> haven't already done so. I can create a patch for you (or I'd be happy to >>>> review if you end up submitting one). >>>> >>>> I was confused about the L3Cache in the <...>-dir.sm file as well. >>

[gem5-users] Re: GCN3/hip constant memory

2020-08-18 Thread Matt Sinclair via gem5-users
Hi Dan, Tony will have to confirm, but I believe AMD didn’t add support for constant memory because none of the applications they looked at used it. The mincore error is kind of a catch-all, saying that something bad happened and you went down a failure path. Assuming the above is correct, if you

[gem5-users] Re: Missing L1 and L2 Hit stats/actions in MOESI AMD Base - CorePair.sm

2020-08-17 Thread Matt Sinclair via gem5-users
hich > is a part of the Viper protocol. > > If the L3Cache_Controller isn't used, then why is it a part of the Viper > protocol ? > Does the L3 Cache not maintain any coherency ? > Is this the intended behaviour of the default configuration ? > > Thanks and Regards, > Sampad > &

[gem5-users] Re: Missing L1 and L2 Hit stats/actions in MOESI AMD Base - CorePair.sm

2020-08-13 Thread Matt Sinclair via gem5-users
Hi Sampad, I'm not aware of a patch for this. There was recently a patch to add similar support for the VIPER protocol: https://gem5-review.googlesource.com/c/public/gem5/+/30174. If the AMD folks (CC'd) don't have a patch, then the next best thing would be to do something similar to the VIPER p

[gem5-users] Re: AMD GCN3 - Can't use single CPU - fatal no spare thread context

2020-08-07 Thread Matt Sinclair via gem5-users
