Hi,

My mistake, DMA accesses are indeed handled coherently by the MESI_Two_Level 
protocol. I did encounter such an issue with non-coherent DMA accesses in our 
own protocol and wrongly assumed the same was happening here. I also overlooked 
the uncacheable parameter, which happens to be overridden in the upstream 
ruby_mem_test.py. Your issue looked so similar to one I'd had that I did not 
double-check these points.

So, DMA accesses should not fail in any case with the MESI_Two_Level protocol. 
Let’s start over again.

In MESI_Two_Level-dir.sm, a DMA_WRITE event triggers the 
qw_queueMemoryWBRequest_partial action, which sends a MEMORY_WB request to 
memory. The request is supposed to carry:

1. the raw data block, which is composed by the DMASequencer

   * The DMASequencer aligns the data block on a cacheline, as expected by the 
Ruby infrastructure (DMASequencer.cc:153)

2. the length of the access (always 1 byte for the memory tester)

3. the physical address of the access

   * The problem might be here: the variable *address* at 
MESI_Two_Level-dir.sm:368 is set to the physical address of the access 
**aligned to a cacheline** (MESI_Two_Level-dir.sm:207)
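
The alignment mentioned in point 3 can be sketched numerically. This is an 
illustrative Python snippet, not gem5 code; it assumes Ruby's default 64-byte 
cacheline and the names are mine:

```python
# Illustrative sketch of Ruby-style line alignment (cf. makeLineAddress),
# assuming a 64-byte cacheline.
LINE_BYTES = 64

def make_line_address(paddr: int) -> int:
    """Mask off the block-offset bits, keeping only the line base."""
    return paddr & ~(LINE_BYTES - 1)

# A 1-byte DMA write to 0x1007 keeps only the line address 0x1000,
# so the byte offset (7) is lost unless it is carried separately.
paddr = 0x1007
line = make_line_address(paddr)
assert line == 0x1000
assert paddr - line == 7  # the offset the memory controller would need
```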

The actual request sent to the memory controller is then produced by 
AbstractController::serviceMemoryQueue(). In that function, the data is 
retrieved from the block assuming the base address of the block is 
cacheline-aligned (AbstractController.cc:281). The result is that you always 
access the base address of a block when performing a DMA access with the 
MESI_Two_Level protocol.
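
To make the failure mode concrete, here is a hypothetical sketch (not the 
actual gem5 code) of how the accessed bytes are located inside the 
cacheline-sized data block. If the address forwarded by the directory is 
already line-aligned, the computed offset is always zero:

```python
# Hypothetical model of locating an access inside a 64-byte data block,
# where the offset is taken relative to the line-aligned base.
LINE_BYTES = 64

def bytes_for_access(data_block: bytes, req_addr: int, length: int) -> bytes:
    offset = req_addr & (LINE_BYTES - 1)  # offset within the line
    return data_block[offset:offset + length]

block = bytes(range(64))

# If the directory already aligned req_addr to the line, the offset is
# always 0, so every 1-byte DMA access hits byte 0 of the block:
assert bytes_for_access(block, 0x1000, 1) == b'\x00'

# With the true physical address preserved, the right byte is reached:
assert bytes_for_access(block, 0x1007, 1) == b'\x07'
```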

You can try the patch I've attached below. It applies on v22.0.0.2. It makes 
MESI_Two_Level work as long as DMA accesses target a different memory region 
than CPU accesses (percent_uncacheable = 100).
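
Incidentally, the patch also adjusts the uncacheable draw in memtest.cc: 
random_mt.random(0, 100) is inclusive on both ends, so the old test 
`random(0, 100) < percentUncacheable` draws over 101 values and still 
produces cacheable accesses about 1 time in 101 even with 
percent_uncacheable = 100. A small sketch of the two formulations (using 
Python's randint, which is likewise inclusive; names are mine):

```python
import random

def uncacheable_old(percent: int) -> bool:
    # Inclusive draw over 101 values: with percent=100, this is still
    # False whenever the draw happens to be exactly 100.
    return random.randint(0, 100) < percent

def uncacheable_new(percent: int) -> bool:
    # Inclusive draw over 100 values: True exactly percent% of the time.
    return random.randint(1, 100) <= percent

random.seed(0)
# With percent = 100, the old form still yields some cacheable accesses:
assert not all(uncacheable_old(100) for _ in range(100_000))
# The fixed form is uncacheable every single time:
assert all(uncacheable_new(100) for _ in range(100_000))
```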

Now, if you revert percent_uncacheable to 0, the problems are back, and this 
time for real: the directory model is not wired up to handle all possible 
data-sharing states together with DMA accesses. Now it's up to you to decide 
whether you want to take the red pill or the blue pill ;) I'll stick to the 
blue one and avoid diving into the rabbit hole that fixing a Ruby protocol 
often is. If it works for you and you luckily avoid the unsupported scenarios, 
then you should be good to go.

Regarding your comment

> This test was for MESI_Two_Level for which DMA should be working as it boots 
> up in Full System mode

It could very well be that every single DMA access performed by Linux happens 
to be cacheline-aligned, for various good reasons. Booting Linux is not all 
that demanding on the coherency protocol. It's more about getting a few atomic 
accesses right and providing support for all the machinery around the bare CPU 
core and memory (interrupt controller(s), timers, MMU, file-system storage 
back end, etc.). Memory-mapped register accesses and IOs are not very 
demanding in terms of coherency, and it is very likely that performing 
non-coherent, AXI-like accesses would work just fine.

ruby_mem_test.py, on the other hand, is a torture test for the coherence 
protocol that surpasses anything a real program would ever ask for.

Best,

Gabriel
diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
index b16b295f0f..442cb136df 100644
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -99,7 +99,7 @@ system = System(cpu = cpus,
 if args.num_dmas > 0:
     dmas = [ MemTest(max_loads = args.maxloads,
                      percent_functional = 0,
-                     percent_uncacheable = 0,
+                     percent_uncacheable = 100,
                      progress_interval = args.progress,
                      suppress_func_errors =
                                         not args.suppress_func_errors) \
@@ -110,7 +110,7 @@ else:
 
 dma_ports = []
 for (i, dma) in enumerate(dmas):
-    dma_ports.append(dma.test)
+    dma_ports.append(dma.port)
 Ruby.create_system(args, False, system, dma_ports = dma_ports)
 
 # Create a top-level voltage domain and clock domain
diff --git a/src/cpu/testers/memtest/memtest.cc b/src/cpu/testers/memtest/memtest.cc
index 7c256d8642..5fefc7f899 100644
--- a/src/cpu/testers/memtest/memtest.cc
+++ b/src/cpu/testers/memtest/memtest.cc
@@ -220,7 +220,7 @@ MemTest::tick()
     // create a new request
     unsigned cmd = random_mt.random(0, 100);
     uint8_t data = random_mt.random<uint8_t>();
-    bool uncacheable = random_mt.random(0, 100) < percentUncacheable;
+    bool uncacheable = random_mt.random(1, 100) <= percentUncacheable;
     unsigned base = random_mt.random(0, 1);
     Request::Flags flags;
     Addr paddr;
diff --git a/src/mem/ruby/protocol/MESI_Two_Level-dir.sm b/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
index 9d6975570c..53672b280f 100644
--- a/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
+++ b/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
@@ -234,10 +234,11 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
   in_port(memQueue_in, MemoryMsg, responseFromMemory, rank = 2) {
     if (memQueue_in.isReady(clockEdge())) {
       peek(memQueue_in, MemoryMsg) {
+        Addr lineAddr := makeLineAddress(in_msg.addr);
         if (in_msg.Type == MemoryRequestType:MEMORY_READ) {
-          trigger(Event:Memory_Data, in_msg.addr, TBEs[in_msg.addr]);
+          trigger(Event:Memory_Data, lineAddr, TBEs[lineAddr]);
         } else if (in_msg.Type == MemoryRequestType:MEMORY_WB) {
-          trigger(Event:Memory_Ack, in_msg.addr, TBEs[in_msg.addr]);
+          trigger(Event:Memory_Ack, lineAddr, TBEs[lineAddr]);
         } else {
           DPRINTF(RubySlicc, "%s\n", in_msg.Type);
           error("Invalid message");
@@ -352,7 +353,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
     peek(memQueue_in, MemoryMsg) {
       enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
         assert(is_valid(tbe));
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := CoherenceResponseType:DATA;
         out_msg.DataBlk := in_msg.DataBlk;   // we send the entire data block and rely on the dma controller to split it up if need be
         out_msg.Destination.add(tbe.Requestor);
@@ -365,7 +366,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
          desc="Queue off-chip writeback request") {
     peek(requestNetwork_in, RequestMsg) {
       enqueue(memQueue_out, MemoryMsg, to_mem_ctrl_latency) {
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := MemoryRequestType:MEMORY_WB;
         out_msg.Sender := machineID;
         out_msg.MessageSize := MessageSizeType:Writeback_Data;
@@ -378,7 +379,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
   action(da_sendDMAAck, "da", desc="Send Ack to DMA controller") {
       enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
         assert(is_valid(tbe));
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := CoherenceResponseType:ACK;
         out_msg.Destination.add(tbe.Requestor);
         out_msg.MessageSize := MessageSizeType:Writeback_Control;
@@ -410,7 +411,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
     peek(responseNetwork_in, ResponseMsg) {
       enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
         assert(is_valid(tbe));
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := CoherenceResponseType:DATA;
         out_msg.DataBlk := in_msg.DataBlk;   // we send the entire data block and rely on the dma controller to split it up if need be
         out_msg.Destination.add(tbe.Requestor);
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org