Hi,

My mistake: DMA accesses are indeed handled coherently by the MESI_Two_Level protocol. I did encounter such an issue with non-coherent DMA accesses in our own protocol, and I wrongly assumed the same was happening here. I also overlooked the uncacheable parameter, which happens to be overridden in the upstream ruby_mem_test.py. Your issue looked so similar to one I had that I did not double-check these points.
So, DMA accesses should not fail in any case with the MESI_Two_Level protocol. Let's start over.

In MESI_Two_Level-dir.sm, a DMA_WRITE event causes qw_queueMemoryWBRequest_partial to be executed, which sends a MEMORY_WB request to memory. The request is supposed to carry:

1. the raw data block, which is composed by the DMASequencer
   * The DMASequencer aligns the data block on a cacheline, as expected by the Ruby infrastructure (DMASequencer.cc:153)
2. the length of the access (always 1 byte for the memory tester)
3. the physical address of the access
   * The problem might be there, as the variable *address* on line MESI_Two_Level-dir.sm:368 is set to the physical address of the access **aligned to a cacheline** (MESI_Two_Level-dir.sm:207)

The actual request sent to the memory controller is then produced by AbstractController::serviceMemoryQueue(). That function retrieves the data from the block assuming the base address of the block is cacheline-aligned (AbstractController.cc:281). The result is that you always access the base address of a block when performing a DMA access with the MESI_Two_Level protocol.

You can try the patch I've attached. It applies on v22.0.0.2. It will make MESI_Two_Level work as long as DMA accesses target a different memory region than CPU accesses (percent_uncacheable = 100).

Now, if you revert percent_uncacheable to 0, the problems are back, and this time for real: the directory model is not wired up to handle all possible data-sharing states together with DMA accesses. It's up to you to decide whether you want to take the red pill or the blue pill ;) I'll stick to the blue one and avoid diving into the rabbit hole that fixing a Ruby protocol often is. If it works for you and you luckily avoid the unsupported scenarios, then you should be good to go.
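To make the truncation concrete, here is a minimal sketch in plain Python (not gem5 code) of what happens to a sub-line DMA address. It assumes a 64-byte cache line; the helper names are made up for illustration, only makeLineAddress mirrors the Ruby function of that name:

```python
LINE_SIZE = 64  # assumed cache line size (gem5's default)

def make_line_address(addr):
    # equivalent of Ruby's makeLineAddress(): mask off the offset bits
    return addr & ~(LINE_SIZE - 1)

def buggy_wb_addr(dma_paddr):
    # the directory sends `address`, which was already line-aligned
    return make_line_address(dma_paddr)

def fixed_wb_addr(dma_paddr):
    # the attached patch sends tbe.PhysicalAddress, the untruncated
    # address of the DMA access
    return dma_paddr

paddr = 0x1043  # a 1-byte DMA write, 3 bytes into a line
print(hex(buggy_wb_addr(paddr)))  # 0x1040 -> the block base gets written
print(hex(fixed_wb_addr(paddr)))  # 0x1043 -> the intended byte
```

So any DMA access that is not already line-aligned silently reads or writes the wrong bytes, which is exactly what the memory tester trips over.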
Regarding your comment:

> This test was for MESI_Two_Level for which DMA should be working as it boots
> up in Full System mode

It could very well be that every single DMA access performed by Linux happens to be cacheline-aligned, for various good reasons. Booting Linux is not all that demanding on the coherency protocol. It's more about getting a few atomic accesses right and providing support for all the machinery around the bare CPU core and memory (interrupt controller(s), timers, MMU, file-system storage back end, etc.). Memory-mapped register accesses and IOs are not very demanding in terms of coherency, and it is very likely that performing non-coherent, AXI-like accesses would work just fine. ruby_mem_test.py, on the other hand, is an actual torture test for the coherence protocol that surpasses anything a real program could ever ask for.

Best,
Gabriel
diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
index b16b295f0f..442cb136df 100644
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -99,7 +99,7 @@ system = System(cpu = cpus,
 if args.num_dmas > 0:
     dmas = [ MemTest(max_loads = args.maxloads,
                      percent_functional = 0,
-                     percent_uncacheable = 0,
+                     percent_uncacheable = 100,
                      progress_interval = args.progress,
                      suppress_func_errors =
                          not args.suppress_func_errors) \
@@ -110,7 +110,7 @@ else:
 dma_ports = []
 for (i, dma) in enumerate(dmas):
-    dma_ports.append(dma.test)
+    dma_ports.append(dma.port)
 
 Ruby.create_system(args, False, system, dma_ports = dma_ports)
 
 # Create a top-level voltage domain and clock domain
diff --git a/src/cpu/testers/memtest/memtest.cc b/src/cpu/testers/memtest/memtest.cc
index 7c256d8642..5fefc7f899 100644
--- a/src/cpu/testers/memtest/memtest.cc
+++ b/src/cpu/testers/memtest/memtest.cc
@@ -220,7 +220,7 @@ MemTest::tick()
     // create a new request
     unsigned cmd = random_mt.random(0, 100);
     uint8_t data = random_mt.random<uint8_t>();
-    bool uncacheable = random_mt.random(0, 100) < percentUncacheable;
+    bool uncacheable = random_mt.random(1, 100) <= percentUncacheable;
     unsigned base = random_mt.random(0, 1);
     Request::Flags flags;
     Addr paddr;
diff --git a/src/mem/ruby/protocol/MESI_Two_Level-dir.sm b/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
index 9d6975570c..53672b280f 100644
--- a/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
+++ b/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
@@ -234,10 +234,11 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
   in_port(memQueue_in, MemoryMsg, responseFromMemory, rank = 2) {
     if (memQueue_in.isReady(clockEdge())) {
       peek(memQueue_in, MemoryMsg) {
+        Addr lineAddr := makeLineAddress(in_msg.addr);
         if (in_msg.Type == MemoryRequestType:MEMORY_READ) {
-          trigger(Event:Memory_Data, in_msg.addr, TBEs[in_msg.addr]);
+          trigger(Event:Memory_Data, lineAddr, TBEs[lineAddr]);
         } else if (in_msg.Type == MemoryRequestType:MEMORY_WB) {
-          trigger(Event:Memory_Ack, in_msg.addr, TBEs[in_msg.addr]);
+          trigger(Event:Memory_Ack, lineAddr, TBEs[lineAddr]);
         } else {
           DPRINTF(RubySlicc, "%s\n", in_msg.Type);
           error("Invalid message");
@@ -352,7 +353,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
     peek(memQueue_in, MemoryMsg) {
       enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
         assert(is_valid(tbe));
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := CoherenceResponseType:DATA;
         out_msg.DataBlk := in_msg.DataBlk;   // we send the entire data block and rely on the dma controller to split it up if need be
         out_msg.Destination.add(tbe.Requestor);
@@ -365,7 +366,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
          desc="Queue off-chip writeback request") {
     peek(requestNetwork_in, RequestMsg) {
       enqueue(memQueue_out, MemoryMsg, to_mem_ctrl_latency) {
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := MemoryRequestType:MEMORY_WB;
         out_msg.Sender := machineID;
         out_msg.MessageSize := MessageSizeType:Writeback_Data;
@@ -378,7 +379,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
   action(da_sendDMAAck, "da", desc="Send Ack to DMA controller") {
     enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
       assert(is_valid(tbe));
-      out_msg.addr := address;
+      out_msg.addr := tbe.PhysicalAddress;
       out_msg.Type := CoherenceResponseType:ACK;
       out_msg.Destination.add(tbe.Requestor);
       out_msg.MessageSize := MessageSizeType:Writeback_Control;
@@ -410,7 +411,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
     peek(responseNetwork_in, ResponseMsg) {
       enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
         assert(is_valid(tbe));
-        out_msg.addr := address;
+        out_msg.addr := tbe.PhysicalAddress;
         out_msg.Type := CoherenceResponseType:DATA;
         out_msg.DataBlk := in_msg.DataBlk;   // we send the entire data block and rely on the dma controller to split it up if need be
         out_msg.Destination.add(tbe.Requestor);
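A note on the memtest.cc hunk of the patch: random_mt.random(min, max) in gem5 draws inclusively on both ends (it is backed by std::uniform_int_distribution), so the original `random_mt.random(0, 100) < percentUncacheable` can still produce cacheable accesses when the draw lands on exactly 100, even with percent_uncacheable = 100. A quick Python sketch, using random.randint as a stand-in for random_mt.random, shows why the bounds change matters:

```python
import random

def uncacheable_buggy(pct, rng):
    # original test: a draw of exactly 100 is never < pct,
    # so about 1 draw in 101 stays cacheable even when pct == 100
    return rng.randint(0, 100) < pct

def uncacheable_fixed(pct, rng):
    # patched test: with pct == 100, every draw in [1, 100] passes
    return rng.randint(1, 100) <= pct

N = 100_000
rng = random.Random(0)
buggy = sum(uncacheable_buggy(100, rng) for _ in range(N))
rng = random.Random(0)
fixed = sum(uncacheable_fixed(100, rng) for _ in range(N))
print(buggy < N)   # True: some accesses slipped through as cacheable
print(fixed == N)  # True: every access is uncacheable, as intended
```

Without this fix, a handful of "uncacheable" DMA testers would still issue cacheable accesses into the CPU region, defeating the percent_uncacheable = 100 separation the config change relies on.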
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org