I am encountering a critical non-determinism issue while running network 
benchmarks in gem5-dpdk Full System (FS) mode using the Intel DPDK framework. 
Despite using a cycle-accurate CPU model, a stable checkpoint, and a fixed 
hardware configuration, the simulation results vary between identical runs, 
which violates the expected determinism of the simulator.

System Configuration & Environment

  *   Gem5 Version: gem5-dpdk (ISPASS’24), for DPDK
  *   CPU Model: O3_ARM_v7a_3 for the core simulation; AtomicSimpleCPU for 
      creating the checkpoint.
  *   Environment: gem5 FS mode (restored from a checkpoint). KVM is NOT used, 
      and the application contains no random logic.
  *   Application: DPDK MACSWAP (Network benchmark).

The Critical Bug: Non-Deterministic TXPackets

The received packets (RXPackets) remain perfectly deterministic, but the 
transmitted packets (TXPackets) count varies significantly. This strongly 
suggests a timing-dependent race condition that affects the completion of the 
TX process near the end of the simulation.

The table below shows the results from two consecutive runs with identical 
configurations (e.g., 128-byte packet size, 233 Gbps rate), where the TX packet 
count differs substantially.

Metric    Run 1       Run 2
------    --------    --------
RX        13924017    13924017
TX        11448637    13922306
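
To put a number on the gap, the shortfall can be computed directly from the 
counts above. This is only a minimal sketch: the helper name tx_shortfall is 
made up for illustration, and it assumes MACSWAP sends back every packet it 
receives, so that RX minus TX is the number of packets not transmitted by the 
end of the run.

```python
# Packet counts copied from the table in this post.
RUNS = {
    "Run 1": {"rx": 13924017, "tx": 11448637},
    "Run 2": {"rx": 13924017, "tx": 13922306},
}


def tx_shortfall(rx, tx):
    """Return (missing packets, missing fraction of RX).

    Assumes every received packet should eventually be transmitted,
    which is an assumption about MACSWAP, not something gem5 reports.
    """
    missing = rx - tx
    return missing, missing / rx


if __name__ == "__main__":
    for name, counts in RUNS.items():
        missing, fraction = tx_shortfall(counts["rx"], counts["tx"])
        print(f"{name}: {missing} packets not transmitted ({fraction:.2%} of RX)")
```

Under that assumption, Run 1 ends 2475380 packets short while Run 2 ends only 
1711 packets short, so the two runs differ by 2473669 transmitted packets.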


My questions are:
1. Is this non-deterministic result a genuine problem? Specifically, should 
experiments executed in Gem5 FS mode with identical configurations yield 
perfectly deterministic results?
2. Have others encountered non-deterministic results under similar conditions 
(FS mode, NIC I/O experiments) despite using a stable configuration? If so, 
what was the root cause of the problem (e.g., a specific flaw in the NIC model, 
DMA timing issue, or event queue race), and how was it resolved (e.g., a code 
patch or specific config fix)?
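
One way to narrow this down would be to diff the stats dumps of the two runs 
and see which counters diverge besides the TX count. The sketch below assumes 
the usual stats.txt layout (stat name, then value, then an optional 
"# description"). The stat names in the sample text are hypothetical 
placeholders, not the real gem5-dpdk NIC stat names, and if a file contains 
several dumps the last value for each name wins.

```python
def parse_stats(text):
    """Map stat name -> raw value string from gem5-style stats text."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        if not line or line.startswith("-"):  # skip blanks and Begin/End banners
            continue
        parts = line.split()
        if len(parts) >= 2:
            stats[parts[0]] = parts[1]  # later dumps overwrite earlier ones
    return stats


def diff_stats(text_a, text_b):
    """Return {name: (value_a, value_b)} for every stat that differs."""
    a, b = parse_stats(text_a), parse_stats(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# Sample input using the counts from this post; stat names are hypothetical.
SAMPLE_RUN_1 = """
---------- Begin Simulation Statistics ----------
system.nic.rxPackets   13924017   # hypothetical stat name
system.nic.txPackets   11448637   # hypothetical stat name
---------- End Simulation Statistics   ----------
"""
SAMPLE_RUN_2 = SAMPLE_RUN_1.replace("11448637", "13922306")

if __name__ == "__main__":
    for name, (v1, v2) in diff_stats(SAMPLE_RUN_1, SAMPLE_RUN_2).items():
        print(f"{name}: run1={v1} run2={v2}")
```

In practice the two strings would be read from each run's stats.txt. If the 
tick count or the event counts also differ, the divergence starts earlier 
than the final TX drain.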

Best regards,
Sungwook.