Hi All,

I ran the attached microbenchmark (test1.c) with gem5 in both FS and SE
modes. I am interested only in the stats for the region of interest (ROI)
delimited by the ROI markers.


I expected FS mode to be similar to or slower than SE mode for the
microbenchmark within the ROI, but the stats indicate otherwise: FS mode is
3% faster than SE mode.


I am using gem5 v20 for SE mode, and the gem5-resources v20.0.0.3 release
<https://gem5.googlesource.com/public/gem5-resources/+/refs/tags/v20.0.0.3/>
for FS mode.

Could a difference in clock period between the memory controller and the
cache proxy port cause such a discrepancy? What other reasons are possible?


Regards,

Vipin

Ph.D. Scholar IIT Kanpur
/**
 * A micro benchmark program with parallel accesses.
 *
 */

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
// for measuring the ROI in FS mode
#ifdef ROI_TRACING
  #include "m5_library/hooks_prospar.h"
#endif
// for measuring the ROI in SE mode
#ifdef SE_MODE_BUILD
  #include "gem5/m5ops.h"
#endif

#define NUM_THREADS 4

int loop_count __attribute__((aligned(64))) = 0;
void* thread_run(void*);

// 16 entries of 4 bytes each = one 64-byte cache block per thread
uint32_t arr[NUM_THREADS * 16] __attribute__((aligned(64))) = {0};

int main(int argc, char* argv[]) {
  if (argc == 2) {
    loop_count = (1 << atoi(argv[1]));
  } else {
    loop_count = (1 << 18);
  }

  pthread_t threads[NUM_THREADS];

  printf("Starting address of array: %p \n", (void*)&arr);
  
  #ifdef ROI_TRACING
    roi_begin();
  #endif
  #ifdef SE_MODE_BUILD
    m5_reset_stats(0,0);
  #endif
  for (int i = 0; i < NUM_THREADS; i++) {
    pthread_create(&threads[i], NULL, &thread_run, (void*)(intptr_t)i);
  }

  for (int i = 0; i < NUM_THREADS; i++) {
    pthread_join(threads[i], NULL);
  }
  #ifdef ROI_TRACING
    roi_end();
  #endif
  #ifdef SE_MODE_BUILD
    m5_dump_stats(0,0);
  #endif
  // Correctness check: each thread's counter lives at arr[i * 16]
  for (int i = 0; i < NUM_THREADS; i++) {
    if (arr[i * 16] != (uint32_t)loop_count)
      printf("Diff found for index: %d actual value: %u\n", i, arr[i * 16]);
  }

  return 0;
}

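// Each thread increments only its own counter, which sits in its own
// 64-byte block, so threads never write to the same block.
void* thread_run(void* threadId) {
  int currID = (intptr_t)threadId;
  for (uint32_t i = 0; i < (uint32_t)loop_count; i++) {
    arr[currID * 16] += 1;
  }
  return NULL;
}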
