Hi everyone,
I'm a new Cell BE user, and my debug platform is YDL 6.2 running on a PS3.
I wrote two simple test programs that test Mark Nelson's memcpy routines under
a number of conditions; the test arrays are aligned on 128-byte boundaries.
My three files are as follows:
1. main1.C is my first test program; it allocates two 16KB arrays.
2. main2.C is my second test program; it allocates two 4KB arrays.
3. test_result.TXT contains my test results.
I have three questions:
1. How do I calculate the bandwidth (MB/s) from the tick counts?
2. Are my results and my test method correct?
3. Why is the 4KB result different between main1.C and main2.C?
PS: Please use UltraEdit to open my three files.
#include <stdlib.h>
#include <stdio.h>
#include <ppu_intrinsics.h>
extern void *memcpy(void *, const void *, size_t);
/* cpymem is not declared anywhere in this listing; it is assumed here to
   be the address of the copy routine under test. */
#define cpymem ((const char *)memcpy)
int main() {
unsigned char *s, *d;
register unsigned long long start, finish, i, j;
if(posix_memalign((void *)&s, 128, 4096)) {
printf("failed\n");
goto end;
}
for(i = 32, j = 0; i > 0; i--, j += 128) { /* flush s from cache */
__dcbf(s+j);
}
if(posix_memalign((void *)&d, 128, 4096)) {
printf("failed\n");
goto end;
}
for(i = 32, j = 0; i > 0; i--, j += 128) { /* flush d from cache */
__dcbf(d+j);
}
__dcbt(cpymem); /* prefetch memcpy */
__dcbt(cpymem+128);
__dcbt(cpymem+256);
__dcbt(cpymem+384);
__dcbt(cpymem+512);
__dcbt(cpymem+640);
__dcbt(cpymem+768);
__dcbt(cpymem+896);
__dcbt(cpymem+1024);
__dcbt(cpymem+1152);
__dcbt(cpymem+1280);
__dcbt(cpymem+1408);
__dcbt(cpymem+1536);
__dcbt(cpymem+1664);
__dcbt(cpymem+1792);
__dcbt(cpymem+1920);
__dcbt(cpymem+2048);
__dcbt(cpymem+2176);
__dcbt(cpymem+2304);
__dcbt(cpymem+2432);
__sync();
start = __mftb();
memcpy(d, s, 4096);
finish = __mftb();
printf("%llu\n", finish - start); /* elapsed timebase ticks (unsigned) */
free(s);
free(d);
end:
return 0;
}
Copy Memory-to-Memory, results are in ticks.
         16KB       4KB      2KB    1KB    512B   256B   128B   64B    32B    16B
------------------------------------------------------------------------------------
main1.C  2063~2466  669~879  60~63  33~36  22~24  19~21  17~21  17~19  15~17  15~16
main2.C  -          110~114  60~62  33~36  21~24  19~21  17~18  17~18  15~16  15~16
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev