This is a slightly different result, as this time I measure elapsed time (see the appendix, and excuse the not-so-nice code) as opposed to clock time. The results are similar (unless you have more processes than cores). I am planning to release the code to GitHub soon.
+----------+--------+-------+
| # of     | seq    | rdm   |
|processes |        |       |
+----------+--------+-------+
|1         | 1.00   | 1.00  |
+----------+--------+-------+
|2         | 1.96   | 1.75  |
+----------+--------+-------+
|4         | 3.20   | 1.83  |
+----------+--------+-------+
|8         | 3.78   | 1.83  |
+----------+--------+-------+
|16        | 3.61   | 1.81  |
+----------+--------+-------+
|32        | 3.56   | 1.81  |
+----------+--------+-------+

This is great stuff.

> Let me make sure I read it correctly.
> Having 2 processes makes a value 1.97 times higher than with 1 core in the
> random case, and 1.76 times higher in the linear case, but what is that
> value being measured?
> Some form of throughput I suppose and not time, right?

I think you could call that a normalized throughput. Here are more details.

The first column is the number of separate O/S test processes running
concurrently in the background, started by the shell script at virtually the
same time. I then collected the output, which simply logs how long it takes
to iterate through 40 MB of memory in a sequential or random manner. The
second and third columns are

  number_of_processes / elapsed_time * elapsed_time_of_first_row_process

for sequential and random access respectively (a small illustration of this
follows after the appendix). Under ideal conditions, elapsed_time would stay
constant as we use more and more cores/CPUs.

> Indeed. It also means single threaded linear access isn't going to be very
> much faster if you add more threads.
> BTW, are you sure the threads were running in parallel on separate cores
> and not just concurrently on a smaller number of cores?
> As you said, this should be dependent on hardware and running this on an
> actual server machine would be as interesting.

I wanted to see the worst case, separate processes and memory, which was the
simplest to implement. Yes, I am sure the cores were utilized by the OS as
processes were added; I watched the MBP Activity Monitor and CPU History,
which was hitting 100%. Also, I did not optimize the output (it should not
matter).

One interesting thing I heard somewhere is that O/Ss bounce long-lasting,
CPU-intensive threads between cores in order to equalize the heat generated
in the silicon. I did not observe that, but the longest-running test took
about 15 sec using a single core.

Thx,
Andy

Appendix 1:

{
    int n;
    int j;
    struct timeval begin, end;

    gettimeofday(&begin, NULL);
    /* random access: write to a random index of i_array on every iteration */
    for (n = 0; n < 100; n++)
        for (j = 0; j < len; j++)
            i_array[rand() % len] = j;
    gettimeofday(&end, NULL);
    printf("[:rdm %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));
}

{
    int n;
    int j;
    struct timeval begin, end;

    gettimeofday(&begin, NULL);
    /* sequential access: walk i_array in order */
    for (n = 0; n < 100; n++)
        for (j = 0; j < len; j++)
            i_array[j] = j;
    gettimeofday(&end, NULL);
    printf("[:seq %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));
}
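To make the normalization in the second and third columns concrete, here is a
small sketch of that arithmetic in C. The function and parameter names are
mine, purely for illustration, not anything from the test code:

/* Normalized throughput as described above: how many single-process runs'
   worth of work gets done per unit time, relative to the 1-process baseline.
   Names are illustrative only. */
double normalized_throughput(int n_processes,
                             double elapsed,
                             double elapsed_single)
{
    return (double)n_processes * elapsed_single / elapsed;
}

For the 2-process sequential row, for example, this is
2 * elapsed_single / elapsed, which came out to 1.96 in the table above.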
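The appendix shows only the two timed loops; tdiff(), i_array, len, des and m
come from the surrounding harness that is not included here. Below is one
minimal, self-contained way such a harness could look. It is a sketch of the
assumed context, not the author's actual code; the 10M-int array size is
chosen to match the 40 MB mentioned above.

/* Minimal, self-contained sketch of a harness around the timed loops above.
   tdiff(), i_array, len, des and m are assumptions about the missing
   context, not the original program. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* elapsed seconds between two gettimeofday() readings */
static double tdiff(struct timeval *begin, struct timeval *end)
{
    return (end->tv_sec - begin->tv_sec)
         + (end->tv_usec - begin->tv_usec) / 1e6;
}

int main(void)
{
    const char *des = "test";           /* description label for the output */
    int len = 10 * 1024 * 1024;         /* 10M ints, roughly 40 MB */
    int m = 100;                        /* pass count logged in the output (assumed) */
    int *i_array = malloc((size_t)len * sizeof *i_array);
    if (!i_array)
        return 1;

    struct timeval begin, end;
    int n, j;

    /* random access: write to a random index on every iteration */
    gettimeofday(&begin, NULL);
    for (n = 0; n < 100; n++)
        for (j = 0; j < len; j++)
            i_array[rand() % len] = j;
    gettimeofday(&end, NULL);
    printf("[:rdm %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));

    /* sequential access: walk the array in order */
    gettimeofday(&begin, NULL);
    for (n = 0; n < 100; n++)
        for (j = 0; j < len; j++)
            i_array[j] = j;
    gettimeofday(&end, NULL);
    printf("[:seq %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));

    free(i_array);
    return 0;
}

Starting several copies of a binary like this from a shell script and
comparing the logged elapsed times against the single-process run is all the
normalization in the table requires.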