This is a slightly different result, as this time I measure elapsed time
(see the appendix, and please excuse the not-so-nice code) as opposed to
clock time. The results are similar (unless you have more processes than
cores). I am planning to release the code to GitHub soon.

+-----------+--------+--------+
| # of      |  seq   |  rdm   |
| processes |        |        |
+-----------+--------+--------+
| 1         |  1.00  |  1.00  |
+-----------+--------+--------+
| 2         |  1.96  |  1.75  |
+-----------+--------+--------+
| 4         |  3.20  |  1.83  |
+-----------+--------+--------+
| 8         |  3.78  |  1.83  |
+-----------+--------+--------+
| 16        |  3.61  |  1.81  |
+-----------+--------+--------+
| 32        |  3.56  |  1.81  |
+-----------+--------+--------+
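
(To be precise about the timing: gettimeofday() used in the appendix returns
elapsed wall-clock time, whereas clock() would return CPU time. A throwaway
sketch of the difference, not part of the test code:)

    /* Mini illustration only: gettimeofday() measures elapsed wall-clock
       time, clock() measures CPU time used by this process.  A sleep
       burns wall time but almost no CPU time. */
    #include <stdio.h>
    #include <time.h>
    #include <sys/time.h>
    #include <unistd.h>

    int main(void)
    {
        struct timeval begin, end;
        clock_t c0 = clock();
        gettimeofday(&begin, NULL);

        sleep(1);

        gettimeofday(&end, NULL);
        clock_t c1 = clock();

        double wall = (end.tv_sec - begin.tv_sec)
                    + (end.tv_usec - begin.tv_usec) / 1e6;
        double cpu  = (double)(c1 - c0) / CLOCKS_PER_SEC;

        printf("wall: %f s, cpu: %f s\n", wall, cpu);  /* ~1.0 vs ~0.0 */
        return 0;
    }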


> This is great stuff.
> Let me make sure I read it correctly.
> Having 2 processes makes a value 1.97 times higher than with 1 core in the
> random case, and 1.76 times higher in the linear case, but what is that
> value being measured?
> Some form of throughput I suppose and not time, right?
>

I think you could call that a normalized throughput. Here are more details.
The first column is the number of separate O/S test processes running
concurrently in the background, started by the shell script at virtually the
same time. I then collected the output, which simply logs how long it takes
to iterate through 40MB of memory in a sequential or random manner. The
second and third columns are
number_of_processes * (elapsed_time_of_the_single_process_run / elapsed_time)
for sequential and random access respectively. Under ideal scaling,
elapsed_time would stay constant as we use more and more cores/CPUs, so the
normalized throughput would equal the number of processes.
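
In code form, the normalization in those columns is just the following (an
illustration only; the variable names are mine and this is not taken from the
actual script):

    /* Illustration of the seq/rdm columns above.  elapsed_1proc is the
       elapsed time of the single-process run (first row).  Under ideal
       scaling elapsed stays constant, so this evaluates to nprocs. */
    double normalized_throughput(int nprocs, double elapsed, double elapsed_1proc)
    {
        return (double)nprocs * elapsed_1proc / elapsed;
    }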



> Indeed. It also means single threaded linear access isn't going to be very
> much faster if you add more threads.
> BTW, are you sure the threads were running in parallel on separate cores
> and not just concurrently on a smaller number of cores?
> As you said, this should be dependent on hardware and running this on
> actual server machine would be as interesting.
>

I wanted to see the worst case, separate processes and separate memory, which
was the simplest to implement. Yes, I am sure the cores were utilized by the
OS as the number of processes increased; I watched the MBP Activity Monitor
and CPU History, which were hitting 100%. Also, I did not optimize the output
(it should not matter).

One interesting thing I heard somewhere is that OSes bounce long-lasting,
CPU-intensive threads between cores in order to spread the heat generated in
the silicon. I did not observe that, but the longest-running test took about
15 seconds on a single core.


Thx,
Andy

Appendix 1 (excerpt from the test program; i_array, len, des, m, and tdiff
come from the surrounding code, which is not shown):

        {
            /* random access: write to 100 * len pseudo-random slots of i_array */
            int n;
            int j;
            struct timeval begin, end;
            gettimeofday(&begin, NULL);
            for (n = 0; n < 100; n++)
                for (j = 0; j < len; j++)
                    i_array[rand() % len] = j;   /* timing includes rand() overhead */
            gettimeofday(&end, NULL);
            printf("[:rdm %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));
        }
        {
            /* sequential access: write the same number of slots in order */
            int n;
            int j;
            struct timeval begin, end;
            gettimeofday(&begin, NULL);
            for (n = 0; n < 100; n++)
                for (j = 0; j < len; j++)
                    i_array[j] = j;
            gettimeofday(&end, NULL);
            printf("[:seq %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));
        }
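
The excerpt above omits the surrounding harness; a minimal sketch of what it
might look like (simplified, with placeholder values; tdiff, the allocation,
and the headers below are assumptions, not the original code):

    /* Assumed scaffolding around the Appendix 1 excerpt; the real harness
       may differ.  The meaning of `des' and `m' is not shown above. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    /* Presumably: elapsed seconds between two gettimeofday() samples. */
    static double tdiff(struct timeval *begin, struct timeval *end)
    {
        return (end->tv_sec - begin->tv_sec)
             + (end->tv_usec - begin->tv_usec) / 1e6;
    }

    int main(void)
    {
        int len = 10 * 1024 * 1024;      /* ~40MB of ints, as mentioned above */
        int *i_array = malloc((size_t)len * sizeof *i_array);
        const char *des = "test";        /* placeholder description string */
        int m = 0;                       /* placeholder; meaning unknown */

        if (i_array == NULL)
            return 1;

        /* ... the two timed blocks from Appendix 1 go here ... */

        free(i_array);
        return 0;
    }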
