I think this misses the point. Of course Java, C, and Clojure will all have roughly the same wall-clock time for this program, since it is dominated by the I/O. You can even see that in the output from $ time java Iterate: less than 0.5s was spent in user space, and the rest (about 26.6s of sys time) was spent in system code, that is, mostly doing I/O.
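To make the split visible from inside the program rather than from time(1), here is a minimal sketch that times the read and the count separately. It is my own illustration, not code from the thread: the class name SplitTiming, the readFully helper, and the in-memory fallback are all assumptions, and only countnl mirrors the loop in Iterate.java.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Random;

// Hypothetical class name; not one of the programs posted in this thread.
class SplitTiming {
    // Same byte-by-byte loop as countnl in Iterate.java, limited to the bytes actually read.
    static int countnl(byte[] buf, int len) {
        int count = 0;
        for (int i = 0; i < len; i++) {
            if (buf[i] == '\n') {
                count++;
            }
        }
        return count;
    }

    // Keeps reading until buf is full or the stream ends; returns the number of bytes read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break;
            }
            off += n;
        }
        return off;
    }

    // Returns {newline count, nanoseconds spent reading, nanoseconds spent counting}.
    static long[] timeOnce(InputStream in, byte[] buf) throws IOException {
        long t0 = System.nanoTime();
        int sz = readFully(in, buf);
        long t1 = System.nanoTime();
        int count = countnl(buf, sz);
        long t2 = System.nanoTime();
        return new long[] { count, t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) throws IOException {
        byte[] buf = new byte[16 * 1024 * 1024];
        // Assumption: with no argument, use in-memory pseudo-random bytes so the sketch runs anywhere.
        byte[] fallback = new byte[buf.length];
        new Random(42).nextBytes(fallback);
        try (InputStream file = args.length > 0 ? new FileInputStream(args[0]) : null) {
            for (int i = 0; i < 10; i++) {
                InputStream in = (file != null) ? file : new ByteArrayInputStream(fallback);
                long[] r = timeOnce(in, buf);
                System.out.printf("%d nls: read %.1f ms, count %.1f ms%n",
                                  r[0], r[1] / 1e6, r[2] / 1e6);
            }
        }
    }
}
```

Running it as java SplitTiming /dev/urandom should reproduce the setup from the thread. Comparing the read column with the count column is the point of the exercise; I have not included numbers because they will depend on the machine and kernel.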
The Java version is about a second faster as counted by the wall clock, and this is unlikely to be a coincidence: tsuraan's timing data suggests that the Clojure program takes 80ms longer in each loop, and it loops 10 times. That comes out to 0.8 seconds, which is quite close to the differential you observed when timing from the command line.

On Aug 30, 1:38 pm, Robert McIntyre <[email protected]> wrote:
> I don't know what the heck is going here, but ignore the time the
> program is reporting and just
> pay attention to how long it actually takes wall-clock style and
> you'll see that your clojure and
> java programs already take the same time.
>
> Here are my findings:
>
> I saved Iterate.java into my rlm package and ran:
> time java -server rlm.Iterate
>
> results:
> time java -server rlm.Iterate
> Wanted 16777216 got 16777216 bytes
> counted 65341 nls in 27 msec
> Wanted 16777216 got 16777216 bytes
> counted 65310 nls in 27 msec
> Wanted 16777216 got 16777216 bytes
> counted 66026 nls in 21 msec
> Wanted 16777216 got 16777216 bytes
> counted 65473 nls in 19 msec
> Wanted 16777216 got 16777216 bytes
> counted 65679 nls in 19 msec
> Wanted 16777216 got 16777216 bytes
> counted 65739 nls in 19 msec
> Wanted 16777216 got 16777216 bytes
> counted 65310 nls in 21 msec
> Wanted 16777216 got 16777216 bytes
> counted 65810 nls in 18 msec
> Wanted 16777216 got 16777216 bytes
> counted 65531 nls in 21 msec
> Wanted 16777216 got 16777216 bytes
> counted 65418 nls in 21 msec
>
> real 0m27.469s
> user 0m0.472s
> sys 0m26.638s
>
> I wrapped the last bunch of commands in your clojure script into a
> (run) function:
> (defn run []
>   (let [ifs (FileInputStream. "/dev/urandom")
>         buf (make-array Byte/TYPE *numbytes*)]
>     (dotimes [_ 10]
>       (let [sz (.read ifs buf)]
>         (println "Wanted" *numbytes* "got" sz "bytes")
>         (let [count (time (countnl buf))]
>           (println "Got" count "nls"))))))
>
> and ran
> (time (run)) at the repl:
>
> (time (run))
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.081975 msecs"
> Got 65894 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.001814 msecs"
> Got 65949 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.061934 msecs"
> Got 65603 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.031131 msecs"
> Got 65563 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.122567 msecs"
> Got 65696 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 182.968066 msecs"
> Got 65546 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.058508 msecs"
> Got 65468 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 182.932395 msecs"
> Got 65872 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.074646 msecs"
> Got 65498 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 187.733636 msecs"
> Got 65434 nls
> "Elapsed time: 28510.331507 msecs"
> nil
>
> Total running time for both programs is around 28 seconds.
> The java program seems to be incorrectly reporting it's time.
>
> --Robert McIntyre
>
> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan <[email protected]> wrote:
> > Just to try to see if clojure is a practical language for doing
> > byte-level work (parsing files, network streams, etc), I wrote a
> > trivial function to iterate through a buffer of bytes and count all
> > the newlines that it sees. For my testing, I've written a C version,
> > a Java version, and a Clojure version. I'm running each routine 10
> > times over a 16MB buffer read from /dev/urandom (the buffer is
> > refreshed between each call to the newline counting function). With
> > gcc -O0, I get about 80ms per 16MB buffer.
> > With gcc -O3, I get ~14ms
> > per buffer. With javac (and java -server) I get 20ms per 16MB buffer.
> > With clojure, I get 105ms per buffer (after the jvm warms up). I'm
> > guessing that the huge boost that java and gcc -O3 get is from
> > converting per-byte operations to per-int ops; at least that ~4x boost
> > looks like it would come from something like that. Is that an
> > optimization that is unavailable to clojure? The java_interop doc
> > makes it sound like java and clojure get the exact same bytecode when
> > using areduce correctly, so maybe there's something I could be doing
> > better. Here are my small programs; if somebody could suggest
> > improvements, I'd appreciate them.
> >
> > iterate.clj:
> >
> > (set! *warn-on-reflection* true)
> > (import java.io.FileInputStream)
> >
> > (def *numbytes* (* 16 1024 1024))
> >
> > (defn countnl
> >   [#^bytes buf]
> >   (let [nl (byte 10)]
> >     (areduce buf idx count 0
> >              (if (= (aget buf idx) nl)
> >                (inc count)
> >                count))))
> >
> > (let [ifs (FileInputStream. "/dev/urandom")
> >       buf (make-array Byte/TYPE *numbytes*)]
> >   (dotimes [_ 10]
> >     (let [sz (.read ifs buf)]
> >       (println "Wanted" *numbytes* "got" sz "bytes")
> >       (let [count (time (countnl buf))]
> >         (println "Got" count "nls")))))
> >
> > Iterate.java:
> >
> > import java.io.FileInputStream;
> >
> > class Iterate
> > {
> >   static final int NUMBYTES = 16*1024*1024;
> >
> >   static int countnl(byte[] buf)
> >   {
> >     int count = 0;
> >     for(int i = 0; i < buf.length; i++) {
> >       if(buf[i] == '\n') {
> >         count++;
> >       }
> >     }
> >     return count;
> >   }
> >
> >   public static final void main(String[] args)
> >     throws Throwable
> >   {
> >     FileInputStream input = new FileInputStream("/dev/urandom");
> >     byte[] buf = new byte[NUMBYTES];
> >     int sz;
> >     long start, end;
> >
> >     for(int i = 0; i < 10; i++) {
> >       sz = input.read(buf);
> >       System.out.println("Wanted " + NUMBYTES + " got " + sz + " bytes");
> >       start = System.currentTimeMillis();
> >       int count = countnl(buf);
> >       end = System.currentTimeMillis();
> >       System.out.println("counted " + count + " nls in " +
> >                          (end-start) + " msec");
> >     }
> >
> >     input.close();
> >   }
> > }
> >
> > iterate.c:
> >
> > #include<sys/types.h>
> > #include<sys/stat.h>
> > #include<sys/time.h>
> > #include<stdlib.h>
> > #include<unistd.h>
> > #include<stdio.h>
> > #include<fcntl.h>
> >
> > int countnl(char *buf, int sz)
> > {
> >   int i;
> >   int count = 0;
> >   for(i = 0; i < sz; i++) {
> >     if(buf[i] == '\n') {
> >       count++;
> >     }
> >   }
> >   return count;
> > }
> >
> > int main()
> > {
> >   int fd = open("/dev/urandom", O_RDONLY);
> >   const int NUMBYTES = 16*1024*1024;
> >   char *buf = (char*)malloc(NUMBYTES);
> >
> >   int sz;
> >   struct timeval start, end;
> >
> >   int i;
> >   for(i = 0; i < 10; i++) {
> >     sz = read(fd, buf, NUMBYTES);
> >     printf("Wanted %d bytes, got %d bytes\n", NUMBYTES, sz);
> >     gettimeofday(&start, 0);
> >     int count = countnl(buf, sz);
> >     gettimeofday(&end, 0);
> >     printf("counted %d nls in %f msec\n", count,
> >            (float)(end.tv_sec-start.tv_sec)*1e3 +
> >            (end.tv_usec-start.tv_usec)/1e3);
> >   }
> >
> >   free(buf);
> >   close(fd);
> >   return 0;
> > }
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to [email protected]
> > Note that posts from new members are moderated - please be patient with
> > your first post.
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> >http://groups.google.com/group/clojure?hl=en
