Consider using "==" where you currently have "="; "==" can be faster when comparing numbers for equality. Source for this and a few other performance tips:
http://gnuvince.wordpress.com/2009/05/11/clojure-performance-tips/

Andy

On Mon, Aug 30, 2010 at 11:46 PM, Robert McIntyre <r...@mit.edu> wrote:

> Ah, I see that I was mistaken about the timing. Sorry about that.
>
> After a lot of fiddling around, I came up with this faster form:
>
> (defn countnl-lite
>   [#^bytes buf]
>   (areduce buf idx count (int 0)
>     (if (= (clojure.lang.RT/aget buf idx) 10)
>       (unchecked-add count 1)
>       count)))
>
> Key points are initializing count to a primitive integer and directly
> calling clojure's aget to avoid an unnecessary integer cast.
>
> On my system:
>
> The unmodified countnl function takes ~ 180 msecs
>
> Without AOT compilation countnl-lite takes around 66 msecs
>
> With AOT compilation countnl-lite takes ~46 msecs
>
> The java method takes ~19 msecs.
>
> I've lost a factor of 2.25 somewhere and it makes me sad that I can't find
> it.
> I would be very interested if anyone could improve countnl-lite.
>
> --Robert McIntyre
>
> On Mon, Aug 30, 2010 at 8:41 PM, Alan <a...@malloys.org> wrote:
> > I think this misses the point. Of course java, c, and clojure will all
> > have roughly the same wall-clock time for this program, since it is
> > dominated by the I/O. You can even see that in the output from $ time
> > java Iterate: less than 0.5s was spent in user space, the rest was
> > spent in system code - that is, mostly doing I/O.
> >
> > The java version is a second faster as counted by the wall clock, and
> > this is unlikely to be a coincidence: tsuraan's timing data suggests
> > that the clojure program takes 80ms longer in each loop, and loops 10
> > times. That comes out to 0.8 seconds, which is quite close to the
> > differential you observed when timing from the command line.
> >
> > On Aug 30, 1:38 pm, Robert McIntyre <r...@mit.edu> wrote:
> >> I don't know what the heck is going here, but ignore the time the
> >> program is reporting and just
> >> pay attention to how long it actually takes wall-clock style and
> >> you'll see that your clojure and
> >> java programs already take the same time.
> >>
> >> Here are my findings:
> >>
> >> I saved Iterate.java into my rlm package and ran:
> >> time java -server rlm.Iterate
> >>
> >> results:
> >> time java -server rlm.Iterate
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65341 nls in 27 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65310 nls in 27 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 66026 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65473 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65679 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65739 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65310 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65810 nls in 18 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65531 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65418 nls in 21 msec
> >>
> >> real 0m27.469s
> >> user 0m0.472s
> >> sys 0m26.638s
> >>
> >> I wrapped the last bunch of commands in your clojure script into a
> >> (run) function:
> >> (defn run []
> >>   (let [ifs (FileInputStream. "/dev/urandom")
> >>         buf (make-array Byte/TYPE *numbytes*)]
> >>     (dotimes [_ 10]
> >>       (let [sz (.read ifs buf)]
> >>         (println "Wanted" *numbytes* "got" sz "bytes")
> >>         (let [count (time (countnl buf))]
> >>           (println "Got" count "nls"))))))
> >>
> >> and ran
> >> (time (run)) at the repl:
> >>
> >> (time (run))
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.081975 msecs"
> >> Got 65894 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.001814 msecs"
> >> Got 65949 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.061934 msecs"
> >> Got 65603 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.031131 msecs"
> >> Got 65563 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.122567 msecs"
> >> Got 65696 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 182.968066 msecs"
> >> Got 65546 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.058508 msecs"
> >> Got 65468 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 182.932395 msecs"
> >> Got 65872 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.074646 msecs"
> >> Got 65498 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 187.733636 msecs"
> >> Got 65434 nls
> >> "Elapsed time: 28510.331507 msecs"
> >> nil
> >>
> >> Total running time for both programs is around 28 seconds.
> >> The java program seems to be incorrectly reporting its time.
> >>
> >> --Robert McIntyre
> >>
> >> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan <tsur...@gmail.com> wrote:
> >> > Just to try to see if clojure is a practical language for doing
> >> > byte-level work (parsing files, network streams, etc), I wrote a
> >> > trivial function to iterate through a buffer of bytes and count all
> >> > the newlines that it sees. For my testing, I've written a C version,
> >> > a Java version, and a Clojure version. I'm running each routine 10
> >> > times over a 16MB buffer read from /dev/urandom (the buffer is
> >> > refreshed between each call to the newline counting function). With
> >> > gcc -O0, I get about 80ms per 16MB buffer. With gcc -O3, I get ~14ms
> >> > per buffer. With javac (and java -server) I get 20ms per 16MB buffer.
> >> > With clojure, I get 105ms per buffer (after the jvm warms up). I'm
> >> > guessing that the huge boost that java and gcc -O3 get is from
> >> > converting per-byte operations to per-int ops; at least that ~4x boost
> >> > looks like it would come from something like that. Is that an
> >> > optimization that is unavailable to clojure? The java_interop doc
> >> > makes it sound like java and clojure get the exact same bytecode when
> >> > using areduce correctly, so maybe there's something I could be doing
> >> > better. Here are my small programs; if somebody could suggest
> >> > improvements, I'd appreciate them.
> >> >
> >> > iterate.clj:
> >> >
> >> > (set! *warn-on-reflection* true)
> >> > (import java.io.FileInputStream)
> >> >
> >> > (def *numbytes* (* 16 1024 1024))
> >> >
> >> > (defn countnl
> >> >   [#^bytes buf]
> >> >   (let [nl (byte 10)]
> >> >     (areduce buf idx count 0
> >> >              (if (= (aget buf idx) nl)
> >> >                (inc count)
> >> >                count))))
> >> >
> >> > (let [ifs (FileInputStream. "/dev/urandom")
> >> >       buf (make-array Byte/TYPE *numbytes*)]
> >> >   (dotimes [_ 10]
> >> >     (let [sz (.read ifs buf)]
> >> >       (println "Wanted" *numbytes* "got" sz "bytes")
> >> >       (let [count (time (countnl buf))]
> >> >         (println "Got" count "nls")))))
> >> >
> >> > Iterate.java:
> >> >
> >> > import java.io.FileInputStream;
> >> >
> >> > class Iterate
> >> > {
> >> >   static final int NUMBYTES = 16*1024*1024;
> >> >
> >> >   static int countnl(byte[] buf)
> >> >   {
> >> >     int count = 0;
> >> >     for(int i = 0; i < buf.length; i++) {
> >> >       if(buf[i] == '\n') {
> >> >         count++;
> >> >       }
> >> >     }
> >> >     return count;
> >> >   }
> >> >
> >> >   public static final void main(String[] args)
> >> >     throws Throwable
> >> >   {
> >> >     FileInputStream input = new FileInputStream("/dev/urandom");
> >> >     byte[] buf = new byte[NUMBYTES];
> >> >     int sz;
> >> >     long start, end;
> >> >
> >> >     for(int i = 0; i < 10; i++) {
> >> >       sz = input.read(buf);
> >> >       System.out.println("Wanted " + NUMBYTES + " got " + sz + " bytes");
> >> >       start = System.currentTimeMillis();
> >> >       int count = countnl(buf);
> >> >       end = System.currentTimeMillis();
> >> >       System.out.println("counted " + count + " nls in " +
> >> >                          (end-start) + " msec");
> >> >     }
> >> >
> >> >     input.close();
> >> >   }
> >> > }
> >> >
> >> > iterate.c:
> >> >
> >> > #include<sys/types.h>
> >> > #include<sys/stat.h>
> >> > #include<sys/time.h>
> >> > #include<stdlib.h>
> >> > #include<unistd.h>
> >> > #include<stdio.h>
> >> > #include<fcntl.h>
> >> >
> >> > int countnl(char *buf, int sz)
> >> > {
> >> >   int i;
> >> >   int count = 0;
> >> >   for(i = 0; i < sz; i++) {
> >> >     if(buf[i] == '\n') {
> >> >       count++;
> >> >     }
> >> >   }
> >> >   return count;
> >> > }
> >> >
> >> > int main()
> >> > {
> >> >   int fd = open("/dev/urandom", O_RDONLY);
> >> >   const int NUMBYTES = 16*1024*1024;
> >> >   char *buf = (char*)malloc(NUMBYTES);
> >> >
> >> >   int sz;
> >> >   struct timeval start, end;
> >> >
> >> >   int i;
> >> >   for(i = 0; i < 10; i++) {
> >> >     sz = read(fd, buf, NUMBYTES);
> >> >     printf("Wanted %d bytes, got %d bytes\n", NUMBYTES, sz);
> >> >     gettimeofday(&start, 0);
> >> >     int count = countnl(buf, sz);
> >> >     gettimeofday(&end, 0);
> >> >     printf("counted %d nls in %f msec\n", count,
> >> >            (float)(end.tv_sec-start.tv_sec)*1e3 + (end.tv_usec-start.tv_usec)/1e3);
> >> >   }
> >> >
> >> >   free(buf);
> >> >   close(fd);
> >> >   return 0;
> >> > }

-- 
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
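
P.S. To make the "==" suggestion at the top of this message concrete, here is a minimal sketch: Robert's quoted countnl-lite with only the comparison operator changed. I have not benchmarked it, so treat any speedup as an assumption that depends on your Clojure and JVM versions.

```clojure
;; Sketch only: the quoted countnl-lite with `=` swapped for `==`.
;; Everything else is unchanged from Robert's version.
(defn countnl-lite
  [#^bytes buf]
  (areduce buf idx count (int 0)
    ;; `==` is numeric equality; 10 is the byte value of a newline
    (if (== (clojure.lang.RT/aget buf idx) 10)
      (unchecked-add count 1)
      count)))

;; Quick sanity check on a small input before timing the 16MB buffer;
;; this string contains three newline bytes.
(println (countnl-lite (.getBytes "a\nb\nc\n")))
```

Timing it the same way as in the quoted (run) function, with (time (countnl-lite buf)), should show whether the change makes a measurable difference on your system.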