Consider using "==" where you currently have "=": it can be faster when
comparing numbers for equality.  Source for this and a few other
performance tips:

http://gnuvince.wordpress.com/2009/05/11/clojure-performance-tips/
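
As a rough illustration of why a numbers-only comparison can be cheaper than a
general equality check, here is a small Java analogy. It is only a sketch, not
how Clojure implements "=" or "==": the class name EqualitySketch and both
method names are made up for this example, and any speed difference depends on
the JVM and on what the JIT manages to optimize away.

```java
// Hypothetical illustration; all names here are invented for this sketch.
class EqualitySketch {

    // General equality: each byte is widened to int, boxed, and compared
    // through Object.equals, the way a check that must handle arbitrary
    // objects would have to work.
    static int countBoxed(byte[] buf) {
        Object nl = Integer.valueOf(10);
        int count = 0;
        for (int i = 0; i < buf.length; i++) {
            Object b = Integer.valueOf(buf[i]); // byte -> int -> Integer
            if (nl.equals(b)) {
                count++;
            }
        }
        return count;
    }

    // Numeric equality: primitives are compared directly, with no boxing
    // and no method dispatch.
    static int countPrimitive(byte[] buf) {
        int count = 0;
        for (int i = 0; i < buf.length; i++) {
            if (buf[i] == 10) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] buf = {10, 65, 10, 66, 10}; // contains three newline bytes
        System.out.println("boxed:     " + countBoxed(buf));
        System.out.println("primitive: " + countPrimitive(buf));
    }
}
```

Both methods return the same count; they differ only in how much work each
comparison may do per byte.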

Andy

On Mon, Aug 30, 2010 at 11:46 PM, Robert McIntyre <r...@mit.edu> wrote:

> Ah, I see that I was mistaken about the timing. Sorry about that.
>
> After a lot of fiddling around, I came up with this faster form:
>
> (defn countnl-lite
>  [#^bytes buf]
>  (areduce buf idx count (int 0)
>           (if (= (clojure.lang.RT/aget buf idx) 10)
>             (unchecked-add count 1)
>             count)))
>
> Key points are initializing count to a primitive integer and directly
> calling clojure's aget to avoid an unnecessary integer cast.
>
> On my system:
>
> The unmodified countnl function takes ~ 180 msecs
>
> Without AOT compilation countnl-lite takes around 66 msecs
>
> With AOT compilation countnl-lite takes ~46 msecs
>
> The java method takes ~19 msecs.
>
> I've lost a factor of 2.25 somewhere and it makes me sad that I can't find
> it.
> I would be very interested if anyone could improve countnl-lite.
>
> --Robert McIntyre
>
>
>
> On Mon, Aug 30, 2010 at 8:41 PM, Alan <a...@malloys.org> wrote:
> > I think this misses the point. Of course java, c, and clojure will all
> > have roughly the same wall-clock time for this program, since it is
> > dominated by the I/O. You can even see that in the output from $ time
> > java Iterate: less than 0.5s was spent in user space, the rest was
> > spent in system code - that is, mostly doing I/O.
> >
> > The java version is a second faster as counted by the wall clock, and
> > this is unlikely to be a coincidence: tsuraan's timing data suggests
> > that the clojure program takes 80ms longer in each loop, and loops 10
> > times. That comes out to 0.8 seconds, which is quite close to the
> > differential you observed when timing from the command line.
> >
> > On Aug 30, 1:38 pm, Robert McIntyre <r...@mit.edu> wrote:
> >> I don't know what the heck is going on here, but ignore the time the
> >> program is reporting and just pay attention to how long it actually
> >> takes wall-clock style, and you'll see that your clojure and java
> >> programs already take the same time.
> >>
> >> Here are my findings:
> >>
> >> I saved Iterate.java into my rlm package and ran:
> >> time java -server rlm.Iterate
> >>
> >> results:
> >> time java -server rlm.Iterate
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65341 nls in 27 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65310 nls in 27 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 66026 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65473 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65679 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65739 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65310 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65810 nls in 18 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65531 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65418 nls in 21 msec
> >>
> >> real    0m27.469s
> >> user    0m0.472s
> >> sys     0m26.638s
> >>
> >> I wrapped the last bunch of commands in your clojure script into a
> >> (run) function:
> >> (defn run []
> >>   (let [ifs (FileInputStream. "/dev/urandom")
> >>         buf (make-array Byte/TYPE *numbytes*)]
> >>     (dotimes [_ 10]
> >>       (let [sz (.read ifs buf)]
> >>         (println "Wanted" *numbytes* "got" sz "bytes")
> >>         (let [count (time (countnl buf))]
> >>           (println "Got" count "nls"))))))
> >>
> >> and ran
> >> (time (run)) at the repl:
> >>
> >> (time (run))
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.081975 msecs"
> >> Got 65894 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.001814 msecs"
> >> Got 65949 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.061934 msecs"
> >> Got 65603 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.031131 msecs"
> >> Got 65563 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.122567 msecs"
> >> Got 65696 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 182.968066 msecs"
> >> Got 65546 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.058508 msecs"
> >> Got 65468 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 182.932395 msecs"
> >> Got 65872 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.074646 msecs"
> >> Got 65498 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 187.733636 msecs"
> >> Got 65434 nls
> >> "Elapsed time: 28510.331507 msecs"
> >> nil
> >>
> >> Total running time for both programs is around 28 seconds.
> >> The java program seems to be incorrectly reporting its time.
> >>
> >> --Robert McIntyre
> >>
> >> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan <tsur...@gmail.com> wrote:
> >> > Just to try to see if clojure is a practical language for doing
> >> > byte-level work (parsing files, network streams, etc), I wrote a
> >> > trivial function to iterate through a buffer of bytes and count all
> >> > the newlines that it sees.  For my testing, I've written a C version,
> >> > a Java version, and a Clojure version.  I'm running each routine 10
> >> > times over a 16MB buffer read from /dev/urandom (the buffer is
> >> > refreshed between each call to the newline counting function).  With
> >> > gcc -O0, I get about 80ms per 16MB buffer.  With gcc -O3, I get ~14ms
> >> > per buffer.  With javac (and java -server) I get 20ms per 16MB buffer.
> >> >  With clojure, I get 105ms per buffer (after the jvm warms up).  I'm
> >> > guessing that the huge boost that java and gcc -O3 get is from
> >> > converting per-byte operations to per-int ops; at least that ~4x boost
> >> > looks like it would come from something like that.  Is that an
> >> > optimization that is unavailable to clojure?  The java_interop doc
> >> > makes it sound like java and clojure get the exact same bytecode when
> >> > using areduce correctly, so maybe there's something I could be doing
> >> > better.  Here are my small programs; if somebody could suggest
> >> > improvements, I'd appreciate them.
> >>
> >> > iterate.clj:
> >>
> >> > (set! *warn-on-reflection* true)
> >> > (import java.io.FileInputStream)
> >>
> >> > (def *numbytes* (* 16 1024 1024))
> >>
> >> > (defn countnl
> >> >  [#^bytes buf]
> >> >  (let [nl (byte 10)]
> >> >    (areduce buf idx count 0
> >> >             (if (= (aget buf idx) nl)
> >> >               (inc count)
> >> >               count))))
> >>
> >> > (let [ifs (FileInputStream. "/dev/urandom")
> >> >      buf (make-array Byte/TYPE *numbytes*)]
> >> >  (dotimes [_ 10]
> >> >    (let [sz (.read ifs buf)]
> >> >      (println "Wanted" *numbytes* "got" sz "bytes")
> >> >      (let [count (time (countnl buf))]
> >> >        (println "Got" count "nls")))))
> >>
> >> > Iterate.java:
> >>
> >> > import java.io.FileInputStream;
> >>
> >> > class Iterate
> >> > {
> >> >  static final int NUMBYTES = 16*1024*1024;
> >>
> >> >  static int countnl(byte[] buf)
> >> >  {
> >> >    int count = 0;
> >> >    for(int i = 0; i < buf.length; i++) {
> >> >      if(buf[i] == '\n') {
> >> >        count++;
> >> >      }
> >> >    }
> >> >    return count;
> >> >  }
> >>
> >> >  public static final void main(String[] args)
> >> >    throws Throwable
> >> >  {
> >> >    FileInputStream input = new FileInputStream("/dev/urandom");
> >> >    byte[] buf = new byte[NUMBYTES];
> >> >    int sz;
> >> >    long start, end;
> >>
> >> >    for(int i = 0; i < 10; i++) {
> >> >      sz = input.read(buf);
> >> >      System.out.println("Wanted " + NUMBYTES + " got " + sz + " bytes");
> >> >      start = System.currentTimeMillis();
> >> >      int count = countnl(buf);
> >> >      end = System.currentTimeMillis();
> >> >      System.out.println("counted " + count + " nls in " +
> >> >          (end-start) + " msec");
> >> >    }
> >>
> >> >    input.close();
> >> >  }
> >> > }
> >>
> >> > iterate.c:
> >>
> >> > #include<sys/types.h>
> >> > #include<sys/stat.h>
> >> > #include<sys/time.h>
> >> > #include<stdlib.h>
> >> > #include<unistd.h>
> >> > #include<stdio.h>
> >> > #include<fcntl.h>
> >>
> >> > int countnl(char *buf, int sz)
> >> > {
> >> >  int i;
> >> >  int count = 0;
> >> >  for(i = 0; i < sz; i++) {
> >> >    if(buf[i] == '\n') {
> >> >      count++;
> >> >    }
> >> >  }
> >> >  return count;
> >> > }
> >>
> >> > int main()
> >> > {
> >> >  int fd = open("/dev/urandom", O_RDONLY);
> >> >  const int NUMBYTES = 16*1024*1024;
> >> >  char *buf = (char*)malloc(NUMBYTES);
> >>
> >> >  int sz;
> >> >  struct timeval start, end;
> >>
> >> >  int i;
> >> >  for(i = 0; i < 10; i++) {
> >> >    sz = read(fd, buf, NUMBYTES);
> >> >    printf("Wanted %d bytes, got %d bytes\n", NUMBYTES, sz);
> >> >    gettimeofday(&start, 0);
> >> >    int count = countnl(buf, sz);
> >> >    gettimeofday(&end, 0);
> >> >    printf("counted %d nls in %f msec\n", count,
> >> >        (float)(end.tv_sec-start.tv_sec)*1e3 +
> >> >        (end.tv_usec-start.tv_usec)/1e3);
> >> >  }
> >>
> >> >  free(buf);
> >> >  close(fd);
> >> >  return 0;
> >> > }
> >>
> >> > --
> >> > You received this message because you are subscribed to the Google
> >> > Groups "Clojure" group.
> >> > To post to this group, send email to clojure@googlegroups.com
> >> > Note that posts from new members are moderated - please be patient
> >> > with your first post.
> >> > To unsubscribe from this group, send email to
> >> > clojure+unsubscr...@googlegroups.com
> >> > For more options, visit this group at
> >> > http://groups.google.com/group/clojure?hl=en
> >>
> >>
> >
>
