Re: slow raw io

2010-08-09 Thread David Powell
> Maybe this seems like a low-priority issue but I think slurp is likely to be very commonly used. For instance, the Riak tutorial just posted to Hacker News uses it: http://mmcgrana.github.com/2010/08/riak-clojure.html
In the past I've steered clear of using slurp because it didn't hand ch

Re: slow raw io

2010-08-09 Thread cageface
On Aug 7, 5:43 am, Peter Schuller wrote:
> Interesting. Why do you consider it recommended to read one character at a time in a case like this? Maybe there is such a recommendation that I don't know about, but in general I would consider it contrary to expected practice when doing I/O if per

Re: slow raw io

2010-08-09 Thread David Powell
> This isn't an issue with the buffering, it is an issue with the massive overhead of doing character at a time i/o - it is something that you really should never ever do. I'd say something somewhere doing character at a time i/o is probably the number one cause of crippling performance pr

Re: slow raw io

2010-08-09 Thread David Powell
On Sat 07/08/10 14:02, "Stuart Halloway" stuart.hallo...@gmail.com sent:
> No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering.
> Stu
This isn't an issue with the buffering, it is an issue with the massive overhead of doing
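The contrast drawn in this message can be sketched in Clojure as follows. This is a minimal illustration, not the slurp or slurp2 code discussed in the thread; the function names `read-char-at-a-time` and `read-chunked` are made up for the example. The first function issues one `.read` call per character, the second one call per chunk, and both return the same string.

```clojure
(import '[java.io Reader StringReader])

;; Hypothetical helpers for illustration only; neither is the thread's slurp/slurp2.

(defn read-char-at-a-time
  "Drains rdr with one .read call per character."
  [^Reader rdr]
  (let [sb (StringBuilder.)]
    (loop []
      (let [c (.read rdr)]            ; returns -1 at end of stream
        (if (neg? c)
          (str sb)
          (do (.append sb (char c))
              (recur)))))))

(defn read-chunked
  "Drains rdr with one .read call per chunk of up to buf-size characters."
  [^Reader rdr buf-size]
  (let [sb  (StringBuilder.)
        buf (char-array buf-size)]
    (loop []
      (let [n (.read rdr buf 0 buf-size)] ; number of chars read, or -1 at end
        (if (neg? n)
          (str sb)
          (do (.append sb buf 0 n)
              (recur)))))))

;; Usage: both calls return "hello world"
(read-char-at-a-time (StringReader. "hello world"))
(read-chunked (StringReader. "hello world") 4)
```

How much the per-call overhead matters depends on the reader being wrapped and on the JVM, so the sketch makes no timing claim of its own; the measurements reported elsewhere in the thread are the posters' own.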

Re: slow raw io

2010-08-07 Thread j-g-faustus
On Aug 7, 2:02 pm, Stuart Halloway wrote:
> > No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering.
Maybe this comparison can be of interest? http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly Somebody

Re: slow raw io

2010-08-07 Thread Peter Schuller
> No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering.
Interesting. Why do you consider it recommended to read one character at a time in a case like this? Maybe there is such a recommendation that I don't know about, but in gen

Re: slow raw io

2010-08-07 Thread Stuart Halloway
No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering.
Stu
> Any chance of getting this in before 1.2?
>
> On Jun 25, 7:43 am, cageface wrote:
>> Thanks Stuart & Peter for following up on this. Now I can get back to plowing t

Re: slow raw io

2010-08-07 Thread cageface
Any chance of getting this in before 1.2?
On Jun 25, 7:43 am, cageface wrote:
> Thanks Stuart & Peter for following up on this. Now I can get back to plowing through this mountain of ldiff data with Clojure!
-- You received this message because you are subscribed to the Google Groups "Clojure

Re: slow raw io

2010-06-25 Thread cageface
Thanks Stuart & Peter for following up on this. Now I can get back to plowing through this mountain of ldiff data with Clojure!

Re: slow raw io

2010-06-25 Thread Peter Schuller
> I can register another account (no problem), but what implications are there on the fact that I wrote 'scode' on the contributor agreement I mail:ed Rich?
I just registered "peterschuller".
-- / Peter Schuller

Re: slow raw io

2010-06-25 Thread Peter Schuller
> You are on the contributors list, so I just need to know your account name on Assembla to activate your ability to add tickets, patches, etc. Let me know your account name (which needs to be some permutation of your real name, not a nick).
When I read up before submitting the contribut

Re: slow raw io

2010-06-25 Thread Stuart Halloway
Hi Peter,
You are on the contributors list, so I just need to know your account name on Assembla to activate your ability to add tickets, patches, etc. Let me know your account name (which needs to be some permutation of your real name, not a nick).
Thanks, Stu
>> I put a self-contained test

Re: slow raw io

2010-06-25 Thread Peter Schuller
And reading the thread history I realize the problem was already identified (sorry), however:
> Has anyone else had a chance to try this? I'm surprised to see manual buffering behaving so much better than the BufferedReader implementation but it does seem to make quite a difference.
Not reall

Re: slow raw io

2010-06-25 Thread Peter Schuller
> I put a self-contained test up here: http://gist.github.com/452095
>
> To run it copy this to slurptest.clj and run these commands
> java clojure.main slurptest.clj makewords 100 (100 seems good for macs, 300 for linux)
>
> java -Xmx3G -Xms3G clojure.main slurptest.clj slurp|slurp2
>
> Try

Re: slow raw io

2010-06-24 Thread cageface
I put a self-contained test up here: http://gist.github.com/452095
To run it copy this to slurptest.clj and run these commands
java clojure.main slurptest.clj makewords 100 (100 seems good for macs, 300 for linux)
java -Xmx3G -Xms3G clojure.main slurptest.clj slurp|slurp2
Trying either slurp or

Re: slow raw io

2010-06-24 Thread cageface
Has anyone else had a chance to try this? I'm surprised to see manual buffering behaving so much better than the BufferedReader implementation but it does seem to make quite a difference.

Re: slow raw io

2010-06-23 Thread cageface
Interesting. Here are the times I get:
LINUX:
slurp, *in* 18.8 seconds
slurp, System/in 18.2 seconds
slurp2, *in* 6.7 seconds
slurp2, System/in 5.7 seconds
I have an intel iMac here too, running 10.6.4:
slurp, *in* 20.4 seconds
slurp, System.in 19.0 seconds
slurp2, *in* 7.2 seconds
slurp2, System

Re: slow raw io

2010-06-23 Thread Stuart Halloway
On my laptop (Mac) the biggest difference here has nothing to do with buffering in slurp. It is whether you use System/in (fast) or *in* (slow). The latter is a LineNumberingPushbackReader. Can you check and confirm? When I slurp System/in it is more than twice as fast as slurping *in*. I beli

Re: slow raw io

2010-06-23 Thread cageface
Another example. I'm running this on a Ubuntu 10.04 laptop with this java:
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
and this command line:
java -Xmx3G -server clojure.main cat2.clj
(require '[clo

Re: slow raw io

2010-06-23 Thread cageface
Sure. Here's my clj script:
#!/bin/sh
if [ -z "$1" ]; then
    exec java -server jline.ConsoleRunner clojure.main
else
    SCRIPT=$(dirname $1)
    export CLASSPATH=$SCRIPT/*:$SCRIPT:$CLASSPATH
    exec java -Xmx3G -server clojure.main "$1" "$@"
fi
(Usually I don't have the -Xmx flag there. I add

Re: slow raw io

2010-06-23 Thread Stuart Halloway
I am seeing more like 1.8 seconds for the raw version, vs. 2.8 seconds for slurp (master branch). Can you post a complete example (including the clj script you use, and what version of Clojure), so we can be apples-to-apples?
Stu
> For the record, this program runs in 3.3 seconds so I guess tha

Re: slow raw io

2010-06-23 Thread cageface
For the record, this program runs in 3.3 seconds so I guess that points to the implementation of slurp:
(import '[java.io BufferedReader InputStreamReader])
(let [reader (BufferedReader. (InputStreamReader. System/in))
      file-data (StringBuffer.)
      buffer (char-array 4096)]
  (loop [total

slow raw io

2010-06-23 Thread cageface
Not sure if this is a Clojure issue or something else, but I'm seeing surprisingly slow I/O on large text files. For example, on a unix machine try this:
1. create a large file
rm -f words; for x in $(seq 300); do cat /usr/share/dict/words >> words; done
2. create a clj file that just slurps it