Re: contrib Benchmark enwiki problem

2009-01-24 Thread Grant Ingersoll
On Jan 23, 2009, at 3:36 PM, Michael McCandless wrote:
> I think temp is for downloading X.gz and un-gzipping it, and then X is supposed to get unpacked/moved into work. I think?
Yep. I'm not married to it, so we can change it. I think the key thing for me is you want to make sure you

Re: contrib Benchmark enwiki problem

2009-01-23 Thread Michael McCandless
I think temp is for downloading X.gz and un-gzipping it, and then X is supposed to get unpacked/moved into work. I think? work also holds the index subdir by default. I'd suggest moving your massive Wikipedia XML file to somewhere "safe" (i.e., not in "temp") and then symlinking from work t
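
A minimal sketch of the move-then-symlink idea above, not a confirmed recipe. The directory variables (BENCH_DIR, SAFE_DIR) and the name of the link inside work are assumptions made for illustration; only the XML file name comes from this thread. By default the script runs against a scratch directory with a placeholder file, so it can be tried without touching a real checkout.

```shell
set -eu

# Hypothetical locations (assumptions): point BENCH_DIR at your
# contrib/benchmark checkout and SAFE_DIR at a directory outside temp.
BENCH_DIR="${BENCH_DIR:-$(mktemp -d)}"
SAFE_DIR="${SAFE_DIR:-$BENCH_DIR/safe}"
XML_NAME="enwiki-20070527-pages-articles.xml"

mkdir -p "$BENCH_DIR/temp" "$BENCH_DIR/work" "$SAFE_DIR"

# Demo only: create an empty placeholder when no dump is present,
# so the sketch runs end to end in a scratch directory.
if [ ! -e "$BENCH_DIR/temp/$XML_NAME" ] && [ ! -e "$SAFE_DIR/$XML_NAME" ]; then
  : > "$BENCH_DIR/temp/$XML_NAME"
fi

# Move the dump out of temp so it cannot be lost with temp.
if [ -f "$BENCH_DIR/temp/$XML_NAME" ] && [ ! -e "$SAFE_DIR/$XML_NAME" ]; then
  mv "$BENCH_DIR/temp/$XML_NAME" "$SAFE_DIR/$XML_NAME"
fi

# Link from work to the safe copy. The link name here is an assumption;
# use whatever file name your alg and ant targets expect to find.
ln -sf "$SAFE_DIR/$XML_NAME" "$BENCH_DIR/work/$XML_NAME"

ls -l "$BENCH_DIR/work"
```

To run it against a real tree, set BENCH_DIR and SAFE_DIR in the environment first.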

Re: contrib Benchmark enwiki problem

2009-01-23 Thread Jason Rutherglen
I did the symbolic link from work to temp and things worked. Perhaps benchmark should download directly to work? What is temp for?
On Thu, Jan 22, 2009 at 4:58 AM, Grant Ingersoll wrote:
> There is a little funkiness in the ant script there in that if the original file exists in temp, but has

Re: contrib Benchmark enwiki problem

2009-01-22 Thread Grant Ingersoll
There is a little funkiness in the ant script there in that if the original file exists in temp, but hasn't been processed in work, then it doesn't do the proper thing. The workaround is to do the second step to get into work by hand. I believe there is a JIRA issue on it. Also, I highly

Re: contrib Benchmark enwiki problem

2009-01-22 Thread Michael McCandless
An "alg" is simply a file (file.alg) that the benchmarking code runs. You run it something like this:
    ant run-task -Dtask.alg=/path/to/file.alg -Dtask.mem=1024M
For docs... there's the package.html in contrib/benchmark. LIA 2 (only via MEAP right now) also covers benchmark's alg syntax

Re: contrib Benchmark enwiki problem

2009-01-21 Thread Jason Rutherglen
The XML file temp/enwiki-20070527-pages-articles.xml was downloaded by "ant get-enwiki expand-enwiki". The docs.file in extractWikipedia.alg and wikipedia.alg points to it. The error message is regarding work/enwiki.txt. Is there a how-to on this stuff? What is an alg?
On Wed, Jan 21, 2009 at

Re: contrib Benchmark enwiki problem

2009-01-21 Thread Michael McCandless
You should download Wikipedia's XML file manually yourself, uncompress it, and then edit docs.file in that alg to point to it.
Mike
Jason Rutherglen wrote:
> I downloaded trunk via SVN. Went to trunk/contrib/benchmark. Executed ant enwiki. I'm not sure what else needs to be done. Receiv
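
The "edit docs.file in that alg" step can be scripted roughly as follows. This is a sketch under assumptions: it supposes the alg file sets the property on a single line of the form docs.file=<path>, and the ALG_FILE and XML_PATH values are placeholders rather than paths from this thread. By default it works on a scratch copy, so nothing in a real checkout is modified.

```shell
set -eu

# Hypothetical inputs (assumptions): the alg file to edit and the
# location of the manually downloaded, uncompressed XML dump.
ALG_FILE="${ALG_FILE:-$(mktemp)}"
XML_PATH="${XML_PATH:-/data/enwiki/enwiki-20070527-pages-articles.xml}"

# Demo only: seed a one-line placeholder when the alg file is empty
# or missing, so the sketch can run on its own.
if [ ! -s "$ALG_FILE" ]; then
  printf 'docs.file=temp/enwiki-20070527-pages-articles.xml\n' > "$ALG_FILE"
fi

# Rewrite the docs.file line. "|" is the sed delimiter because paths
# contain "/"; this breaks if XML_PATH itself contains "|" or "&".
sed "s|^docs\.file=.*|docs.file=$XML_PATH|" "$ALG_FILE" > "$ALG_FILE.new"
mv "$ALG_FILE.new" "$ALG_FILE"

# Show the resulting setting.
grep '^docs\.file=' "$ALG_FILE"
```

Writing to a new file and then moving it over the original avoids relying on sed's in-place flag, which differs between GNU and BSD sed.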