There is a little funkiness in the ant script there in that if the
original file exists in temp, but hasn't been processed in work, then
it doesn't do the proper thing. The workaround is to do the second
step to get into work by hand. I believe there is a JIRA issue on it.
Also, I highly recommend copying the original file off to somewhere
else on your system and soft linking to it so that you don't have to
download it again in case you accidentally delete that directory.
-Grant
On Jan 22, 2009, at 6:01 AM, Michael McCandless wrote:
An "alg" is simply a file (file.alg) that the benchmarking code
runs. You run it something like this:
ant run-task -Dtask.alg=/path/to/file.alg -Dtask.mem=1024M
For docs... there's the package.html in contrib/benchmark. LIA 2
(only via MEAP right now) also covers benchmark's alg syntax in an
appendix.
But you are close... oh, it actually looks like the output file
(work/enwiki.txt) could not be written. Does that directory (../
work) exist? (I think build.xml should have created it).
Mike
Jason Rutherglen wrote:
The xml file temp/enwiki-20070527-pages-articles.xml was downloaded
by "ant
get-enwiki expand-enwiki". The docs.file in extractWikipedia.alg and
wikipedia.alg points to it. The error message is regarding
work/enwiki.txt.
Is there a how to on this stuff? What is an alg?
On Wed, Jan 21, 2009 at 5:03 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
You should download Wikipedia's XML file manually yourself,
uncompress it,
and then edit docs.file in that alg to point to it.
Mike
Jason Rutherglen wrote:
I downloaded trunk via SVN. Went to trunk/contrib/benchmark.
Executed
ant
enwiki. I'm not sure what else needs to be done. Received this
error:
enwiki:
[echo] Working Directory:
/Users/jrutherg/dev/lucenetrunk/trunk/contrib/benchmark/work
[java] Running algorithm from:
/Users/jrutherg/dev/lucenetrunk/trunk/contrib/benchmark/conf/
extractWikipedia.alg
[java] ------------> config properties:
[java] doc.maker =
org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker
[java] doc.maker.forever = false
[java] docs.file = temp/enwiki-20070527-pages-articles.xml
[java] line.file.out = work/enwiki.txt
[java] work.dir = work
[java] -------------------------------
[java] ------------> algorithm:
[java] Seq_Exhaust {
[java] WriteLineDoc
[java] > * EXHAUST
[java]
[java] Error: cannot execute the algorithm! work/enwiki.txt (No
such
file or directory)
[java] ####################
[java] java.io.FileNotFoundException: work/enwiki.txt (No such
file or
directory)
[java] ### D O N E !!! ###
[java] at java.io.FileOutputStream.open(Native Method)
[java] ####################
[java] at
java.io.FileOutputStream.<init>(FileOutputStream.java:179)
[java] at
java.io.FileOutputStream.<init>(FileOutputStream.java:70)
[java] at
org
.apache
.lucene
.benchmark
.byTask.tasks.WriteLineDocTask.setup(WriteLineDocTask.java:63)
[java] at
org
.apache
.lucene
.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:83)
[java] at
org
.apache
.lucene
.benchmark
.byTask.tasks.TaskSequence.doSerialTasks(TaskSequence.java:141)
[java] at
org
.apache
.lucene
.benchmark.byTask.tasks.TaskSequence.doLogic(TaskSequence.java:122)
[java] at
org
.apache
.lucene
.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:92)
[java] at
org
.apache
.lucene.benchmark.byTask.utils.Algorithm.execute(Algorithm.java:
246)
[java] at
org
.apache.lucene.benchmark.byTask.Benchmark.execute(Benchmark.java:
73)
[java] at
org.apache.lucene.benchmark.byTask.Benchmark.main(Benchmark.java:
109)
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
--------------------------
Grant Ingersoll
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org