On May 4, 2008, at 7:28 PM, DanaWhite wrote:


I arrived at this MAP by modifying IndexFiles to use a StopAnalyzer and to work in a way that was acceptable for TREC files. SearchFiles was modified
to use a StopAnalyzer and to output data in a trec_eval-suitable format.
trec_eval reports about 11% at this setting.
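
For readers unfamiliar with the metric, here is a minimal sketch of how mean average precision (MAP) is conventionally computed from a ranked run and a set of relevance judgments. The function names, the dictionary layout, and the toy document IDs are all assumptions made for illustration. This is not trec_eval's code, and trec_eval applies its own rules about which topics are counted, so treat it only as a rough picture of what a figure like 11% summarizes.

```python
def average_precision(ranked_docs, relevant):
    """Average precision for one query.

    ranked_docs: list of document IDs in ranked order (best first).
    relevant: set of document IDs judged relevant for the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant doc appears.
            precision_sum += hits / rank
    # Dividing by the total number of relevant docs penalizes those never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(run, qrels):
    """Mean of average precision over queries with at least one relevant doc.

    run: dict mapping query ID -> ranked list of document IDs (hypothetical layout).
    qrels: dict mapping query ID -> set of relevant document IDs (hypothetical layout).
    """
    queries = [q for q in qrels if qrels[q]]
    if not queries:
        return 0.0
    total = sum(average_precision(run.get(q, []), qrels[q]) for q in queries)
    return total / len(queries)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    run = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
    qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}
    print(mean_average_precision(run, qrels))
```

In the toy data, q1 scores (1/1 + 2/3) / 2 and q2 scores 1/2, so the mean is 2/3. A query whose relevant documents are never retrieved contributes 0, which is one reason analyzer choices such as stop words and stemming can move the number noticeably.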

I am not competing in TREC; I am just doing an evaluation of different search
engines.

At this point I am not going to add anything to Lucene to get a higher MAP because I am trying to get a feel for its "out of the box" performance.


It's kind of tough to say what an "out of the box" experience is in Lucene, so I frankly wouldn't read too much into any numbers you arrive at on TREC. For instance, it is curious that you chose the StopAnalyzer over the more "out of the box" StandardAnalyzer. If anything were out of the box, I guess it would be the StandardAnalyzer, given the name, but that isn't to say it will do any better; I haven't tried it. Most studies have also shown that stemming is beneficial, but neither of those analyzers offers stemming. Remember, Lucene really is just the canvas, the paint, and the brushes; it's up to you to do the actual painting.

Just my advice: make sure you are comparing apples to apples, or at least as close as you can reasonably get. I think you will find that Lucene stacks up quite well.

Cheers,
Grant
