Re: Scoring exact matches higher in a stemmed field

2010-07-16 Thread Shai Erera
Depends on which query, no? ;) Sounds like you want to simulate the QP behavior http://lucene.apache.org/java/2_4_0/queryparsersyntax.html for boosting. Meaning, if for the query "b" you want to simulate the query "b OR b$^2" and have matches of b$ count more than b, then I'd follow how QP does it
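The rewrite described in this reply can be sketched as plain string construction. This is only an illustration, not Lucene code: the function name expand_with_exact_boost, the "$" marker default, and the boost default of 2 are assumptions taken from the "b OR b$^2" example, and the sketch assumes the term contains no query-parser special characters that would need escaping.

```python
def expand_with_exact_boost(term: str, marker: str = "$", boost: float = 2.0) -> str:
    """Return a query string of the form 'term OR term<marker>^boost'.

    Hypothetical helper: it only builds the query text in the classic
    query-parser syntax; it does not call Lucene or escape special characters.
    """
    # ':g' drops a trailing '.0', so 2.0 is rendered as '2' and 1.5 stays '1.5'.
    return f"{term} OR {term}{marker}^{boost:g}"


if __name__ == "__main__":
    print(expand_with_exact_boost("b"))
```

Whether the marked form survives query-time analysis depends on the analyzer in use, so the marked clause may need to be built as a term query directly rather than parsed.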

RE: Building maven artifacts

2010-07-16 Thread Zhang, Lisheng
Hi, I have never done this kind of build before, but just from the error message I guess it could mean that the two variables ${project.artifactId} and ${project.version} are not defined (otherwise the exact jar file name would be printed out)? Could it be some environment setup issue? Best regards, Lisheng -Origi

Building maven artifacts

2010-07-16 Thread Pavel Minchenkov
Hi, I'm trying to run the ant task "generate-maven-artifacts" in the lucene-solr build.xml file, but I'm getting this error: /home/chardex/lucene/dev/lucene/common-build.xml:312: Error deploying artifact 'org.apache.lucene:lucene-core:jar': Error deploying artifact: File /home/chardex/lucene/dev/lucene/build/$

Scoring exact matches higher in a stemmed field

2010-07-16 Thread Itamar Syn-Hershko
Hi all, Consider the following string: "the buffalo buffaloes" [1]. When passed through a stemming analyzer, the resulting token would be "buffalo buffalo" (assuming a good stemmer). To enable exact searches, say I mark the original term and index it at the same term position. So "the buf

Re: XML results ranking

2010-07-16 Thread mark harwood
LUCENE-2454 includes an example of matching logic that respects the structure in XML documents (see https://issues.apache.org/jira/browse/LUCENE-2454). The example class TestNestedDocumentQuery queries xhtml marked up with hResume syntax. We don't have XQuery syntax support in a parser now (an

Re: XML results ranking

2010-07-16 Thread henok sahilu
You just have to write a parser that parses each section of the XML document, and these sections will be indexed as separate informational units. Then the Lucene ranking algorithm can work over these separate sections. I can give the code for doing this. henok good day to you - Origina

Re: XML results ranking

2010-07-16 Thread Ian Lea
Hi If you google "Lucene xml" you'll find info, but I'll attempt to answer your questions below > ... > I wonder whether Lucene: > > (1) provides full-text search over content of XML elements ? Yes. If you index the content, lucene will let you search over it. > (2) provides substring search

RE: Best practices for searcher memory usage?

2010-07-16 Thread Toke Eskildsen
On Thu, 2010-07-15 at 20:53 +0200, Christopher Condit wrote: [Toke: 140GB single segment is huge] > Sorry - I wasn't clear here. The total index size ends up being 140GB > but to try to help improve performance we build 50 separate indexes > (which end up being a bit under 3gb each) and then ope