Hi all,
I have a small problem and am unable to figure out how to solve it.
I cannot work out how to compute the similarity (like the score
given by Lucene) between a particular document indexed by Lucene
and a given query string. I know that there is a function called search
which computes
-Original Message-
From: Jeff Rodenburg [mailto:[EMAIL PROTECTED]
Sent: 30 January, 2006 18:38
To: java-user@lucene.apache.org
Subject: Re: Help with indexing and query strategy
Have you considered evaluating doc-score thresholds for limiting your
results? Since the perfect answers to these situa
t I'm pretty sure it's not the tokenizing since I'm seeing
the problem with single-word entries.
Thanks for your assistance. It's been very helpful in getting me this
far.
Colin
-Original Message-
From: Rajesh Munavalli [mailto:[EMAIL PROTECTED]
Sent: 30 Januar
BooleanClause.Occur.SHOULD);
> }
>
> BooleanQuery geographyQuery = new BooleanQuery();
> // Compare string contents with equals(); != only compares references.
> if (!"any".equals(typeToFind)) geographyQuery.add(entityType,
> BooleanClause.Occur.MUST);
> geographyQuery.add(query, BooleanClause.Occur.MUST);
For now, the best I could come up with is the following scheme
SAMPLE DOCUMENTS:
Let's say there are four documents:
Doc1: st louis, missouri, usa
Doc2: st louis du ha ha, quebec, canada
Doc3: new york, NY, united states of america
Doc4: ny, usa
INDEX PHASE:
-
).
4) We don't want to return Albany unless the user has Albany in the
query.
Thanks again for looking at this.
Colin
-Original Message-
From: Rajesh Munavalli [mailto:[EMAIL PROTECTED]
Sent: 27 January, 2006 17:04
To: java-user@lucene.apache.org
Subject: Re: Help with indexing an
Few questions.
(1) Does each document contain only one geographical location?
(2) Given a document, how are you tokenizing it into city, state and
country? I am assuming "," as the delimiter here. Otherwise determining the
boundary for names like "St. Louis du Ha Ha" would be difficult.
(3) Are t
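As a rough illustration of the comma-delimiter assumption in question (2), here is a minimal plain-Java sketch that splits a location string into its parts. The class and method names (LocationSplitter, splitLocation) are hypothetical and not part of Lucene or of the code posted in this thread; which part is the city, state or country is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

public class LocationSplitter {

    /**
     * Splits a location string on commas and trims each part.
     * Empty parts (e.g. from a trailing comma) are dropped.
     * Hypothetical helper: it assumes "," is the only delimiter.
     */
    public static List<String> splitLocation(String location) {
        List<String> parts = new ArrayList<>();
        for (String piece : location.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // A multi-word city name stays intact because only commas split.
        for (String part : splitLocation("st louis du ha ha, quebec, canada")) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Splitting only on commas keeps a multi-word name such as "st louis du ha ha" together as one part, which is the boundary problem raised above. It would not help with input that has no commas at all.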
uery.add(query, BooleanClause.Occur.MUST);
QueryFilter filter = new QueryFilter(filterQuery);
Hits hits = searcher.search(geographyQuery, filter);
return hits;
}
-Original Message-
From: Rajesh Munavalli [mailto:[EMAIL PROTECTED]
Sent: 27 January, 2006 14
Hi Colin,
Even assuming you came up with a good way of indexing, the
example query "Ontario, CA" should yield 3 hits. All 2, 3 and 4 are
valid retrievals. Could you please justify which 2 hits you want and why?
Thanks,
Rajesh Munavalli
Colin Young wrote:
I'm having some trouble comi
Hi Colin,
Even assuming you came up with a good way of indexing, the example
query "Ontario, CA" should yield 3 hits. All 2, 3 and 4 are valid
retrievals. Could you please justify which 2 hits you want and why?
Thanks,
Rajesh Munavalli
On 1/27/06, Colin Young <[EMAIL PROTECTED]> wrote:
>
I'm having some trouble coming up with a good search strategy for geographical
data. e.g., given:
[1] city: London, United Kingdom
[2] city: London, Ontario, Canada
[3] city: Ontario, California, United States
[4] state: Ontario, Canada
[5] city: Vancouver, Washington, United States
[6] city: Va