Re: Help with indexing and query strategy

2006-02-11 Thread rrshwrk
Hi all, I have a small problem and am unable to figure out how to solve it. I cannot figure out how to compute the similarity (similar to the score given by Lucene) between a particular document indexed by Lucene and a given query string. I know that there is a function called search which computes
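As an illustration of scoring one document against one query string, here is a minimal sketch in plain Java. It computes a cosine similarity over raw term counts, which is an assumption chosen for simplicity and is not Lucene's scoring formula; the class and method names are hypothetical. Within Lucene itself, the searcher's explain(Query, int) method reports the score breakdown for a single document, which may be closer to what the poster is after.

```java
import java.util.HashMap;
import java.util.Map;

class CosineSketch {
    // Count how often each lower-cased, whitespace-separated term occurs.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Sum of squared counts, used for the vector length.
    static double sumSquares(Map<String, Integer> counts) {
        double sum = 0.0;
        for (int c : counts.values()) {
            sum += (double) c * c;
        }
        return sum;
    }

    // Cosine of the angle between the two term-count vectors (0 when either is empty).
    static double similarity(String document, String query) {
        Map<String, Integer> d = termCounts(document);
        Map<String, Integer> q = termCounts(query);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : q.entrySet()) {
            dot += e.getValue() * d.getOrDefault(e.getKey(), 0);
        }
        double norm = Math.sqrt(sumSquares(d)) * Math.sqrt(sumSquares(q));
        return norm == 0.0 ? 0.0 : dot / norm;
    }

    public static void main(String[] args) {
        System.out.println(similarity("new york, NY, united states of america", "new york"));
    }
}
```

The sketch ignores inverse document frequency, field boosts and length normalization, so its numbers will not match the scores Lucene returns from a search.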

RE: Help with indexing and query strategy

2006-01-30 Thread Colin Young
ginal Message- From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] Sent: 30 January, 2006 18:38 To: java-user@lucene.apache.org Subject: Re: Help with indexing and query strategy Have you considered evaluating doc-score thresholds for limiting your results? Since the perfect answers to these situa
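The score-threshold suggestion can be sketched as follows. This is only an illustration in plain Java: ScoredDoc is a hypothetical stand-in for whatever the search returns, and cutting off relative to the best score (rather than at a fixed value) is an assumption, not something stated in the thread.

```java
import java.util.ArrayList;
import java.util.List;

class ScoreThresholdSketch {
    // Hypothetical holder for one search result: a document id and its score.
    static final class ScoredDoc {
        final int docId;
        final float score;

        ScoredDoc(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Keep only results whose score is at least `fraction` of the best score.
    static List<ScoredDoc> aboveThreshold(List<ScoredDoc> results, float fraction) {
        float best = 0f;
        for (ScoredDoc r : results) {
            best = Math.max(best, r.score);
        }
        List<ScoredDoc> kept = new ArrayList<>();
        for (ScoredDoc r : results) {
            if (r.score >= best * fraction) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<ScoredDoc> results = new ArrayList<>();
        results.add(new ScoredDoc(1, 1.0f));
        results.add(new ScoredDoc(2, 0.6f));
        results.add(new ScoredDoc(3, 0.2f));
        System.out.println(aboveThreshold(results, 0.5f).size());
    }
}
```

A suitable fraction would have to be found by experiment on real queries, since scores are not comparable across different queries.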

RE: Help with indexing and query strategy

2006-01-30 Thread Colin Young
t I'm pretty sure it's not the tokenizing since I'm seeing the problem with single-word entries. Thanks for your assistance. It's been very helpful in getting me this far. Colin -Original Message- From: Rajesh Munavalli [mailto:[EMAIL PROTECTED] Sent: 30 Januar

Re: Help with indexing and query strategy

2006-01-30 Thread Jeff Rodenburg
BooleanClause.Occur.SHOULD); > } > > BooleanQuery geographyQuery = new BooleanQuery(); > if (typeToFind != "any") geographyQuery.add(entityType, > BooleanClause.Occur.MUST); > geographyQuery.add(query, BooleanClause.Occur.MUST)
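A side note on the quoted code, assuming it is Java (the thread is on java-user, but the snippet is truncated, so this is not certain): typeToFind != "any" compares object references rather than string contents, so the test can be true even when typeToFind holds the text "any". A minimal sketch of a content-based check follows; the helper name hasTypeFilter is hypothetical.

```java
class StringCompareSketch {
    // True when the caller asked for a specific entity type rather than "any".
    // Calling equals on the literal also treats a null argument as "not any".
    static boolean hasTypeFilter(String typeToFind) {
        return !"any".equals(typeToFind);
    }

    public static void main(String[] args) {
        // A string built at runtime is a different object from the literal "any".
        String fromUser = new String("any");
        System.out.println(hasTypeFilter(fromUser));
        System.out.println(hasTypeFilter("city"));
    }
}
```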

Re: Help with indexing and query strategy

2006-01-30 Thread Rajesh Munavalli
For now, the best I could come up with is the following scheme.
SAMPLE DOCUMENTS: Let's say there are four documents:
Doc1: st louis, missouri, usa
Doc2: st louis du ha ha, quebec, canada
Doc3: new york, NY, united states of america
Doc4: ny, usa
INDEX PHASE: -

RE: Help with indexing and query strategy

2006-01-27 Thread Colin Young
). 4) We don't want to return Albany unless the user has Albany in the query. Thanks again for looking at this. Colin -Original Message- From: Rajesh Munavalli [mailto:[EMAIL PROTECTED] Sent: 27 January, 2006 17:04 To: java-user@lucene.apache.org Subject: Re: Help with indexing an

Re: Help with indexing and query strategy

2006-01-27 Thread Rajesh Munavalli
A few questions.
(1) Does each document contain only one geographical location?
(2) Given a document, how are you tokenizing it into city, state and country? I am assuming "," as the delimiter here. Otherwise determining the boundary for names like "St. Louis du Ha Ha" would be difficult.
(3) Are t
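The comma-delimiter assumption in question (2) can be sketched like this, in plain Java with a hypothetical helper name. Splitting on "," and trimming keeps a multi-word name such as "St. Louis du Ha Ha" in one piece.

```java
import java.util.ArrayList;
import java.util.List;

class LocationSplitSketch {
    // Split on commas and trim each part, so multi-word names stay intact.
    static List<String> splitLocation(String raw) {
        List<String> parts = new ArrayList<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitLocation("St. Louis du Ha Ha, Quebec, Canada"));
    }
}
```

The position of a part does not by itself say whether it is a city, state or country: an entry such as "ny, usa" yields only two parts, so the mapping to fields still has to come from the data.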

RE: Help with indexing and query strategy

2006-01-27 Thread Colin Young
uery.add(query, BooleanClause.Occur.MUST); QueryFilter filter = new QueryFilter(filterQuery); Hits hits = searcher.search(geographyQuery, filter); return hits; } -Original Message- From: Rajesh Munavalli [mailto:[EMAIL PROTECTED] Sent: 27 January, 2006 14

Re: Help with indexing and query strategy

2006-01-27 Thread Rajesh Munavalli
Hi Colin, Even assuming you came up with a good way of indexing, the example query "Ontario, CA" should yield 3 hits. All 2, 3 and 4 are valid retrievals. Could you please justify which 2 hits you want and why? Thanks, Rajesh Munavalli Colin Young wrote: I'm having some trouble comi

Re: Help with indexing and query strategy

2006-01-27 Thread Rajesh Munavalli
Hi Colin, Even assuming you came up with a good way of indexing, the example query "Ontario, CA" should yield 3 hits. All 2, 3 and 4 are valid retrievals. Could you please justify which 2 hits you want and why? Thanks, Rajesh Munavalli On 1/27/06, Colin Young <[EMAIL PROTECTED]> wrote: >

Help with indexing and query strategy

2006-01-27 Thread Colin Young
I'm having some trouble coming up with a good search strategy for geographical data. e.g., given:
[1] city: London, United Kingdom
[2] city: London, Ontario, Canada
[3] city: Ontario, California, United States
[4] state: Ontario, Canada
[5] city: Vancouver, Washington, United States
[6] city: Va