Re: Getting all terms from Index

2005-07-01 Thread Erik Hatcher
On Jul 1, 2005, at 6:58 PM, Yousef Ourabi wrote: Hello all, I am getting odd results: documents without the keyword I search for are being returned. You really have to provide some details for this list to be helpful. I want to get all the indexed (searchable) terms inside the index, then create

Getting all terms from Index

2005-07-01 Thread Yousef Ourabi
Hello all, I am getting odd results: documents without the keyword I search for are being returned. I want to get all the indexed (searchable) terms inside the index, then create queries for those words with explanation (Explanation object) set, to see what matches what and why. Thanks in advance, You

Re: UTF-8 indexing and searching

2005-07-01 Thread Paul Libbrecht
Careful: in the HTTP world there's an ambiguity. x-www-form-urlencoded does not specify the character encoding of the bytes represented by the %-escaped sequences. That's fixed by the very recent URI spec, where absence means UTF-8... My experience was that Tomcat simply con
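The ambiguity described above can be reproduced with the JDK alone. The following is a minimal sketch, not Tomcat's actual behaviour: the class name `FormDecodingDemo` and the sample value `caf%C3%A9` (the UTF-8 bytes of "café", %-escaped) are assumptions made for illustration. The same escaped bytes yield different strings depending on which charset the decoder is told to use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodingDemo {
    // Decode a %-escaped form value, interpreting the escaped bytes in the given charset.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "caf%C3%A9"; // hypothetical query value, escaped from UTF-8 bytes
        // Same bytes, two interpretations: only one of them is the intended text.
        System.out.println(decode(raw, "UTF-8"));
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

Decoding as ISO-8859-1 turns the two bytes C3 A9 into two separate characters, which is why a search term can silently stop matching what was indexed.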

Re: UTF-8 indexing and searching

2005-07-01 Thread pierre.conti
Did you check that the request string you get at the analyzer level is correctly encoded as UTF-8? We had the same problem with French accented characters encoded as UTF-8 and transmitted by Tomcat as ISO-8859-1. It was just for a test, so we didn't investigate much, but re-encode in URL/ISO-8
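One common way to undo this kind of mis-decoding is sketched below. It assumes the string really was produced by reading UTF-8 bytes as ISO-8859-1; the class and method names are hypothetical, and this is not necessarily the fix used in the message above. Because ISO-8859-1 maps every byte to exactly one character, the original bytes can be recovered without loss and decoded again as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Recover the original bytes from a string that was wrongly decoded as
    // ISO-8859-1, then decode those bytes as UTF-8.
    public static String reinterpretAsUtf8(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00c3\u00a9" is what the UTF-8 bytes of "\u00e9" (e with acute accent)
        // look like after being read as ISO-8859-1.
        String broken = "\u00c3\u00a9";
        System.out.println(reinterpretAsUtf8(broken).equals("\u00e9"));
    }
}
```

Applying this to a string that was decoded correctly in the first place will corrupt it, so it only makes sense once the encoding mismatch has been confirmed.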

Re: Sentence and Paragraph searching

2005-07-01 Thread Paul Elschot
On Friday 01 July 2005 20:52, McCallie,David wrote: > > Couldn't you use SpanQuery for something like this? Put special > and tokens around each sentence, > and then search for the specific key words inside of the outer SPAN? Do > the same for paragraphs, sections, etc. > > I tried this once,

Re: Sentence and Paragraph searching

2005-07-01 Thread Paul Elschot
On Friday 01 July 2005 21:13, Dave Kor wrote: > Quoting Peter Laurinc <[EMAIL PROTECTED]>: > > > Hi, > > > > I'm a newbie to Lucene. > > I want to ask how to implement a search for a phrase that must be in > > one sentence/paragraph. > > I did see some examples that use term position changing, but I think

Re: Vedr. Re: Design question [too many fields?]

2005-07-01 Thread markharw00d
about 4900 room units which I think is OK as far as Still we have optimization work to do. Assuming your availability is a year in advance and yours is a reputable chain of hotels that books rooms by the day (not the hour!), you only need 4900 * 365 bits of true/false info to cache all the ava
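The arithmetic above works out to 4900 * 365 = 1,788,500 bits, which is roughly 220 KB. A minimal sketch of such a cache using `java.util.BitSet` follows; the class name, the row-per-room layout, and the day-offset convention are assumptions made for illustration, not the design from the message.

```java
import java.util.BitSet;

public class AvailabilityCache {
    private final int days;
    private final BitSet free; // bit (room * days + day) is set when the room is free that day

    public AvailabilityCache(int roomUnits, int days) {
        this.days = days;
        this.free = new BitSet(roomUnits * days);
    }

    // Mark one room as free or booked on one day (day is an offset from "today").
    public void setFree(int room, int day, boolean isFree) {
        free.set(room * days + day, isFree);
    }

    // True if the room is free on every day in [fromDay, toDay).
    // No bounds checking is done in this sketch.
    public boolean isFree(int room, int fromDay, int toDay) {
        int start = room * days + fromDay;
        int end = room * days + toDay;
        return free.nextClearBit(start) >= end;
    }

    public static void main(String[] args) {
        AvailabilityCache cache = new AvailabilityCache(4900, 365);
        cache.setFree(10, 100, true);
        cache.setFree(10, 101, true);
        System.out.println(cache.isFree(10, 100, 102));
        System.out.println(cache.isFree(10, 100, 103));
    }
}
```

A date-range check then becomes a scan over a handful of bits per room, with no extra fields or range queries needed in the index.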

Vedr. Re: Design question [too many fields?]

2005-07-01 Thread Naimdjon Takhirov
Guys, thanks for your input. I think the solution Mark has suggested solves the problem in an acceptable way. It's actually going to be a little better than the solution the customer has right now. Apart from the availability we also have to check if there is any price for room units saved in

Re: Sentence and Paragraph searching

2005-07-01 Thread Dave Kor
Quoting Peter Laurinc <[EMAIL PROTECTED]>: > Hi, > > I'm a newbie to Lucene. > I want to ask how to implement a search for a phrase that must be in > one sentence/paragraph. > I did see some examples that use term position changing, but I think > that this is not the way, because it breaks classic proximit

RE: Sentence and Paragraph searching

2005-07-01 Thread McCallie,David
Couldn't you use SpanQuery for something like this? Put special and tokens around each sentence, and then search for the specific key words inside of the outer SPAN? Do the same for paragraphs, sections, etc. I tried this once, and it seemed to work. I'm not sure of the performance penalty of

Re: Design question [too many fields?]

2005-07-01 Thread Chris Lu
Erik, Mark and Naimdjon, Sorry, I totally misunderstood the question about multiple dates for a Document. I have come to agree with Erik and Mark on this problem. I was trying to find a generic solution to Lucene's limitation: the TooManyClauses problem when using RangeQuery and there are more th

RE: Sentence and Paragraph searching

2005-07-01 Thread Peter Laurinc
Maybe the solution is for each term to have not only a position but also something like a vector. Then you can "vectorize" it: term 1 has vector (1, 1), term 2 has vector (1, 1) (paragraph 1, sentence 1 of this paragraph), term 3 has (1, 2). If you set a query for searching in a paragraph/sentence you only set what
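The (paragraph, sentence) idea above can be sketched in plain Java. This is not Lucene functionality and not an implementation from the thread: the class name, the naive splitting rules (blank lines separate paragraphs, `.`, `!` and `?` end sentences) and the helper `inSameSentence` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SentenceCoordinates {
    // Map each term to its list of {paragraph, sentence} pairs (both 1-based,
    // with sentences numbered within their paragraph).
    public static Map<String, List<int[]>> index(String text) {
        Map<String, List<int[]>> coords = new HashMap<>();
        String[] paragraphs = text.split("\\n\\s*\\n");
        for (int p = 0; p < paragraphs.length; p++) {
            int sentenceNo = 0;
            for (String sentence : paragraphs[p].split("[.!?]+")) {
                String trimmed = sentence.trim();
                if (trimmed.isEmpty()) continue;
                sentenceNo++;
                for (String token : trimmed.toLowerCase(Locale.ROOT).split("\\W+")) {
                    if (token.isEmpty()) continue;
                    coords.computeIfAbsent(token, k -> new ArrayList<>())
                          .add(new int[] {p + 1, sentenceNo});
                }
            }
        }
        return coords;
    }

    // True if terms a and b occur together in at least one sentence.
    public static boolean inSameSentence(Map<String, List<int[]>> coords, String a, String b) {
        for (int[] ca : coords.getOrDefault(a, new ArrayList<>())) {
            for (int[] cb : coords.getOrDefault(b, new ArrayList<>())) {
                if (ca[0] == cb[0] && ca[1] == cb[1]) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<int[]>> idx =
            index("Lucene indexes text. Queries find terms.\n\nSpans match sentences.");
        System.out.println(inSameSentence(idx, "lucene", "text"));
        System.out.println(inSameSentence(idx, "lucene", "queries"));
    }
}
```

Comparing only the first coordinate would give a same-paragraph test instead, which is the sense in which a query would only need to say which component of the vector must match.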

UTF-8 indexing and searching

2005-07-01 Thread Faulkner, Jeffrey
I'm trying to index and search HTML and JSP files that are saved using UTF-8 encoding. The pages are indexed on the file system using the StandardAnalyzer. The files can contain a mix of English, Chinese, Japanese, etc. saved as UTF-8. Searches using English terms are successful but none of the

Re: Sentence and Paragraph searching

2005-07-01 Thread Erik Hatcher
On Jul 1, 2005, at 8:16 AM, Peter Laurinc wrote: Hi, I'm a newbie to Lucene. I want to ask how to implement a search for a phrase that must be in one sentence/paragraph. I did see some examples that use term position changing, but I think that this is not the way, because it breaks classic proximity se

RE: free text search with numbers

2005-07-01 Thread BOUDOT Christian
Thanks for the link. Cheers Chris -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 01 July 2005 15:11 To: java-user@lucene.apache.org Subject: Re: free text search with numbers On Jul 1, 2005, at 8:06 AM, BOUDOT Christian wrote: > It is the first time that I imple

Re: free text search with numbers

2005-07-01 Thread Erik Hatcher
On Jul 1, 2005, at 8:06 AM, BOUDOT Christian wrote: It is the first time that I implement a search with Lucene, so please don't laugh if my question seems trivial. When I enter some text in my free text search the query gets built correctly, but when I enter a number (as a string) the query parse

Sentence and Paragraph searching

2005-07-01 Thread Peter Laurinc
Hi, I'm a newbie to Lucene. I want to ask how to implement a search for a phrase that must be in one sentence/paragraph. I did see some examples that use term position changing, but I think that this is not the way, because it breaks classic proximity search. (if one word is at the end and the second at the beginning

free text search with numbers

2005-07-01 Thread BOUDOT Christian
Hi folks, It is the first time that I implement a search with Lucene, so please don't laugh if my question seems trivial. When I enter some text in my free text search the query gets built correctly, but when I enter a number (as a string) the query parser seems to ignore it. What am I doing wrong?

Re: Does highlighter highlight phrases only?

2005-07-01 Thread Erik Hatcher
On Jun 30, 2005, at 4:35 PM, markharw00d wrote: Hi Erik, Yes I was thinking that code could form the basis of a new highlighter. I've just attached a QuerySpansExtractor to the bugzilla entry for the new highlighter. This class produces Spans from queries other than SpanXxxxQueries eg p

Re: Design question [too many fields?]

2005-07-01 Thread Erik Hatcher
On Jun 30, 2005, at 11:27 PM, Chris Lu wrote: Mark, your suggestion will incur another trip to the database. And if the search result set is large, filtering in the DB by pk is not really good. Chris - I disagree with that last comment. It can be a great solution when the filter is cached. Cer

[Project-advertisement] myDbSearcher.

2005-07-01 Thread Harald Stowasser
Hello Lucene list readers. We have decided to put my MySQL-Lucene-search-engine under the GPL. Maybe you can use it: http://sourceforge.net/projects/mydbsearcher/ The first implementation is running here: http://www.idowa.de/ueberblick/suche/index_html (German newspaper) And it would be nice, if an

Re: Vedr. Re: Design question [too many fields?]

2005-07-01 Thread Chris Lu
> It is anyway going to be too many fields then? Days of > year for the whole year ahead? Since the fromDate and > toDate can be across two months and the customer wants > the data be available for one year. It won't have too many fields. > > My suggestion is, use "year" + "month" + "day" three >