Re: How to extract 15/20 words around the matched query after getting results from lucene searcher?

2009-05-24 Thread KK
Thanks for your quick response, Seid. There is one more mail I found in the archive (3 or 4 days old) where someone asked about extracting 3 neighboring words around the match. I think once you have the position of the matching term/phrase, extracting 3 or 30 neighbors won't be different, right? because

Re: how to get the word before and the word after the matched Term?

2009-05-24 Thread KK
Hi All. I want to do the same thing with, say, a window of 10 or 15. Can someone give me more details about how to do this, i.e. getting neighbors (on both sides) of size "window"? If there are some examples, please point me to them or post them in the mail. Also, I would like to know about the term query. Is it the c

Re: How to extract 15/20 words around the matched query after getting results from lucene searcher?

2009-05-24 Thread Seid Muhie
For my thesis work (Question Answering) I used to retrieve the document first and then work in Java to extract the needed answer. In your case, what you will do is first locate the positions of the query terms in the document (in this case they might be distributed throughout the document - hence d
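
A minimal sketch of the "locate the term positions, then take the neighbors" idea described in the snippet above. It is plain Java and does not use any Lucene API; the class name `ContextWindow`, the method `snippets`, and the whitespace tokenization are assumptions made for illustration, and a real index would normally be tokenized by its analyzer rather than by splitting on whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextWindow {
    // Hypothetical helper: returns one snippet per occurrence of `term`,
    // each holding up to `window` words on either side of the match.
    // Assumption: words are separated by whitespace and matched case-insensitively.
    static List<String> snippets(String text, String term, int window) {
        String[] words = text.trim().split("\\s+");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (words[i].equalsIgnoreCase(term)) {
                // Clamp the window to the document boundaries.
                int start = Math.max(0, i - window);
                int end = Math.min(words.length, i + window + 1);
                result.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String doc = "one two three four five six seven";
        // Two words on each side of "four".
        System.out.println(snippets(doc, "four", 2));
    }
}
```

Changing the `window` argument from 3 to 30 only changes how far the clamped range extends, which is why the window size should not affect the approach once the match position is known.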

How to extract 15/20 words around the matched query after getting results from lucene searcher?

2009-05-24 Thread KK
Hi All, I'm trying to index some non-English web pages. I'm keeping all the content of each page in a single field, and the searches are working fine. When I search for some query it returns the complete page, which is expected. Now I want to restrict the displayed results to, say, 20 wor

Re: Parsing large xml files

2009-05-24 Thread crackeur
Yes, that is something worth thinking about. Thanks for bringing this up... - Original Message - From: "Michael Wechner" To: java-user@lucene.apache.org Sent: Friday, May 22, 2009 11:41:51 AM GMT -08:00 US/Canada Pacific Subject: Re: Parsing large xml files crack...@comcast.net

Re: Which analyzer to use for non-english unicoded text?

2009-05-24 Thread Erick Erickson
I don't think there's anything you can use out of the box, but if you search the mail archives (see searchable archives) for a thread titled "Hebrew and Hindi analyzers" you might find something useful. Not much help, I know, but perhaps a place to start. And yes, you should use the same analyzer

BoostingBooleanQuery search time is very long

2009-05-24 Thread liat oren
Hi, I have an index of 3 million documents. I perform a regular search, using an analyzer, and get the results within 1-2 minutes. When I create a boostingBooleanQuery and search the index using a similarity whose scorePayload returns the boosting value, the search takes about 10 minutes.

Re: Searching index problems with tomcat

2009-05-24 Thread Marco Lazzara
OK, I solved the problem I posted before. I ran the web app. It creates the index in folder /home/marco/testIndex with 3 files: -rw-r--r-- 1 marco marco 4043 2009-05-24 12:00 _5.cfs -rw-r--r-- 1 marco marco 58 2009-05-24 12:00 segments_c -rw-r--r-- 1 marco marco 20 2009-05-24 12:00 segments

Re: Searching index problems with tomcat

2009-05-24 Thread Marco Lazzara
Hi. At step 2 I have only 3 files in the folder, but I think that is not a problem. I've tried to create the index in the web app and not only in the standalone application, but something fails. Tomcat reports this error: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.ramdir