Re: highlighting searched results in document

2009-05-26 Thread KK
Hi, AFAIK, the default option is to bold the matched text. If you want to do something else, say highlight it with some color, then you have to do that instead of the default bolding. The following is a working example from LIA2ndEdn, [verbatim copy] for hit highlighting. import java.io.*; i

highlighting searched results in document

2009-05-26 Thread Ritu choudhary
Hi there, I am using the Lucene highlighter to highlight the searched result, but it only shows the query string in bold. Is there any way I can use it to show the highlighted text in the document where it is found? I need to show the searched terms in highlights in the document where it

No hits while searching!

2009-05-26 Thread vanshi
In my web application, I need search functionality on first name and last name in 2 different ways, one search must be based on 'Metaphone Analyzer' giving all similar sounding names as result and another search should be exact match on either first name or last name. The name sounds like search h

Re: Searching index problems with tomcat

2009-05-26 Thread N Hira
Sorry for the confusion -- I checked the archive and I could not find a message where you have been able to open the index using Luke. Have you been able to do that? I see that you have reported the creation of 3 files, but does Luke recognize those files as an index and do you see the Docume

Using JBoss Cache as directory for Apache Lucene

2009-05-26 Thread Artyom Sokolov
Hello. Has anyone tried to store Lucene index in JBoss Cache? Are there any good implementations of Lucene Directory for it? I found http://viewvc.jboss.org/cgi-bin/viewvc.cgi/jbosscache/jbosscache-lucene/jbosscache/src/java/org/apache/lucene/store/jbosscache/ but I can't find any documentation or

Re: Searching index problems with tomcat

2009-05-26 Thread Marco Lazzara
*Does the part of the web app that is responsible for searching have permissions to read "/home/marco/testIndex"?* Yes, it does. It can read everywhere. *Could you add some code to your searching app to print out the directory listing to confirm?* I've already posted them. See May 19. *Also, I may

Re: Searching index problems with tomcat

2009-05-26 Thread N Hira
Marco, Does the part of the web app that is responsible for searching have permissions to read "/home/marco/testIndex"? Could you add some code to your searching app to print out the directory listing to confirm? Also, I may have missed this posting, but could you provide the answer from Ste

Re: Searching index problems with tomcat

2009-05-26 Thread Marco Lazzara
I tried different things. I tried to create the index without the web application, and I tried to create the index with a webapp, and the index was created without any problem. But the search always returns no results. For example, if the folder I'm searching on is empty, the webapp catches an exception: "no

Re: How to extract 15/20 words around the matched query after getting results from lucene searcher?

2009-05-26 Thread Grant Ingersoll
On May 25, 2009, at 1:34 AM, KK wrote: Also people are talking about something called spanQueries/termvectors etc. to use for this purpose. I'm still to get the exact idea of how to do this. I just blogged up a quick little demo (including full code) of this at http://www.lucidimagination.c

PNW Hadoop + Apache Cloud Stack Meetup, Wed. May 27th:

2009-05-26 Thread Bradford Stephens
Greetings, This is a friendly reminder that the 1st meetup for the PNW Hadoop + Apache Cloud Stack User Group is THIS WEDNESDAY at 6:45pm. We're very excited to have everyone attend! University of Washington, Allen Center Room 303, at 6:45pm on Wednesday, May 27, 2009. I'm going to put together a

Re: Most frequently indexed term

2009-05-26 Thread Preetham Kajekar
Have a look at http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-in-a-lucene-index (I have not tried the above out) Ganesh wrote: Hello All, I need to build some stats. I need to know the top 5 most frequently indexed terms in a date range (in a day or a month)
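As a rough illustration of the counting step behind a "top 5 terms" statistic, here is a minimal sketch in plain Java. It does not use the Lucene API and is not the approach from the linked answer; it assumes the terms for the date range have already been collected into a list, which is the part that depends on the index. The names TopTerms and top are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: ranks already-extracted terms by how often they occur.
final class TopTerms {
    // Returns up to n terms, most frequent first; ties are broken alphabetically.
    static List<String> top(List<String> terms, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : terms) {
            counts.merge(term, 1, Integer::sum); // increment this term's count
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

Calling TopTerms.top(terms, 5) on the terms gathered for one day or one month would give the five most frequent ones for that range.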

Re: Hit highlighting for non-english unicode index/queries not working?

2009-05-26 Thread Erick Erickson
LowercaseFilter is part of Lucene, as are any number of other filters. The basic idea is just that *after* tokenization, there may be further transformations you want to do on each token, such as lower-casing it, stemming it, skipping it. But watch out a bit, there are token Filters and search

Re: Hit highlighting for non-english unicode index/queries not working?

2009-05-26 Thread KK
Thank you Erick. As of now I'm using WhitespaceAnalyzer with no stemming and no stop word removal. Now I feel writing a simple analyzer won't be that difficult after going through your mail. I'll give it a try. I don't have any idea about filters but I'm pretty sure it must be simple and will definitely go thr

Re: Hit highlighting for non-english unicode index/queries not working?

2009-05-26 Thread Erick Erickson
It's fairly easy to construct your own analyzer by stringing together some filters and tokenizers. LIA (1st ed) had a SynonymAnalyzer. You probably want something like (WARNING, example only, I'm not even sure it compiles!! Ripped off from the wiki) public class MyAnalyzer extends Analyzer { p

Re: Searching index problems with tomcat

2009-05-26 Thread Matthew Hall
Right.. so perhaps I'm a bit confused here. The webapp.. is consuming an index.. yes? Or, are you trying to create an index via a webapp? I was assuming that you had some sort of indexing software that you were using to first build your indexes, which the webapp then consumes. Is that your i

Re: how to get the word before and the word after the matched Term?

2009-05-26 Thread KK
Thank you very much @Grant. I used the WhitespaceAnalyzer and other highlighter methods provided for all Unicode docs and it's working fine. Thank you all. The book LIA2ndEdn helped me a lot, specifically the examples in the highlighting section. Thanks, KK. On Tue, May 26, 2009 at 4:43 PM, Gra

Re: BoostingBooleanQuery search time is very long

2009-05-26 Thread liat oren
It is a BooleanQuery that uses the boosting: I created a Similarity class that returns the payload and I create the query in the following way: BooleanQuery bq = new BooleanQuery(); String[] splitWorlds = worlds.split(" "); for(int i = 0; i < splitWorlds.length; i++) { if(wordsWorlds

Re: relevance function for scores

2009-05-26 Thread Joel Halbert
Yes, something like this might work, although rather than having a cutoff determined by the difference between two successive document scores (Doc(n) and Doc(n-1)) I was thinking of using a function which looked at the distribution of the scores of all matching documents. Since I just want to exclu

Re: how to get the word before and the word after the matched Term?

2009-05-26 Thread Grant Ingersoll
On May 25, 2009, at 4:35 AM, KK wrote: One more piece of information I would like to add, # I'm building the index mostly for non-English texts/documents, and searching is done using Unicode UTF-8 texts [it's obvious, right?] Yes, searching should be fine. Thanks KK On Mon, May 25, 2009 at 10:58 A

Re: how to get the word before and the word after the matched Term?

2009-05-26 Thread Grant Ingersoll
On May 25, 2009, at 1:28 AM, KK wrote: Hi All. I want to do the same thing with say a window of 10/15. Can someone give me more details about how to do this, i.e. getting neighbors [both sides] of size "window"? If some examples are there please point me to them/post in the mail. Also I would
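To make the "window of neighbors on both sides" idea concrete, here is a minimal sketch in plain Java. It is not the span-query or term-vector approach discussed in this thread and uses no Lucene classes; it simply splits the stored text on whitespace and assumes an exact token match. The names ContextWindow and around are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: extracts the words surrounding each occurrence of a term.
final class ContextWindow {
    // Returns one snippet per exact match of term, with up to `window`
    // whitespace-separated tokens on each side of the match.
    static List<String> around(String text, String term, int window) {
        String[] tokens = text.trim().split("\\s+");
        List<String> snippets = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                int from = Math.max(0, i - window);              // clamp at start of text
                int to = Math.min(tokens.length, i + window + 1); // clamp at end of text
                snippets.add(String.join(" ", Arrays.asList(tokens).subList(from, to)));
            }
        }
        return snippets;
    }
}
```

With a window of 10 or 15 this returns the matched word plus that many neighbors on each side, truncated at the document boundaries. Because it splits only on whitespace, it makes no assumption about the script of the text.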

Re: BoostingBooleanQuery search time is very long

2009-05-26 Thread Grant Ingersoll
What's a BoostingBooleanQuery? On May 24, 2009, at 7:09 AM, liat oren wrote: Hi, I have an index of 3 million documents. I perform a regular search, using an analyzer and get the results within 1-2 minutes. When I create a boostingBooleanQuery, and search within the index using a similiari

Re: New user in lucene

2009-05-26 Thread KK
Hi Stanley, As Alexander correctly mentioned, you should first try your hand with Solr [which uses Lucene at the backend], keeping you away/free from all the nitty-gritty. First get Solr running, making sure you are able to post/index documents in it and also able to search it from the solr adm

Re: New user in lucene

2009-05-26 Thread Alexander Aristov
As I said, try Solr. It has a web interface which can give you ideas. There is no reason to complete it here in email. If you know Java you will quickly catch up. Best Regards Alexander Aristov 2009/5/26 StanleyTan > > > Hi Alexander, > > thanks for your advise. but im thinking how am i supp

Re: Hit highlighting for non-english unicode index/queries not working?

2009-05-26 Thread KK
Thank you @Muir. I was earlier using SimpleAnalyzer for all purposes, but as you recommended the whitespace one, I tried that analyzer, and the good thing is that I'm able to index/search non-English text as well as support hit highlighting for these non-English texts. Thank you very much. B

Re: New user in lucene

2009-05-26 Thread StanleyTan
Hi Alexander, thanks for your advice, but I'm wondering how I am supposed to integrate it. I tried to Google it, and after downloading the zip folder from the Lucene site I can see the files, but I do not know how to integrate them. Oh man, I'm lost. Sorry, but can you guide me on how to integrate into html page