Re: highlighting searched results in document

2009-05-27 Thread KK
Yes, getBestFragments() returns up to "fragmentcount" matched fragments, each separated by the "fragmentseparator". What exactly do you mean by "highlight the searched word in the document"? What is this document? First let us know what exactly you want to do with the search results. --KK On

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
The output I want is to show the text I searched for, let's say "search", in the document where it occurred. Like this: the text i want to search is this and it should be highlighted as this . Search currently is giving just the word "search". The result I want is to highlight the searched word in the d

Re: highlighting searched results in document

2009-05-27 Thread KK
What exactly is your requirement? Displaying the final search results in a webpage, or something else? The results that you are getting are correct. Now you have to decide what you want to do with them. I thought you were trying to show the results in a webpage. --KK On Thu, May 28, 2009 at 11:54 AM

Re: relevance function for scores

2009-05-27 Thread kenny kim
Hi, Joel. You are right. I've been trying to find a method to reduce search time by filtering out docs before calculating it at the run-time. It is a little bit different from yours. But I think your approach might be helpful to improve search quality of my search apps if it works enough sp

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
No, I am doing it in Eclipse Ganymede. On 28/05/2009, KK wrote: > Forgot: > Are you trying all this from the command line? Because that's when you get the > output as unprocessed HTML, those span tags; when you pass the same to > display the content as a webpage they will be processed by the browser and

Re: highlighting searched results in document

2009-05-27 Thread KK
Forgot: Are you trying all this from the command line? Because that's when you get the output as unprocessed HTML, those span tags; when you pass the same to display the content as a webpage they will be processed by the browser and you will see the colored matches. --KK On Thu, May 28, 2009 at 11:49

Re: highlighting searched results in document

2009-05-27 Thread KK
Yes, that's the expected output. Now put that full content [whatever the searcher returned] in the HTML page along with the styling for the same, and you will see the matches in yellow [you chose yellow as the color for highlighting]. --KK On Thu, May 28, 2009 at 11:42 AM, Ritu choudhary wrote: > I h
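A minimal sketch of the step described above, assuming the highlighter was configured to wrap matches in a span with a CSS class. The class name "hl", the Java class and method names, and the sample fragment are all made up for illustration; only the idea of putting the returned fragment into a styled HTML page comes from the thread.

```java
// Hypothetical helper: wraps an already-highlighted fragment in a small HTML page.
// The CSS class "hl" is an assumption; use whatever tags your formatter emits.
class HighlightPageSketch {

    // Builds a page whose stylesheet colors every <span class="hl"> yellow.
    static String toHtmlPage(String highlightedFragment) {
        return "<html><head><style>.hl { background-color: yellow; }</style></head>"
             + "<body><p>" + highlightedFragment + "</p></body></html>";
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by the highlighter.
        String fragment = "the text i want to <span class=\"hl\">search</span> is this";
        System.out.println(toHtmlPage(fragment));
    }
}
```

Printed to a console (or the Eclipse console) this is still raw markup; the yellow only shows up once a browser renders the page.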

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
I have added the lines you suggested and now it's giving the following output, but I still can't see what's wrong... THE CHANGES I HAVE DONE: SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("", ""); Highlighter highlighter = new Highlighter(formatter, new QueryScorer(query

Re: Top N Phrases in subset of documents

2009-05-27 Thread Preetham Kajekar
http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-in-a-lucene-index tomm...@aim.com wrote: Hi All, I need to determine top words/phrases in my documents, and currently using the ShingleAnalyzerWrapper for indexing. Through Luke it seems the top terms

Re: highlighting searched results in document

2009-05-27 Thread KK
Yes, your code is wrong! Where is the highlighter span/formatter? From your code what I can see is that you are just passing the QueryScorer; instead you should pass both the QueryScorer and the formatter. From my previous mail you can see the following code and mimic the same and i

Re: Using JBoss Cache as directory for Apache Lucene

2009-05-27 Thread Artyom Sokolov
Well, I'm just playing with these and trying to create a distributed search engine which could store its index in JBoss Cache (data grid) and manage the index with GridGain (compute grid) intelligently. Consider one has really huge index(es) to search (tens or hundreds of gigabytes). One idea is to distribut

Re: Searching index problems with tomcat

2009-05-27 Thread N Hira
Cool! 1. So you are creating a parser with { name, synonyms, propIn }, correct? 2. Sorry -- I meant the output of "query.toString()"; I'm expecting to see something like this when the sentence parameter is set to philipcimiano: name:philipcimiano synonyms:philipcimiano propIn:philipcimian
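For comparison with whatever query.toString() actually prints, the expected shape described above can be reproduced in a few lines of plain Java. This is only a sketch: it builds the string by hand and does not call QueryParser, and the class and method names are invented for illustration. The field names and the term are the ones mentioned in the thread.

```java
import java.util.StringJoiner;

// Hand-built illustration of the "field:term field:term ..." shape; not a Lucene call.
class ExpectedQuerySketch {

    // Joins one "field:term" clause per field, separated by single spaces.
    static String expectedQueryString(String[] fields, String term) {
        StringJoiner joiner = new StringJoiner(" ");
        for (String field : fields) {
            joiner.add(field + ":" + term);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"name", "synonyms", "propIn"};
        System.out.println(expectedQueryString(fields, "philipcimiano"));
    }
}
```

If the real query.toString() shows a single field, or a term with different casing, that difference is the first thing to look at.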

Re: Searching index problems with tomcat

2009-05-27 Thread Marco Lazzara
The output of philipcimiano is ex#pub1-author-ex#res2-name-philipcimiano I've made a bad copy-paste. This is the full class public class RDFinder { private Analyzer analyzer; private Directory directory; private IndexSearcher isearc

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
Hi there. Perhaps I'm misreading this, but you are not using the "Field" parameter for query construction, are you? In other words, the default field used to construct the QueryParser is what's being used for your query, correct? Could you post: 1. The code used to construct the QueryPa

Re: Searching index problems with tomcat

2009-05-27 Thread Marco Lazzara
public TreeMap> Search(String sentence, String Field) throws ParseException, IOException{ query = parser.parse(sentence); try { FileWriter fw = new FileWriter ("paths"); BufferedWriter bw = new BufferedWriter (fw); outFile = new PrintWriter (bw);

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
Thanks. Could you also post the code for RDFinder.Search() and the output from query.toString() when text is "PHILIPCIMIANO"? -h On 27-May-2009, at 12:40 PM, Marco Lazzara wrote: String[] fieldsearch = new String[] {"name", "synonyms", "propIn"}; RDFinder rdfind = new RDFinder("/home/m

Re: Searching index problems with tomcat

2009-05-27 Thread Marco Lazzara
String[] fieldsearch = new String[] {"name", "synonyms", "propIn"}; RDFinder rdfind = new RDFinder("/home/marco/testIndex",fieldsearch); try { this.paths = this.rdfind.Search(text, "path"); } catch (ParseException e1) { e1.printStackTrace();

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
Okay -- that helps. So we know that searching the same files with Luke works, but with the web app does not. Can you please re-post the fragment of code that opens your index and uses the query? If you haven't already done this, could you also use query.toString() to confirm the query?

Top N Phrases in subset of documents

2009-05-27 Thread tommyha
Hi All, I need to determine top words/phrases in my documents, and currently using the ShingleAnalyzerWrapper for indexing. Through Luke it seems the top terms are correct for the whole index. Is it possible to determine the top terms for a subset of documents in the index? Or do I need to cre
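To make the question concrete, here is a rough plain-Java sketch of "top terms for a subset of documents". It is a stand-in, not the Lucene approach: it tokenizes raw strings on non-word characters instead of reading terms from the index, it counts single words rather than shingles, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Stand-in illustration: counts terms over a chosen subset of document texts.
class TopTermsSketch {

    // Returns the n most frequent lowercased tokens; ties are broken alphabetically.
    static List<String> topTerms(List<String> subsetTexts, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : subsetTexts) {
            for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> subset = List.of("lucene index search", "lucene search", "index lucene");
        System.out.println(topTerms(subset, 2)); // prints [lucene, index]
    }
}
```

Against a real index the counting loop would run over the terms of the selected documents only, which is the part that differs from the whole-index view in Luke.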

Re: Searching index problems with tomcat

2009-05-27 Thread Marco Lazzara
No. The app creates the index in a folder and I run the query in that folder. For example, if I decide to create the folder in /home/marco/testIndex, obviously I run the query on /home/marco/testIndex; if I decide to create the folder in /home/marco/RDFLUCENE, obviously I run the query on /home/marc

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
Am I coding it wrongly? Please reply.

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
Okay -- if the problem is not the number of results, then let's clarify the problem: 1. You create an index in something like: /home/marco/testIndex 2. You copy over the directory to something like: /home/marco/RDFIndexLucene 3. When you run Tomcat, your "searcher" tries to

Re: Searching index problems with tomcat

2009-05-27 Thread Marco Lazzara
In my app I obtain 3 results, but I think that is not a problem. Marco Lazzara 2009/5/27 Erick Erickson > StandardAnalyzer is fine. I loaded your index into Luke and there is > exactly > one document with philipcimiano in the name field. > There is only one document that has researcher in the name fie

Re: Searching index problems with tomcat

2009-05-27 Thread Erick Erickson
StandardAnalyzer is fine. I loaded your index into Luke and there is exactly one document with philipcimiano in the name field. There is only one document that has researcher in the name field. Both of these documents (using StandardAnalyzer) return one document (doc 12 for PHILIPCIMIANO and doc 4

Re: Apache Lucene Crawler search

2009-05-27 Thread Mark Miller
Lucene is more like a search utility library than a full-blown search engine like FAST. The Lucene subproject, Solr, is more comparable to FAST, but Solr does not have a built-in crawler available either (though it's easy enough to do basic crawls). There are many open source crawlers you could

Re: Searching index problems with tomcat

2009-05-27 Thread N. Hira
Not sure if this applies here, but that tends to happen when the analyzer you use for indexing is different from the one used in Luke or you're running into character set issues. Are you using the StandardAnalyzer in both cases? Also, could you post an example of the query you are trying?

Re: relevance function for scores

2009-05-27 Thread Joel Halbert
I'm not certain, without testing it. I think you and I may have slightly orthogonal needs. From what I gather you are looking to speed up your search time (by filtering out irrelevant results), whereas I am simply looking to increase the relevancy of the results presented to the users when they gr

Re: No hits while searching!

2009-05-27 Thread Erick Erickson
The most common issue with this kind of thing is that UN_TOKENIZED implies no case folding. So if your case differs you won't get a match. That aside, the very first thing I'd do is get a copy of Luke (google Lucene Luke) and examine the index to see if what's in your index is what you *think* is i
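The case-folding point can be shown without Lucene at all. The sketch below is a plain-Java stand-in in which a set of strings plays the role of the untokenized terms stored in the index; the names and the sample values are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Stand-in illustration: untokenized values are stored verbatim, so lookups are exact.
class CaseFoldingSketch {

    // Exact, case-sensitive lookup, mimicking a term query against an untokenized field.
    static boolean exactMatch(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    // Lowercases a value the same way on both the indexing and the query side.
    static String fold(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Set<String> verbatim = new HashSet<>();
        verbatim.add("Lucene");                      // stored exactly as given
        System.out.println(exactMatch(verbatim, "lucene")); // false: case differs

        Set<String> folded = new HashSet<>();
        folded.add(fold("Lucene"));                  // folded before storing
        System.out.println(exactMatch(folded, fold("LUCENE"))); // true
    }
}
```

The usual fix follows the second half: normalize case yourself before indexing and before querying, or index the field with an analyzer that does it.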

Re: Using JBoss Cache as directory for Apache Lucene

2009-05-27 Thread Erick Erickson
Warning: I'm almost completely ignorant of JBoss Cache and GridGain. But it would be useful if you could tell us *why* you want to do this. If it's a question of speeding up Lucene queries, there are a number of things that you can do with Lucene itself that may be more appropriate, but without kno

Re: relevance function for scores

2009-05-27 Thread kenny kim
It seems to be a good solution. However, I think it may take some processing time to get the distribution of all matching documents before scoring each doc. Would you have a good idea for getting the distributions within a reasonable time? On 2009. 05. 26, at 8:15 PM, Joel Halbert wrote

Re: Apache Lucene Crawler search

2009-05-27 Thread Michael McCandless
Have a look at Apache droids? http://incubator.apache.org/droids/ Mike On Wed, May 27, 2009 at 5:37 AM, gnixinfosoft wrote: > > How to implement crawler search in Apache Lucene, >> >> I am currently using FAST search engine in my project, which uses crawler >> facility >> >> How to implemen

Apache Lucene Crawler search

2009-05-27 Thread gnixinfosoft
How to implement crawler search in Apache Lucene, > > I am currently using FAST search engine in my project, which uses crawler > facility > > How to implement this using Apache Lucene, I read somewhere that there is > no > direct functionality to this in Apache Lucene, but we can implement it > u

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
Thank you so much for your patience and support, but I am still not getting the correct result. Here is my code; can you please tell me what I have done wrong in it? (I don't want to use org.apache.search.hit so I have used terms in place of that) package highlighted; import java.io.FileWriter; i

Re: highlighting searched results in document

2009-05-27 Thread KK
@Ritu Wouter's reply must have fixed the problem, right? Or still stuck? --KK On Wed, May 27, 2009 at 1:46 PM, Wouter Heijke wrote: > Hi, > It sounds to me that you are highlighting the query string and not the > document. You will have to pass the document's content to > getBestFragments() and

Re: highlighting searched results in document

2009-05-27 Thread Wouter Heijke
Hi, It sounds to me that you are highlighting the query string and not the document. You will have to pass the document's content to getBestFragments() and it will work I think. Wouter > hi there, > I am using lucene highlighter to highlight the searched result > but it shows only the query s
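The difference between highlighting the query string and highlighting the document text can be sketched in plain Java. This is a stand-in for the idea only and does not use the Lucene Highlighter API; the class name, method name, and the choice of b tags are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in illustration: the text you pass in is the text you get back, with matches wrapped.
class HighlightSketch {

    // Wraps each whole-word, case-insensitive occurrence of term in text with pre/post tags.
    static String highlight(String text, String term, String pre, String post) {
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                                          Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(out, Matcher.quoteReplacement(pre + matcher.group() + post));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String query = "search";
        String document = "the text i want to search is this";

        // Passing the query as the text returns only the wrapped query word.
        System.out.println(highlight(query, query, "<b>", "</b>"));
        // Passing the document content returns the whole text with the match wrapped.
        System.out.println(highlight(document, query, "<b>", "</b>"));
    }
}
```

The first call reproduces the symptom from the thread (only the searched word comes back); the second shows why the stored document content has to be the text argument.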

Re: highlighting searched results in document

2009-05-27 Thread Ritu choudhary
I want to confirm the output of the statement below. What I get in "result" is just the word I am searching for (let's say the word is registered). How can I get the whole fragment in which the word is found and show the highlighted word in that fragment or document? String result = highlighte

Re: Searching index problems with tomcat

2009-05-27 Thread Marco Lazzara
* I see that you have reported the creation of 3 files, but does Luke recognize those files as an index and do you see the Documents you expect to see in this index?* Luke recognizes those files and I see those documents in this index but I observed that when I run the query Luke finds (for example