Yes, getBestFragments() returns up to "fragmentcount" matched fragments,
each separated by the "fragmentseparator".
What exactly do you mean by "highlight the searched word in the document"? What
is this document?
First, let us know exactly what you want to do with the search results.
--KK
On
The output I want is to show the text I searched for, let's say "search", in the
document where it occurred. Like this:
the text i want to search is this and it should be highlighted as this .
Search currently is giving just the word "search". The result i want is to
highlight the searched word in the d
What exactly is your requirement?
Displaying the final search results in a webpage, or something else?
The results that you are getting are correct. Now you have to decide what you
want to do with them.
I thought you were trying to show the results in a webpage.
--KK
On Thu, May 28, 2009 at 11:54 AM
Hi, Joel.
You are right. I've been trying to find a method to reduce search time
by filtering out docs before calculating it at the run-time. It is a
little bit different from yours.
But I think your approach might be helpful to improve search quality
of my search apps if it works enough sp
No, I am doing it in Eclipse Ganymede.
On 28/05/2009, KK wrote:
> Forgot:
> Are you trying all this from the command line? Because that's when you get the
> output as unprocessed HTML, those span tags; when you pass the same to
> display the content as a webpage they will be processed by the browser and
Forgot:
Are you trying all this from the command line? Because that's when you get the
output as unprocessed HTML, those span tags; when you pass the same to
display the content as a webpage, they will be processed by the browser and
you will see the colored matches.
--KK
On Thu, May 28, 2009 at 11:49
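To illustrate the point above about raw span tags versus a rendered page, here is a minimal sketch in plain Java (no Lucene involved). It assumes the formatter wrapped each match in a span with a CSS class; the class name "hl", the class HighlightPageSketch and the method toHtmlPage are hypothetical names chosen for this example, not anything from the thread.

```java
final class HighlightPageSketch {

    // Wraps an already-highlighted fragment in a small HTML page.
    // Assumption: matches were wrapped as <span class="hl">...</span>;
    // "hl" is a made-up class name for this sketch.
    static String toHtmlPage(String highlightedFragment) {
        return "<html><head><style>"
             + "span.hl { background-color: yellow; }"
             + "</style></head><body><p>"
             + highlightedFragment
             + "</p></body></html>";
    }

    public static void main(String[] args) {
        String fragment = "the text i want to <span class=\"hl\">search</span> is this";
        // On the console this prints the tags as plain text; saved as an
        // .html file and opened in a browser, the match shows up in yellow.
        System.out.println(toHtmlPage(fragment));
    }
}
```

The fragment is inserted unescaped on purpose, since it already contains the markup that the browser is supposed to interpret.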
Yes, that's the expected output.
Now put the full content [whatever the searcher returned] in the HTML page
along with the styling for it, and you will see the matches in yellow
[you chose yellow as the color for highlighting].
--KK
On Thu, May 28, 2009 at 11:42 AM, Ritu choudhary wrote:
> I h
I have added the lines you suggested and now it's giving the following
output; I still can't see what's wrong...
THE CHANGES I HAVE DONE:
SimpleHTMLFormatter formatter =
new SimpleHTMLFormatter("",
"");
Highlighter highlighter = new Highlighter(formatter, new QueryScorer(query
http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-in-a-lucene-index
tomm...@aim.com wrote:
Hi All,
I need to determine top words/phrases in my documents, and currently using the
ShingleAnalyzerWrapper for indexing.
Through Luke it seems the top terms
Yes, your code is wrong!
Where is the highlighter span/formatter? From your code, what I can
see is that you are just passing the QueryScorer; instead you
should pass both the QueryScorer and the formatter.
From my previous mail you can see the following code; mimic the same and
i
Well, I'm just playing with these and trying to create distributed
search engine which could store index in JBoss Cache (data grid) and
manage index with GridGain (compute grid) intelligently. Consider one
has really huge index(es) to search (tens or hundreds of gigabytes).
One idea is to distribut
Cool!
1. So you are creating a parser with { name, synonyms, propIn }, correct?
2. Sorry -- I meant the output of "query.toString()"; I'm expecting to see
something like this when the sentence parameter is set to philipcimiano:
name:philipcimiano synonyms:philipcimiano propIn:philipcimian
I've made a bad copy-paste. This is the full class
The output of philipcimiano is ex#pub1-author-ex#res2-name-philipcimiano
I've made a bad copy-paste. This is the full class
public class RDFinder {
private Analyzer analyzer;
private Directory directory;
private IndexSearcher isearc
Hi there.
Perhaps I'm misreading this, but you are not using the "Field"
parameter for query construction, are you? In other words, the
default field used to construct the QueryParser is what's being used
for your query, correct?
Could you post:
1. The code used to construct the QueryPa
public TreeMap> Search(String sentence, String
Field) throws ParseException, IOException{
query = parser.parse(sentence);
try
{
FileWriter fw = new FileWriter ("paths");
BufferedWriter bw = new BufferedWriter (fw);
outFile = new PrintWriter (bw);
Thanks.
Could you also post the code for RDFinder.Search() and the output
from query.toString() when text is "PHILIPCIMIANO"?
-h
On 27-May-2009, at 12:40 PM, Marco Lazzara wrote:
String[] fieldsearch = new String[] {"name", "synonyms", "propIn"};
RDFinder rdfind = new RDFinder("/home/m
String[] fieldsearch = new String[] {"name", "synonyms", "propIn"};
RDFinder rdfind = new RDFinder("/home/marco/testIndex",fieldsearch);
try {
this.paths = this.rdfind.Search(text, "path");
} catch (ParseException e1) {
e1.printStackTrace();
Okay -- that helps.
So we know that searching the same files with Luke works, but with
the web app does not. Can you please re-post the fragment of code
that opens your index and uses the query?
If you haven't already done this, could you also use query.toString()
to confirm the query?
Hi All,
I need to determine top words/phrases in my documents, and currently using the
ShingleAnalyzerWrapper for indexing.
Through Luke it seems the top terms are correct for the whole index.
Is it possible to determine the top terms for a subset of documents in the
index? Or do I need to cre
No. The app creates the index in a folder and I run the query in that
folder.
For example, if I decide to create the folder in /home/marco/testIndex,
obviously I run the query on /home/marco/testIndex;
if I decide to create the folder in /home/marco/RDFLUCENE, obviously I run
the query on /home/marc
Am I coding it wrongly? Please reply.
Okay -- if the problem is not the number of results, then let's
clarify the problem:
1. You create an index in something like:
/home/marco/testIndex
2. You copy over the directory to something like:
/home/marco/RDFIndexLucene
3. When you run Tomcat, your "searcher" tries to
In my app I obtain 3 results. But I think that is not a problem.
Marco Lazzara
2009/5/27 Erick Erickson
> StandardAnalyzer is fine. I loaded your index into Luke and there is
> exactly
> one document with philipcimiano in the name field.
> There is only one document that has researcher in the name fie
StandardAnalyzer is fine. I loaded your index into Luke and there is exactly
one document with philipcimiano in the name field.
There is only one document that has researcher in the name field.
Both of these documents (using StandardAnalyzer) return one
document (doc 12 for PHILIPCIMIANO and doc 4
Lucene is more like a search utility library than a full-blown search
engine like FAST. The Lucene subproject, Solr, is more comparable to
FAST, but Solr does not have a built-in crawler available either (though
it's easy enough to do basic crawls).
There are many open source crawlers you could
Not sure if this applies here, but that tends to happen when the
analyzer you use for indexing is different from the one used in Luke
or you're running into character set issues. Are you using the
StandardAnalyzer in both cases?
Also, could you post an example of the query you are trying?
I'm not certain, without testing it.
I think you and I may have slightly orthogonal needs. From what I gather
you are looking to speed up your search time (by filtering out
irrelevant results), whereas I am simply looking to increase the
relevancy of the results presented to the users when they gr
The most common issue with this kind of thing is that UN_TOKENIZED implies no
case folding. So if your case differs you won't get a match.
That aside, the very first thing I'd do is get a copy of Luke (google Lucene
Luke)
and examine the index to see if what's in your index is what you *think* is
i
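The case-folding point above can be shown without Lucene at all. The sketch below only emulates the idea that an untokenized value is stored verbatim while the query side may be lowercased by an analyzer; CaseFoldingSketch, exactMatch and fold are hypothetical names for this illustration, and nothing here is a Lucene call.

```java
import java.util.Locale;

final class CaseFoldingSketch {

    // An untokenized value behaves like a single term stored verbatim,
    // so a lookup is an exact, case-sensitive string comparison.
    static boolean exactMatch(String indexedTerm, String queryTerm) {
        return indexedTerm.equals(queryTerm);
    }

    // Case folding as a lowercasing analyzer would do it (emulated here).
    static String fold(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String indexed = "PHILIPCIMIANO"; // stored as-is, no folding at index time
        String query = "philipcimiano";   // lowercased on the query side

        System.out.println(exactMatch(indexed, query));             // false
        System.out.println(exactMatch(fold(indexed), fold(query))); // true
    }
}
```

Either fold the value yourself before indexing it untokenized, or make sure the query side leaves the case alone; the two sides just have to agree.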
Warning: I'm almost completely ignorant of JBoss Cache and GridGain.
But it would be useful if you could tell us *why* you want to do this.
If it's a question of speeding up Lucene queries, there are a number
of things that you can do with Lucene itself that may be more
appropriate, but without kno
It seems to be a good solution.
However, I think it may take some processing time to get the
distribution of all matching documents before scoring each doc.
Do you have a good idea for getting the distributions within a
reasonable time?
On 2009. 05. 26, at 8:15 PM, Joel Halbert wrote
Have a look at Apache droids?
http://incubator.apache.org/droids/
Mike
On Wed, May 27, 2009 at 5:37 AM, gnixinfosoft wrote:
>
> How to implement crawler search in Apache Lucene,
>>
>> I am currently using FAST search engine in my project, which uses crawler
>> facility
>>
>> How to implemen
How to implement crawler search in Apache Lucene,
>
> I am currently using FAST search engine in my project, which uses crawler
> facility
>
> How to implement this using Apache Lucene, I read somewhere that there is
> no
> direct functionality to this in Apache Lucene, but we can implement it
> u
Thank you so much for your patience and support, but I am still not
getting the correct result. Here is my code; can you please tell me
what I have done wrong in it? (I don't want to use
org.apache.search.hit so I have used terms in place of that)
package highlighted;
import java.io.FileWriter;
i
@Ritu
Wouter's reply must have fixed the problem, right? Or still stuck?
--KK
On Wed, May 27, 2009 at 1:46 PM, Wouter Heijke wrote:
> Hi,
> It sounds to me that you are highlighting the query string and not the
> document. You will have to pass the document's content to
> getBestFragments() and
Hi,
It sounds to me that you are highlighting the query string and not the
document. You will have to pass the document's content to
getBestFragments() and it will work I think.
Wouter
> hi there,
> I am using lucene highlighter to highlight the searched result
> but it shows only the query s
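A small illustration of the difference described above, in plain Java rather than with the Lucene Highlighter: if the text handed to the highlighting step is the query string, only the query word comes back; if it is the document's content, the word comes back marked inside its surrounding text. The class FragmentHighlightSketch, the method highlight and the <b> tags are assumptions made for this sketch; it does simple case-insensitive substring matching and does not pick "best fragments" the way Lucene does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FragmentHighlightSketch {

    // Wraps every case-insensitive occurrence of `term` in `text`
    // with the given pre/post tags and returns the whole text.
    static String highlight(String text, String term, String pre, String post) {
        Pattern p = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());              // text before the match
            out.append(pre).append(m.group()).append(post); // the marked match
            last = m.end();
        }
        out.append(text.substring(last));                   // remainder after last match
        return out.toString();
    }

    public static void main(String[] args) {
        String query = "search";
        String document = "the text i want to search is this";

        // Passing the query string as the text: only the word itself comes back.
        System.out.println(highlight(query, query, "<b>", "</b>"));
        // Passing the document content: the word is marked in context.
        System.out.println(highlight(document, query, "<b>", "</b>"));
    }
}
```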
I want to confirm the output of the statement below. What I get in
"result" is just the word I am searching for (let's say the word is
"registered"). How can I get the whole fragment in which the word is
found and show the highlighted word in that fragment or document?
String result =
highlighte
* I see that you have reported the creation of 3 files, but does Luke
recognize those files as an index and do you see the Documents you expect to
see in this index?*
Luke recognizes those files and I see those documents in this index but I
observed that when I run the query Luke finds (for example