Well, the approach you suggested is what we use now. We use regex pattern
matching to find the search term. However, because of this we cannot use
some of the more sophisticated queries that Lucene supports (boolean
queries, etc.). We certainly can use highlighting to find this information. But
h
Arsen,
I already mentioned it (see below) - LingPipe - http://alias-i.com .
Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search - Share
----- Original Message -----
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
To: java-user@lucene.
Erick,
Thanks for the advice. I will take a look at
PerFieldAnalyzerWrapper to see if I want to take this on. For my
case, I have to use mixed case for a couple of fields since case
really does matter for them (i.e. apple is not the same as Apple), and I
actually don't want users to find the d
Hi Mark,
Do you know of a good paid product that does this?
Thanks,
Arsen
----- Original Message -----
From: Mark Miller <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, May 2, 2007 7:52:36 AM
Subject: Re: Keyphrase Extraction
From what I know you generally have to pay if
On Monday 07 May 2007 06:19:47 makkhar wrote:
> Here's what is going wrong for me :
>
> I have 10 documents, each with 10 fields with "parameterName and
> parameterValue". Now, when I search for some term and get 5 hits, how do
> I find out which paramName-Value pair matched? A very simple probl
On 5/6/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
On 5/5/07, Daniel Einspanjer <[EMAIL PROTECTED]> wrote:
>
> The query syntax reference page talks about the NOT and the - operators,
> but it wasn't clear to me what exactly the difference is between them.
> Could someone tell me briefly wh
Hello everyone,
Whenever I search for a word in my web application, I search some default
fields. For example, when I search for the word "hello", I generate these
queries:
title:hello
headlines:hello
summary:hello
content:hello
I add these to a BooleanQuery (BooleanClause.Occur.SHOULD).
What I want to achieve
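As a rough sketch of the query generation described in this message, the per-field clauses can be built from a list of default field names. This uses plain strings in query-parser syntax rather than Lucene classes; the class and method names are hypothetical, and joining the clauses with spaces only behaves like SHOULD clauses if the query parser's default operator is OR (an assumption, since the message does not say how the parser is configured). The word is not escaped here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds "field:word" clauses
// for a fixed set of default fields.
public class DefaultFieldQueries {

    // One "field:word" clause per default field, in the given order.
    public static List<String> perField(List<String> fields, String word) {
        List<String> clauses = new ArrayList<>();
        for (String field : fields) {
            clauses.add(field + ":" + word);
        }
        return clauses;
    }

    // All clauses joined with spaces, i.e. optional clauses when the
    // parser's default operator is OR (assumed).
    public static String combined(List<String> fields, String word) {
        return String.join(" ", perField(fields, word));
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("title", "headlines", "summary", "content");
        System.out.println(combined(fields, "hello"));
    }
}
```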
Here's what is going wrong for me :
I have 10 documents, each with 10 fields with "parameterName and
parameterValue". Now, when I search for some term and get 5 hits, how do I
find out which paramName-Value pair matched? A very simple problem, but I
could find no information on the forum for t
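For reference, the regex workaround mentioned elsewhere in this thread could look roughly like the sketch below. It assumes the parameterName/parameterValue pairs of a hit have already been loaded into a map; the class and method names are hypothetical and not part of Lucene. It only approximates what the index matched, since it ignores the analyzer and cannot follow boolean or other complex queries.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: re-checks the stored values of
// one hit to guess which name/value pair contained the search term.
public class MatchedPairs {

    // Returns the pairs whose value contains the term as a whole word,
    // ignoring case, in the map's iteration order.
    public static Map<String, String> find(Map<String, String> pairs, String term) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b", Pattern.CASE_INSENSITIVE);
        Map<String, String> matched = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : pairs.entrySet()) {
            if (pattern.matcher(entry.getValue()).find()) {
                matched.put(entry.getKey(), entry.getValue());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Made-up example data for one hit.
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("color", "dark red");
        pairs.put("size", "large");
        System.out.println(find(pairs, "red"));
    }
}
```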
Moti,
I tried your test and it fails in the way you describe; however, I don't think
the test shows a bug.
Below is the javadoc comment for the package private class NearSpansOrdered.
Would that be sufficient documentation for the ordered case?
/** A Spans that is formed from the ordered subspa
Well, "falls between a certain range" is problematical. There's
nothing hard and fast about scoring. That is, scores between, say,
two different queries are not comparable.
But I really don't understand the question. You won't get
"unrelated stuff" in your result set as far as I know. Everything
See below...
On 5/5/07, Daniel Einspanjer <[EMAIL PROTECTED]> wrote:
The query syntax reference page talks about the NOT and the - operators, but
it wasn't clear to me what exactly the difference is between them. Could
someone tell me briefly what that difference might be or point me at some
f
Looking over the implementation of SpanNearQuery I came upon what looked
like a bug. Below is a test which fails due to it. SpanNearQuery doesn't
return all matching spans; once it's found a span it always increments the
span of the clause appearing first in that span (i.e. in the example below
the
How do you specify a cutoff on search results? If I sort the search
results by something other than relevancy, I don't want unrelated stuff
showing up at the top. Is there a way to set a cutoff, so that only results
whose scores fall within a certain range are displayed?
Thanks.
You could also try splitting the document into paragraphs and use Carrot2's
Lingo algorithm (www.carrot2.org) on a paragraph-level to extract clusters.
Labelling routine in Lingo should extract 'key' phrases; this analysis is
heavily frequency-based, but... you know, you may want to try it.
14 matches