Hi list,
let's take a simple example. A TokenFilter creates the terms "i" and "pod"
from the word "ipod".
This example is simple, and if all use cases for the self-made TokenFilter
were like this, I could do the whole thing on the index side. However, they
are not, and WordDelimiterFilter is not an option.
The p
On Wed, Jun 8, 2011 at 7:52 AM, Mohamed Yahya wrote:
> You're right. Still, I am not sure if there is a library that would
> take care of examples such as the one I gave.
>
which is why you might want to just pick one that is close to what you
want, and then customize/tune it with any stuff parti
Hello everybody,
I am just curious about the following case.
Currently, I create a boolean AND query which loads payloads.
In some cases, Lucene loads payloads but does not return hits.
Therefore, I assume that payloads are loaded directly with each doc ID from
the posting list before
On Wed, Jun 8, 2011 at 6:52 PM, Elmer wrote:
> the parsed query becomes:
>
> '+(title:the) +(title:project desc:project)'.
>
> So, the problem is that docs that have the term 'the' only appearing in
> their desc field are excluded from the results.
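To make the quoted parse concrete, here is a minimal plain-Java simulation of how '+(title:the) +(title:project desc:project)' evaluates. It uses no Lucene classes. The class name, the helper methods, and the assumption that only the desc field is analyzed with a stop filter containing 'the' are illustrative and not taken from the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Plain-Java simulation of the parsed query above; none of this is Lucene API.
class ParsedQuerySketch {

    // Assumption: only the desc field is analyzed with a stop filter.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the"));

    // Splits raw text on whitespace and optionally drops stop words.
    static Set<String> analyze(String text, boolean useStopFilter) {
        Set<String> terms = new HashSet<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (t.isEmpty()) continue;
            if (useStopFilter && STOP_WORDS.contains(t)) continue;
            terms.add(t);
        }
        return terms;
    }

    // A "document" is modelled as field name -> indexed terms.
    static Map<String, Set<String>> doc(String title, String desc) {
        Map<String, Set<String>> d = new HashMap<>();
        d.put("title", analyze(title, false));
        d.put("desc", analyze(desc, true));
        return d;
    }

    // Evaluates +(title:the) +(title:project desc:project).
    static boolean matches(Map<String, Set<String>> d) {
        boolean theClause = d.get("title").contains("the");
        boolean projectClause = d.get("title").contains("project")
                || d.get("desc").contains("project");
        return theClause && projectClause;
    }

    public static void main(String[] args) {
        // Both terms in title: matches.
        System.out.println(matches(doc("the project", "status notes")));
        // 'the' appears only in desc, where the stop filter drops it: excluded.
        System.out.println(matches(doc("status report", "the project plan")));
    }
}
```

The second document is the case described above: the required clause on 'the' can only be satisfied by the title field, so a document whose title lacks 'the' is dropped even though its desc text contains both words.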
Subclass MFQP and override getFieldQuery.
If th
Perhaps "least frequent substring" or even "suffix truncation" might be enough
for your needs.
Here is a related paper: http://web.jhu.edu/bin/q/b/p75-mcnamee.pdf
karl
On Jun 8, 2011, at 1:52 PM, Mohamed Yahya wrote:
> You're right. Still, I am not sure if there is a library that wo
I'm sure you are right and I'm wrong - sorry for the waste of space.
However I still think you should build it all up in code.
--
Ian.
On Wed, Jun 8, 2011 at 4:33 PM, Elmer wrote:
>> Using MFQP with AND
>> everywhere you'll never get a match if some fields don't contain all
>> of the search te
> "Using MFQP with AND
> everywhere you'll never get a match if some fields don't contain all
> of the search terms"
I'm sorry to say, but I don't think that's true. Look at how the query parser
parses the following query:
'information retrieval'
--parsed-to-->
+(title:inform description:inform authors.
Then surely the stop word issue is a red herring. Using MFQP with AND
everywhere you'll never get a match if some fields don't contain all
of the search terms.
Even if Erick's exact answer won't apply, I suspect that building up a
composite boolean query is the way to go.
--
Ian.
On Wed, Jun 8
Sorry, I made a mistake here:
> Unfortunately, the solution that Erick gave won't do the trick
> > > bq.add(qp.parse("title:(the AND project)"), SHOULD)
> > > bq.add(qp.parse("desc:(the AND project)"), SHOULD)
> This still won't match documents where both 'the' and 'project' appear
> in DIFFERENT
Thank you,
I already use the PerFieldAnalyzerWrapper (by Hibernate Search) ;)
And that's where the problem comes in: different fields using different
analyzers (some with, some without a stopfilter). For each term
(tokenized by MFQP itself?), it applies the given analyzer on each
field. If the ana
You're right, that's a better place to start
Erick
On Wed, Jun 8, 2011 at 9:42 AM, Ian Lea wrote:
> Except that I think he has loads of other fields and wants to keep it simple.
>
> But how about passing a PerFieldAnalyzerWrapper instance as the
> analyzer to MFQP? Worth a try.
>
>
> --
> I
Except that I think he has loads of other fields and wants to keep it simple.
But how about passing a PerFieldAnalyzerWrapper instance as the
analyzer to MFQP? Worth a try.
--
Ian.
On Wed, Jun 8, 2011 at 2:38 PM, Erick Erickson wrote:
> Could you just construct a BooleanQuery with the
> term
Could you just construct a BooleanQuery with the
terms against different fields instead of using MFQP?
e.g.
bq.add(qp.parse("title:(the AND project)"), SHOULD)
bq.add(qp.parse("desc:(the AND project)"), SHOULD)
etc...? If your QueryParser was created with a
PerFieldAnalyzerWrapper I think you mig
I guess the base problem is that MFQP only accepts one analyzer.
Presumably you are using different analyzers for your title and desc
fields, and it might do what you wanted if you could pass in a list of
analyzers along with a list of fields. Sounds like something that
might not be too hard to co
Glad you found it. I'd still recommend you get a copy of Luke, though,
it's invaluable.
Best
Erick
On Wed, Jun 8, 2011 at 8:49 AM, Pranav goyal wrote:
> Hi Erick,
>
> Thanks for the answer. Before using Luke I found where I was making a mistake,
> and I replied about it here.
>
> But thanks for the r
Hi Erick,
Thanks for the answer. Before using Luke I found where I was making a mistake,
and I replied about it here.
But thanks for the reply.
On Wed, Jun 8, 2011 at 6:14 PM, Erick Erickson wrote:
> hard to say. You should get a copy of Luke and inspect your index to
> see if what you
> think you put t
Hard to say. You should get a copy of Luke and inspect your index to
see if what you think you put there is actually there. When you added
data to your index, did you perform a commit?
Best
Erick
On Wed, Jun 8, 2011 at 2:45 AM, Pranav goyal wrote:
> There is one field DocId which I am storing as
You're right. Still, I am not sure if there is a library that would
take care of examples such as the one I gave.
On Wed, Jun 8, 2011 at 11:25, Lahiru Samarakoon wrote:
> Hi,
>
>>
>> Is there something in Lucene that supports lemmatization of the following
>> form:
>>
>> Mexican --> Mexico (from
Oh sorry,
I found my error and it works now.
Thanks
On Wed, Jun 8, 2011 at 3:57 PM, Pranav goyal wrote:
> import java.io.File;
> import java.io.IOException;
> import java.util.Collection;
> import java.util.Iterator;
> import java.util.List;
> import java.util.Map;
>
> import org.apache.lucene.analysi
import java.io.File;
import java.io.IOException;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
Hi,
>
> Is there something in Lucene that supports lemmatization of the following
> form:
>
> Mexican --> Mexico (from adjective to name/noun)
>
Lemmatization does not change the part of speech. I think you are looking for a
stemming algorithm.
http://nlp.stanford.edu/IR-book/html/htmledition/stemmi
Hi,
Is there something in Lucene that supports lemmatization of the following form:
Mexican --> Mexico (from adjective to name/noun)
Thanks
Mohamed Yahya
Hi,
I have a use case in which I use the MultiFieldQueryParser (MFQP) on
some fields that use a stop filter and some fields that don't. The
default operator of the MFQP is set to AND.
For example, if the search query is 'the project' (with 'the' included
in the stoplist) and the search fields a