Since Lucene 2.9 has per-segment searching/caching, does query performance
degrade less than it did before 2.9 as more segments are added to the index?
Bill
Hi,
Why do I get an error like this in Lucene, and how do I resolve it?
Could not find file 'C:\Indexes_z3_1.del'.
Thanks.
--
View this message in context:
http://www.nabble.com/Resolving-Lucene-Index-error-tp26002849p26002849.html
Sent from the Lucene - Java Users mailing list archive at Nabble.c
Hi,
How do I make sure Lucene gives me back relevant search results when my
input string contains terms like c++? Lucene seems to ignore the ++ characters.
Thanks
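One possible workaround, sketched in plain Java rather than against the
Lucene API: rewrite terms such as c++ to a placeholder before the text
reaches the analyzer, and apply the same rewrite at both index and query
time. The class name, the placeholder words and the toy tokenizer below are
all assumptions made for illustration; the tokenizer only imitates an
analyzer that drops punctuation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
class SpecialTermSketch {

    // Terms to protect and the placeholder each one is rewritten to
    // (illustrative values only).
    static final Map<String, String> PROTECTED = new LinkedHashMap<>();
    static {
        PROTECTED.put("c++", "cplusplus");
        PROTECTED.put("c#", "csharp");
    }

    // Lower-cases the text and replaces every protected term with its placeholder.
    static String protect(String text) {
        String result = text.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> e : PROTECTED.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    // Toy stand-in for an analyzer that discards punctuation: split on
    // anything that is not a lower-case ASCII letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Without protection the "++" is lost during tokenization.
        System.out.println(tokenize("C++ developer"));
        // With protection the term survives as a single placeholder token.
        System.out.println(tokenize(protect("C++ developer")));
    }
}
```

If the query string goes through QueryParser, keep in mind that + is also
query syntax there, so the raw input would probably need to be rewritten or
escaped before it is parsed.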
Hi,
On Mon, Oct 19, 2009 at 10:43 PM, Marcelo Ochoa wrote:
> This is a similar approach to Lucene Domain Index:
> http://docs.google.com/Doc?id=ddgw7sjp_54fgj9kg
> But Lucene Domain Index is a specific implementation for Oracle
> Databases 10g/11g which is integrated through the ODCI API and
There is some information on this topic in the package summary:
http://lucene.apache.org/java/2_9_0/api/contrib-analyzers/org/apache/lucene/analysis/compound/package-summary.html
In short, for a large list (there is no limit in the code), you will want to
make use of a hyphenation grammar as well:
Hy
Great,
now the next question: which dictionary do you guys use? How big
can it be?
Is 5 words acceptable?
paul
On 21 Oct 09 at 21:23, Robert Muir wrote:
Paul, I think in general scoring should take care of this too; it's all about
your dictionary, same as the previous example.
Hi people,
I have an application in which the users are allowed to make changes
to the database, changes visible only to that user. I.e. they don't
modify the original data, they create a clone of the original. When
the user requests the instance I retrieve the modified clone rather
than t
Paul, I think in general scoring should take care of this too; it's all about
your dictionary, same as the previous example.
This is because überwachungsgesetz matches 3 tokens: überwachungsgesetz,
überwachung, gesetz
but überwachung gesetz only matches 2.
überwachungsgesetz
0.37040412 = (MATCH) su
Just add them to the dictionary; the compound filter will do this
automatically.
If you want to tweak it even further, you can also tell the compound filter
NOT to emit the subwords if they form a bigger compound, using the
onlyLongestMatch parameter I spoke of earlier.
I haven't played with this option much
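A minimal sketch of the idea in plain Java, not the Lucene compound filter
itself: the full word is kept, and every dictionary word found inside it is
emitted as an extra token. The class name, the method signature and the
sample word Rindfleischgesetz are made up for illustration, and treating
onlyLongestMatch as "keep only the longest dictionary hit per start offset"
is an assumption about how the real parameter behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of dictionary-based decompounding.
class CompoundSketch {

    // Returns the lower-cased word followed by the dictionary words found
    // inside it, scanning start offsets from left to right.
    static List<String> decompose(String word, Set<String> dictionary,
                                  int minSubwordSize, boolean onlyLongestMatch) {
        String lower = word.toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        tokens.add(lower); // the full compound is always kept
        for (int start = 0; start + minSubwordSize <= lower.length(); start++) {
            String longest = null;
            for (int end = start + minSubwordSize; end <= lower.length(); end++) {
                String candidate = lower.substring(start, end);
                if (!dictionary.contains(candidate)) {
                    continue;
                }
                if (onlyLongestMatch) {
                    longest = candidate; // later hits at this offset are longer
                } else {
                    tokens.add(candidate);
                }
            }
            if (longest != null) {
                tokens.add(longest);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(
                Arrays.asList("rind", "fleisch", "rindfleisch", "gesetz"));
        // All subwords: rind and rindfleisch are both emitted.
        System.out.println(decompose("Rindfleischgesetz", dictionary, 2, false));
        // Longest match per offset: rind is suppressed in favour of rindfleisch.
        System.out.println(decompose("Rindfleischgesetz", dictionary, 2, true));
    }
}
```

In this sketch the subwords would be the tokens placed at the same position
as the original word, so a query for the full compound can match more tokens
than a query for its parts.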
Can the dictionary have weights?
überwachungsgesetz alone probably needs a higher rank than überwachung
and gesetz, doesn't it?
paul
On 21 Oct 09 at 21:09, Benjamin Douglas wrote:
OK, that makes sense. So I just need to add all of the sub-compounds
that are real words at posIncr=0, even if th
OK, that makes sense. So I just need to add all of the sub-compounds that are
real words at posIncr=0, even if they are combinations of other sub-compounds.
Thanks!
-Original Message-
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Wednesday, October 21, 2009 11:49 AM
To: java-user@lu
yes, your dictionary :)
if überwachungsgesetz is a real word, add it to your dictionary.
for example, if your dictionary is { "Rind", "Fleisch", "Draht", "Schere",
"Gesetz", "Aufgabe", "Überwachung" }, and you index
Rindfleischüberwachungsgesetz, then all 3 queries will have the same score.
but i
Thanks for all of the answers so far!
Paul's question is similar to another aspect I am curious about:
Given the way the sample word is analyzed, is there anything in the scoring
mechanism that would rank "überwachungsgesetz" higher than "gesetzüberwachung"
or "fleischgesetz"?
-Original Me
Hi, thanks again for reporting this.
I created an issue here: http://issues.apache.org/jira/browse/LUCENE-2001
On Wed, Oct 21, 2009 at 2:05 AM, parag dave wrote:
> While using the Lucene WordNet package, we found that the Syns2Index
> program
> indexes the Synsets wrongly. For example, looking u
If I recall correctly the highlighter also has an analyzer passed to
it. Ensure that this is the same one as well.
Matt
m.harig wrote:
Thanks, Erick.
It works fine if I use the same analyzer (the code snippet found on Nabble)
for both indexing and querying.
But the highlighter has gone f
Thanks, Erick.
It works fine if I use the same analyzer (the code snippet found on Nabble)
for both indexing and querying.
But the highlighter has stopped working for plural words. I guess I need to
search more; I'll come back to you if I can't figure it out. Thanks again, Erick.
Thanks, Erick.
A little more information would help here.
1> Are you using the same analyzer at both index and query time?
No, sorry. I am using StandardAnalyzer at index time; during querying I am
using the code snippet found on Nabble.
2> Assuming <1> is "yes", did you re-index your data
A little more information would help here.
1> Are you using the same analyzer at both index and query time?
2> Assuming <1> is "yes", did you re-index your data after you created this
analyzer?
3> What are the results of query.toString()? Looking at that might help you
pinpoint what's going on.
4> H
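To illustrate why question 1 matters, here is a toy sketch in plain Java
(no Lucene classes involved). The two "analyzers" are hypothetical
stand-ins: one only lower-cases and splits, the other also strips a
trailing "s" as a very naive plural rule. A term only matches when the
index-time and query-time analysis produce the same token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of an index-time/query-time analyzer mismatch.
class AnalyzerMismatchSketch {

    // Stand-in for a plain analyzer: lower-case and split on non-alphanumerics.
    static List<String> plain(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stand-in for a stemming analyzer: same as plain(), then drop a
    // trailing "s" from tokens longer than three characters.
    static List<String> stemming(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : plain(text)) {
            boolean plural = t.length() > 3 && t.endsWith("s");
            tokens.add(plural ? t.substring(0, t.length() - 1) : t);
        }
        return tokens;
    }

    // A query matches when every query token is among the indexed tokens.
    static boolean matches(List<String> indexedTokens, List<String> queryTokens) {
        return indexedTokens.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String document = "Books about Lucene";
        // Indexed without stemming, queried with stemming: "book" is not in the index.
        System.out.println(matches(plain(document), stemming("books")));
        // Same analysis on both sides: "book" matches "book".
        System.out.println(matches(stemming(document), stemming("books")));
    }
}
```

This is also why re-indexing is needed after switching analyzers: tokens
already in the index were produced by the old analysis.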
Paul, there are two implementations in compounds, one is dictionary-based,
the other is hyphenation-grammar + dictionary (it restricts the
decompounding based on hyphenation rules). You could also subclass the
compound base class and implement your own.
I haven't seen any user-measures (relevance,
thanks, this sounds like a bug, I'll play with this today.
On Wed, Oct 21, 2009 at 2:05 AM, parag dave wrote:
> While using the Lucene WordNet package, we found that the Syns2Index
> program
> indexes the Synsets wrongly. For example, looking up the synsets for the
> word "king", we get:
>
> java
Hello all,
I have a question about plural and singular word searching. I got this code
snippet from the Nabble forum:
private static Analyzer createEnglishAnalyzer() {
    return new Analyzer() {
        public TokenStream tokenStream(String fieldName, Reader reader) {
            TokenStream result =
Make sure you are not closing the IndexSearcher while still using a
Hits object obtained from it in the past. Hits goes back and re-runs
the search if you iterate deep enough...
Mike
On Wed, Oct 21, 2009 at 5:39 AM, Ian Lea wrote:
> 1.4.3? How old is that? Maybe time to consider an upgrade.
>
1.4.3? How old is that? Maybe time to consider an upgrade.
Anyway, if you're getting that exception when creating a searcher I
guess you are using a constructor that takes an IndexReader and a
further guess would be that something has closed it.
--
Ian.
On Tue, Oct 20, 2009 at 6:41 PM, Zhang,
I'm interested in this analyzer. It had escaped me, and it solves an old
problem!
Could you report on its usage:
- did you have to feed words in a dictionary?
- does anyone have user-measures already?
... and the last question for the research fun: is there any approach
towards preferring Üb