Using term vectors means passing over the terms too many times, i.e.
- loop on terms
- - loop on docs of a term
- - - loop on terms of a doc
Would something like this be better:
do {
    System.out.println(tenum.term() + " appears in " + tenum.docFreq() + " docs!");
    TermDocs td = reader.termDo
: this thread that Hoss's solution was perfect and I indeed was able to add a
: new dynamically changeable Term frequency relevance scoring system. The
Cool ... "PINE is my IDE."
-Hoss
-
To unsubscribe, e-mail: [EMAIL PROTE
Hi,
How about this:
1) You copy the files that make your index in a new folder
2) You update your index in that new folder (forcing if necessary, old locks
will not be valid)
3) When update is completed, close your readers, and open them on the new
index.
4) Copy the fresh index files to the pre
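Step 1 above (copying the index files into a new folder) could be sketched in plain Java roughly as follows. This is an illustration only: `IndexCopy` and `copyDirectory` are hypothetical names, only JDK file APIs are used, and it assumes nothing is writing to the source folder while the copy runs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

/** Hypothetical helper: copies an index folder so it can be updated offline. */
final class IndexCopy {
    private IndexCopy() {}

    /** Copies every file and subdirectory under source into target. */
    static void copyDirectory(Path source, Path target) {
        // Files.walk visits a directory before its contents, so each
        // destination directory exists before files are copied into it.
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(path -> {
                Path destination = target.resolve(source.relativize(path).toString());
                try {
                    if (Files.isDirectory(path)) {
                        Files.createDirectories(destination);
                    } else {
                        Files.copy(path, destination, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The update in step 2 would then run against the copy, leaving the live folder untouched until the readers are reopened.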
OOM errors are not uncommon during redeployment on an application server,
e.g. a servlet container. Redeploying on Tomcat very often causes an
OOM because the perm gen space is not garbage-collected (that should go
away with 5.5). JBoss can usually deal with these issues, but just in
case you could check you
Are you using 2.1-dev version of Lucene? Try the latest nightly build, it has a
fix for a certain OOM bug (see LUCENE-754).
Otis
- Original Message
From: Van Nguyen <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, December 20, 2006 6:39:58 PM
Subject: JAVA JVM Questio
I have an index that's approximately 875MB. I'm using JBoss Application
Server 4.04 w/ Apache HTTP Server 2.2. My min/max JVM size is:
128MB/512MB. On initial startup, everything works fine. I'm able to
search (although it takes a while doing the first search because it's
loading the index into
To populate FieldCache, the number of matches doesn't matter. There is no need
to be stingy there - you don't really save anything by running a query that
matches only a few docs. Just run something that looks like a common query.
For warming up new indices, one can also use the `dd' trick und
One question about this, Otis... When "warming up" the new searcher,
should the query return a lot of results, or does it matter? Can I just
do like an ID = X query and get one document back? Is that sufficient
or is it better to run a query that will get lots of hits?
Thanks again,
Bryan
-
Sounds like a possibility Otis, I know we are indeed using a sort other
than the default. I'll try out your suggestion. Thanks!
Bryan
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 20, 2006 3:28 PM
To: java-user@lucene.apache.org
Subject: Re
All sounds good. Opening a new IndexReader can take a bit of time. If you use
sorting of any kind other than default sorting by relevance, this delay on the
first search is also probably caused by the lazy FieldCache population. The
cure for that is to open a new IndexReader/Searcher before y
Mr. Hostetter, you are a Godsend. Just wanted to report to anyone following
this thread that Hoss's solution was perfect and I indeed was able to add a
new dynamically changeable Term frequency relevance scoring system. The
value of such a thing may not be high, but man do I love Lucene for making
I'm investigating some performance issues with the way we're using
Lucene in our web app and am interested if anyone could shed some light
on what might be going on. Hopefully I can provide enough information,
please let me know if there's more I can give.
We're using Lucene 2.0.0 and I'm curr
I figured it out. Gopi asked me some questions that got me searching and it
turns out my JVM wasn't 1.5.06, it was 1.4.2. I grabbed the newest version
and made it the default JVM and now I no longer have the problem.
Thanks a bunch for your help Gopi.
JT
JT Kimbell wrote:
>
> I've sent the
It's definitely my understanding that this is not possible. Maybe somebody
can give you a hardcore way of doing it by subclassing one of the classes
involved in indexing, but I'm too green for that :)
One solution that may or may not work depending on how specific you want to
get is that you can
Yes, I want to boost at indexing time.
But I want to boost terms instead of fields. I want to give
different weights to different terms even if both terms are in the
same field.
For example,
doc A contains field1 : term1 (weight C) field1 : term2 (weight F)
I want to give diffe
Why not switch where the searchers look rather than copy the index and
restart? That is, your searcher is pointing at index1, and you build the new
one in a new dir (index2). On some signal, your server closes the searcher
pointing to index1 and opens one pointing to index2 and uses that until
t
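The switch described above could look roughly like the following. This is only a sketch in plain Java: `SearcherHolder`, `current()`, and `switchTo()` are hypothetical names, and the generic `Closeable` stands in for whatever searcher object the application actually uses, so no Lucene API is assumed.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder that lets a server swap the active searcher in one step. */
final class SearcherHolder<S extends Closeable> {
    private final AtomicReference<S> active;

    SearcherHolder(S initial) {
        this.active = new AtomicReference<>(initial);
    }

    /** Returns the searcher that new requests should use. */
    S current() {
        return active.get();
    }

    /** Publishes the replacement first, then closes the searcher it replaced. */
    void switchTo(S replacement) {
        S previous = active.getAndSet(replacement);
        if (previous != null && previous != replacement) {
            try {
                previous.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Note that this sketch closes the old searcher immediately. If searches may still be running against it, a real server would probably need to delay the close (for example with reference counting) until those requests have finished.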
On Wednesday 20 December 2006 17:32, Martin Braun wrote:
> so a doc from 1973 should get a boost of 1.1973 and a doc of 1975 should
> get a boost of 1.1975 .
The boost is stored with a limited resolution. Try boosting one doc by 10,
the other one by 20 or something like that.
Regards
Daniel
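To make the point about resolution concrete, here is a small sketch in plain Java that compares the "1." + year boost with boosts spread over a wider range, as suggested above. `YearBoost`, `decimalBoost`, and `spreadBoost` are hypothetical names, and the year and boost ranges are arbitrary assumptions. Whether neighbouring years stay distinguishable after the boost is stored still depends on how coarsely Lucene encodes it.

```java
/** Hypothetical helper comparing two ways of turning a year into a boost. */
final class YearBoost {
    private YearBoost() {}

    /** The approach from the question: 1973 becomes 1.1973. */
    static float decimalBoost(int year) {
        return Float.parseFloat("1." + year);
    }

    /** Spreads years linearly over [minBoost, maxBoost]; out-of-range years are clamped. */
    static float spreadBoost(int year, int minYear, int maxYear,
                             float minBoost, float maxBoost) {
        if (maxYear <= minYear) {
            throw new IllegalArgumentException("maxYear must be greater than minYear");
        }
        int clamped = Math.max(minYear, Math.min(maxYear, year));
        float fraction = (clamped - minYear) / (float) (maxYear - minYear);
        return minBoost + fraction * (maxBoost - minBoost);
    }

    public static void main(String[] args) {
        // Difference between 1975 and 1973 with the "1." + year scheme (very small).
        System.out.println(decimalBoost(1975) - decimalBoost(1973));
        // Difference between the same two years when 1970..1980 is spread over 10..20.
        System.out.println(spreadBoost(1975, 1970, 1980, 10f, 20f)
                - spreadBoost(1973, 1970, 1980, 10f, 20f));
    }
}
```

The first difference is a few ten-thousandths, while the second is on the order of whole units, which is the kind of gap the reply recommends trying.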
-
Note: I have changed the title of this thread to match its content
I am currently facing a similar issue. I am dealing with a large index
that is constantly used and needs to be updated on a daily basis. For
fear of corruption I would rather rebuild the index each time,
performing tests against
Hello all,
I am trying to boost more recent docs, i.e. docs with a greater year
value, like this:
if (title.getEJ() != null) {
    titleDocument.setBoost(new Float("1." + title.getEJ()));
}
so a doc from 1973 should get a boost of 1.1973 and a do
> Please try using the MultiFieldQueryParser's constructor, not the static
> method. I think that might fix your problem.
Yes, after I created a new MultiFieldQueryParser and called its parse(String
query) method, my search executed as expected.
Thanks for your help!
Scott
>> BooleanClause.O
My first question is how many documents would you be deleting on a pass for
option 2? If it's 10 documents out of 10,000, I'd consider just deleting
them and re-adding (see IndexModifier).
Personally, if possible, I prefer your first option, building a completely
new index and switching between th
Hello Gentlemen (+Ladies?),
I'm integrating Lucene into a Spring web-app, and have found a plethora of
great web + print resources to make the integration quick and seamless. One
thing that I have been hard-pressed to find is a good solution for rebuilding
the index on a regular basis.
I'm
I've sent the code your way. I'm downloading eclipse right now so I can step
through with its debugger once I get it all set up.
However, I don't think I am using the same index for each of them, as this
is all actually on 3 different machines. Machine A has 1.4.3 and I wrote
that code on tha
I don't think you want to do this at index time, but rather search time.
Quoting from Hoss (?)...
Index time field boosts are a way to express things like "this document's
title is worth twice as much as the title of most documents". Query time
boosts are a way to express "I care about matches on
All I can suspect is that perhaps you are trying to add documents to an index
that was originally created using Lucene 1.4.3.
If trying to create a fresh index doesn't work, you could send me your
indexer code so I can take a look.
-Gopi
On 12/19/06, JT Kimbell <[EMAIL PROTECTED]> wrote:
Hi,
Hi,
I am trying to figure out how to give different weights to different
terms in the same document.
Does anybody know how to do this?
For example,
doc A contains field1 : term1 (weight C) field1 : term2 (weight F)
If I use the setBoost(float) method of the Field object, I cannot give
differ
:
: problem remains that I would like to be able to switch between the hits
: per doc Similarity and the default Similarity on any given search. I
: was hoping that I could index with DefaultSimilarity and store the norms
: for normal relevancy searching. Then I would need to ignore or make
: cons