Karl Koch wrote:
Are there any other papers that regard the combination of coordination level matching and TFxIDF as advantageous?
We independently developed coordination-level matching combined with
TFxIDF when I worked at Apple. This is documented in:
http://www.informatik.uni-trier.de/~
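For readers curious what the combination looks like, here is a minimal, self-contained sketch (illustrative names and formulas, not the Apple or Lucene implementation): the coordination factor scales a classic TFxIDF sum by the fraction of query terms the document matches, so documents matching more of the query are preferred.

```java
import java.util.List;
import java.util.Map;

// Sketch of coordination-level matching combined with TFxIDF scoring.
// Formulas are the textbook versions: tf weight = sqrt(freq),
// idf = log(N / df); the coord factor is overlap / |query|.
public class CoordTfIdf {
    public static double score(Map<String, Integer> docTermFreqs,
                               Map<String, Integer> docFreqs,
                               int numDocs,
                               List<String> queryTerms) {
        double sum = 0.0;
        int overlap = 0;
        for (String t : queryTerms) {
            Integer freq = docTermFreqs.get(t);
            if (freq == null) continue;  // term absent from this document
            overlap++;
            double tf = Math.sqrt(freq);
            double idf = Math.log((double) numDocs / docFreqs.getOrDefault(t, 1));
            sum += tf * idf;
        }
        // Coordination factor: fraction of query terms the document matches.
        double coord = (double) overlap / queryTerms.size();
        return coord * sum;
    }
}
```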
Steven Rowe wrote:
"2.1" is much more likely to be the label used for the next release than
"2.0.1".
The roadmap in Jira shows 21 issues scheduled for 2.0.1. If there is in
fact no intent to merge these into the 2.0 branch, these should probably
be retargeted for 2.1.0, and the 2.0.1 versio
Marcelo Ochoa wrote:
Then I'll move the code outside the lucene-2.0 code tree to be
packed as subdirectory of the contrib area, for example.
Another alternative is to make a small zip file and send it to the
list as an attachment, as a preliminary (alpha-alpha version ;)
This sounds like great potenti
Erick Erickson wrote:
Something like
Document doc = new Document();
doc.add(new Field("flag1", "Y", Field.Store.NO, Field.Index.UN_TOKENIZED));
doc.add(new Field("flag2", "Y", Field.Store.NO, Field.Index.UN_TOKENIZED));
writer.addDocument(doc);
Fields have overheads. It would be more efficient to implement this as
a single field with a different value for each boolean flag (as others
have suggested
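One way to picture the single-field approach (a hypothetical encoding, not spelled out in the thread): emit one token per set flag into a single "flags" field, then search for the flag name as an ordinary term instead of querying flag1:Y.

```java
import java.util.Map;

// Sketch: collapse many boolean fields into one whitespace-tokenized field.
// Instead of flag1=Y, flag2=Y as separate fields, index flags="flag1 flag2"
// and query for the flag name as a term. Illustrative encoding only.
public class FlagField {
    public static String encode(Map<String, Boolean> flags) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Boolean> e : flags.entrySet()) {
            if (e.getValue()) {               // only set flags produce a token
                if (sb.length() > 0) sb.append(' ');
                sb.append(e.getKey());        // the token is the flag name
            }
        }
        return sb.toString();
    }
}
```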
Michael J. Prichard wrote:
I get this output:
Tue Aug 01 21:15:45 EDT 2006
That's August 2, 2006 at 01:15:45 GMT.
20060802
Huh?! Should it be:
20060801
DateTools uses GMT.
Doug
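The timezone arithmetic can be checked with the standard library alone; DateTools itself is not needed for the demonstration. 21:15 EDT on August 1 is 01:15 GMT on August 2, which is why the day stamp comes out as 20060802.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Demonstrates why "Tue Aug 01 21:15:45 EDT 2006" formats as 20060802:
// the instant is 2006-08-02 01:15:45 GMT, and DateTools formats in GMT.
public class GmtDemo {
    public static String toDay(Date d, String tz) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone(tz));
        return fmt.format(d);
    }
}
```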
-
To unsubscribe, e-mail: [EMAIL
Rob Staveley (Tom) wrote:
Is there a tool I can use to see how much of the index is occupied by the
different fields I am indexing?
Note that IndexReader has a main() that will list the contents of
compound index files.
Doug
--
Marcus Falck wrote:
There is however one LARGE problem that we have run into. All search results should be displayed sorted with the newest document at top. We tried to accomplish this using Lucene's sort capabilities but quickly ran into large performance bottlenecks. So I figured since the default
Tom Emerson wrote:
Thanks for the clarification. What then is the difference between a
MultiSearcher and using an IndexSearcher on a MultiReader?
The results should be identical. A MultiSearcher permits use of
ParallelMultiSearcher and RemoteSearchable, for parallel and/or
distributed operat
hu andy wrote:
Hi, I have an application that needs to mark retrieved documents which have
been read, so that next time I needn't read the marked documents again.
You could mark the documents as deleted, then later clear deletions. So
long as you don't close the IndexReader, the deletions wil
Sunil Kumar PK wrote:
I want to know whether there is any way to merge the weight
calculation of index 1 and its search into a single RPC instead of doing
both functions in separate steps.
To score correctly, weights from all indexes must be created before any
can be searched. This
Is this markedly faster than using an MMapDirectory? Copying all this
data into the Java heap (as RAMDirectory does) puts a tremendous burden
on the garbage collector. MMapDirectory should be nearly as fast, but
keeps the index out of the Java heap.
Doug
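A sketch of the mechanism MMapDirectory relies on, using plain NIO (an assumed simplification; the real class does much more): the OS maps the file into the process address space, so reads go through the page cache instead of copying the whole index onto the Java heap as RAMDirectory does.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Memory-mapped read: the buffer is backed by the OS page cache,
// not by the Java heap, so the garbage collector never sees the data.
public class MmapRead {
    public static byte readAt(Path file, long offset) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get((int) offset);   // absolute get; no heap copy of the file
        }
    }
}
```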
z shalev wrote:
I've rewritten
karl wettin wrote:
Do I have to worry about passing a null Directory to the default
constructor?
A null Directory should not cause you problems.
Doug
karl wettin wrote:
I would like to store all in my application rather than using the
Lucene persistency mechanism for tokens. I only want the search
mechanism. I do not need the IndexReader and IndexWriter as that will
be a natural part of my application. I only want to use the Searchable.
Peter Keegan wrote:
Oops. I meant to say: Does this mean that an IndexSearcher constructed from
a MultiReader doesn't merge the search results and sort the results as if
there was only one index?
It doesn't have to, since a MultiReader *is* a single index.
A quick test indicates that it does
Dmitry Goldenberg wrote:
For an enterprise-level application, Lucene appears too file-system-centric
and too byte-sequence-centric a technology. Just my opinion. The Directory
API is just too low-level.
There are good reasons why Lucene is not built on top of a RDBMS. An
inverted index is not effi
Dan Armbrust wrote:
My indexing process works as follows (and some of this is a hold-over from
the time before Lucene had a compound file format - so bear with me)
I open up a File based index - using a merge factor of 90, and in my
current test, the compound index format. When I have added 100
I talked about this a bit in a presentation at Haifa last year:
http://www.haifa.ibm.com/Workshops/ir2005/papers/DougCutting-Haifa05.pdf
See the section on "Seek versus Transfer".
Doug
Prasenjit Mukherjee wrote:
It seems to me that Lucene doesn't use a B-tree for its indexing storage.
Any paper
thomasg wrote:
Hi, we are currently intending to implement a document storage / search tool
using Jackrabbit and Lucene. We have been approached by a commercial search
and indexing organisation called ISYS who are suggesting the following
problems with using Lucene. We do have a requirement to st
Igor Bolotin wrote:
Does it make sense to change TermInfosWriter.FORMAT in the patch?
Yes. This should be updated for any change to the format of the file,
and this certainly constitutes a format change. This discussion should
move to [EMAIL PROTECTED]
Doug
--
Igor Bolotin wrote:
If somebody is interested - I can post our changes in TermInfosWriter and
SegmentTermEnum code, although they are pretty trivial.
Please submit this as a patch attached to a bug report.
I contemplated making this change to Lucene myself, when writing Nutch's
FsDirectory, b
Vincent Le Maout wrote:
Am I missing something? Is it intended or is it a bug?
Looks like a bug. Can you please submit a bug report, and, ideally,
attach a patch?
Thanks,
Doug
Vincent Le Maout wrote:
Am I missing something? Is it intended or is it a bug?
Looks like a bug. Can you submit a patch?
Doug
Dai, Chunhe wrote:
Does anyone know whether Lucene plans to support NFS in later
release(2.0)? We are planning to integrate Lucene into our products and
cluster support is definitely needed. We want to check whether NFS
support is in the plan or not before implementing a new file locking
ourselve
Olivier Jaquemet wrote:
IndexReader.unlock(indexDir); // unlock directory in case of improper
shutdown
This should be used very carefully. In particular, you should only call
it when you are certain that no other applications are accessing the index.
Doug
---
The Hits-based search API is optimized for returning earlier hits. If
you want the lowest-scoring matches, then you could reverse-sort the
hits, so that these are returned first. Or you could use the
TopDocs-based API to retrieve hits up to your "toHits". (Hits-based
search is implemented us
Michael Wechner wrote:
Maybe it would make sense to sort it alphabetically [ ... ]
+1 This should be sorted alphabetically by business name or last name.
That's what it says on the page, although a few entries are out of
place. Please feel free to fix this.
Doug
-
Peter Keegan wrote:
I did some additional testing with Chris's patch and mine (based on Doug's
note) vs. no patch and found that all 3 produced the same throughput - about
330 qps - over a longer period.
Was CPU utilization 100%? If not, where do you think the bottleneck now
is? Network? Or
Are you changing the default mergeFactor or other settings? If so, how?
Large mergeFactors are generally a bad idea: they don't make things
faster in the long run and they chew up file handles.
Are all searches reusing a single IndexReader? They should. This is
the other most common reason
Erick Erickson wrote:
Could you point me to any explanation of *why* range queries expand this
way?
It's just what they do. They were contributed a long time ago, before
things like RangeFilter or ConstantScoreRangeQuery were written. The
latter are relatively recent additions to Lucene and
And it seems like performance is basically the same, if not better!
If anyone is interested, let me know.
Doug Cutting <[EMAIL PROTECTED]> wrote:
RAMDirectory is indeed currently limited to 2GB. This would not be too
hard to fix. Please file a bug report. Better yet, attach a patch.
Dawid Weiss wrote:
I get the concept implemented in PhraseQuery, but isn't calling it an
edit distance a little bit far-fetched?
Yes, it should probably be called "edit-distance-like" or something.
Only the marginal elements
(minimum and maximum distance from their respective query positions)
RAMDirectory is indeed currently limited to 2GB. This would not be too
hard to fix. Please file a bug report. Better yet, attach a patch.
I assume you're running a 64bit JVM. If so, then MMapDirectory might
also work well for you.
Doug
z shalev wrote:
this is in continuation of a pr
Peter Keegan wrote:
I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
(16/32 CPUs hyperthreaded) on Linux. Here are the results:
8-cpu: 275 qps
16-cpu: 305 qps
(the dual-core Opteron servers are still faster)
Here is the stack trace of 8 of the 16 query threads during the
WATHELET Thomas wrote:
I've created an index with Lucene version 1.9, and when I try to open
this index I always get this error message:
java.lang.ArrayIndexOutOfBoundsException.
If I use an index built with Lucene version 1.4.3, it works.
What's wrong?
Are you perhaps trying to open
Release 1.9.1 of Lucene is now available from:
http://www.apache.org/dyn/closer.cgi/lucene/java/
This fixes a serious bug in 1.9-final. It is strongly recommended that
all 1.9-final users upgrade to 1.9.1. For details see:
http://svn.apache.org/repos/asf/lucene/java/tags/lucene_1_9_1/CHANGES.
Release 1.9-final of Lucene is now available from:
http://www.apache.org/dyn/closer.cgi/lucene/java/
This release has many improvements since release 1.4.3, including new
features, performance improvements, bug fixes, etc. For details, see:
http://svn.apache.org/viewcvs.cgi/*checkout*/lucene/j
Jeff Rodenburg wrote:
Following on the Range Query approach, how is performance? I found the
range approach (albeit with the exact values) to be slower than the
parsed-string approach I posited.
Note that Hoss suggested RangeFilter, not RangeQuery. Or perhaps
ConstantScoreRangeQuery, which i
Eric Jain wrote:
This gives you the number of documents containing the phrase, rather
than the number of occurrences of the phrase itself, but that may in
fact be good enough...
If you use a span query then you can get the actual number of phrase
instances.
Doug
---
revati joshi wrote:
hi all,
I just wanted to know how to increase the speed of indexing files.
I tried a multithreading approach but couldn't get much better
performance.
It was the same as the usual sequential indexing. Is there any other approach
to get better Inde
Release 1.9 RC1 of Lucene is now available from:
http://www.apache.org/dyn/closer.cgi/lucene/java/
This release candidate has many improvements since release 1.4.3,
including new features, performance improvements, bug fixes, etc. For
details, see:
http://svn.apache.org/viewcvs.cgi/*checkout*/
Trieschnigg, R.B. (Dolf) wrote:
I would like to implement the Okapi BM25 weighting function using my own
Similarity implementation. Unfortunately BM25 requires the document length in
the score calculation, which is not provided by the Scorer.
How do you want to measure document length? If th
Paul Smith wrote:
is 1.9 binary backward compatible? (both source code and index format).
That is the intent. Try a nightly build:
http://cvs.apache.org/dist/lucene/java/nightly/
Doug
Sebastian Menge wrote:
Or, to put it more simply, what does a boost of "2" or "10" _mean_ in
contrast to a boost of "0.5" or "0.1" !?
Boosts are simply multiplied into scores. So they only mean something
in the context of the rest of the scoring mechanism.
http://lucene.apache.org/java/docs
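A toy illustration of the multiplicative point (not Lucene's actual scoring code): boosts scale each clause's contribution, so multiplying every boost by the same factor scales every document's score alike and leaves the ranking unchanged. Only ratios between boosts mean anything.

```java
// Boosts are multiplied into scores: a document's score here is the sum of
// per-clause scores, each scaled by that clause's boost. Illustrative only.
public class BoostDemo {
    public static double score(double[] clauseScores, double[] boosts) {
        double sum = 0.0;
        for (int i = 0; i < clauseScores.length; i++) {
            sum += clauseScores[i] * boosts[i];
        }
        return sum;
    }
}
```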
Paul Smith wrote:
We're using Lucene 1.4.3, and after hunting around in the source code
just to see what I might be missing, I came across this, and I'd just
like some comments.
Please try using a 1.9 build to see if this is something that's perhaps
already been fixed.
CompoundFileReader
Daniel Pfeifer wrote:
Are we both talking about Lucene? I am using Lucene 1.4.3 and can't find
a class called MapDirectory or MMapDirectory.
It is post-1.4.
You can download a nightly build of the current trunk at:
http://cvs.apache.org/dist/lucene/java/nightly/
Doug
---
Daniel Pfeifer wrote:
We are sporting Solaris 10 on a Sun Fire-machine with four cores and
12GB of RAM and mirrored Ultra 320-disks. I guess I could try switching
to FSDirectory and hope for the best.
Or, since you're on a 64-bit platform, try MMapDirectory, which supports
greater parallelism
Doug Cutting wrote:
A 64-bit JVM with NioDirectory would really be optimal for this.
Oops. I meant MMapDirectory, not NioDirectory.
Doug
Peter Keegan wrote:
The throughput is worse with NioFSDirectory than with FSDirectory
(patched and unpatched). The bottleneck still seems to be synchronization,
this time in NioFile.getChannel (7 of the 8 threads were blocked there
during one snapshot). I tried this with 4 and 8 channels.
Peter Keegan wrote:
This is just FYI - in my stress tests on an 8-cpu box (that's 8 real CPUs),
the maximum throughput occurred with just 4 query threads. The query
throughput decreased with fewer than 4 or greater than 4 query threads. The
entire index was most likely in the file system cache, t
Daniel Rabus wrote:
I've created a Semantic Desktop application using Lucene. For a
presentation I'd like to create a poster. Unfortunately I haven't found
any high resolution version (or vector graphic) of the Lucene logo. At
http://svn.apache.org/repos/asf/lucene/java/trunk/docs/images/ only
B-Tree's are best for random, incremental updates. They require
log_b(N) disk accesses for inserts, deletes and accesses, where b is the
number of entries per page, and N is the total number of entries in the
tree. But that's too slow for text indexing. Rather Lucene uses a
combination of fi
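The log_b(N) cost is easy to check numerically (a back-of-the-envelope sketch, not a Lucene benchmark): with a few hundred entries per page, even a hundred million entries need only a handful of page accesses per operation, yet that is still far too many seeks when inserting millions of postings one at a time.

```java
// Quick check of the B-tree cost claim: log_b(N) page accesses per insert,
// delete, or lookup, where b is entries per page and N is total entries.
public class BTreeCost {
    public static double pageAccesses(double entriesPerPage, double totalEntries) {
        return Math.log(totalEntries) / Math.log(entriesPerPage);
    }
}
```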
Klaus wrote:
I have tried to study the Lucene scoring in the default similarity. Can
anyone explain to me how this similarity was designed? I have read a lot of IR
literature, but I have never seen an equation like the one used in Lucene.
Why is this better than the normal cosine measure?
It degen
chandler burgess wrote:
I'm using Lucene 1.4.3 on an XP machine with JDK 1.5. Any help is appreciated.
Try typing control-break to get some stack dumps. I also recommend
building the current Lucene code from subversion and trying that. There
have been lots of improvements since 1.4.3.
It woul
J.J. Larrea wrote:
So... I notice that both IndexWriter.addIndexes(...) merge methods start
and end with calls to optimize() on the target index. I'm not sure
whether that is causing the unpacking and repacking I observe, but it
does make me wonder whether they truly need to be there:
I don't recall
Andrzej Bialecki wrote:
It's nice to have these couple percent... however, it doesn't solve the
main problem; I need 50 or more percent increase... :-) and I suspect
this can be achieved only by some radical changes in the way Nutch uses
Lucene. It seems the default query structure is too compl
Paul Elschot wrote:
Querying the host field like this in a web page index can be dangerous
business. For example when term1 is "wikipedia" and term2 is "org",
the query will match at least all pages from wikipedia.org.
Note that if you search for wikipedia.org in Nutch this is interpreted
as a
Andrzej Bialecki wrote:
For a simple TermQuery, if the DF(term) is above 10%, the response time
from IndexSearcher.search() is around 400ms (repeatable, after warm-up).
For such complex phrase queries the response time is around 1 sec or
more (again, after warm-up).
Are you specifying -server
IndexReader locks the index while opening it to prohibit an IndexWriter
from deleting any of the files in that index until all are opened.
Lock files are not stored in the index directory since write access to
an index should not be required to lock it while opening an IndexReader.
Doug
Dani
Jay Booth wrote:
I had a similar problem with threading; the problem turned out to be that in
the back end of (I believe) the FSDirectory class, there was a
synchronized block on the actual RandomAccessFile resource when reading a
block of data from it... high-concurrency situations caused t
Daniel Noll wrote:
Doug Cutting wrote:
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose "one"
for the tests because it was the slowest query to complete of them
all (hence I figured it was already spending some fairly long time in
I/O, and would be
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose "one" for
the tests because it was the slowest query to complete of them all
(hence I figured it was already spending some fairly long time in I/O,
and would be penalised the most.) Every other query was around 7ms
Greg K wrote:
Now, however, I'd like to be able to restrict the search to certain documents
in the index, so I don't have to stream through a couple of thousand spans
to produce the 10 excerpts on a subset of the documents.
I've tried adding a term to the SpanNearQueries that targets a keyword field
Daniel Noll wrote:
Timings were obtained by performing the same search 1,000 times and
averaging the total time. This was then performed five times in a row
to get the range that's displayed below. Memory usage was obtained
using a 20-second sleep after loading the index, and then using the
Win
Marvin Humphrey wrote:
You *can't* set it on the reader end. If you could set it, the reader
would get out of sync and break. The value is set per-segment at write
time, and the reader has to be able to adapt on the fly.
It would actually not be too hard to change things so that there was
Chris Hostetter wrote:
: One thing that I know has bogged me is when matching a phrase where I
: would expect mathematical formula (which is "just a subphrase"). I
: would have liked the phrase-query to extend as far as it wishes but not
: passed a given token... would this be possible ?
: Presum
Erik Hatcher wrote:
On 28 Oct 2005, at 22:31, Andy Lee wrote:
You know what, I was confusing Nutch and Lucene classes (as I've done
before), in this case the IndexSearcher classes.
Sorry. The Nutch names are bad.
I'm continually amazed at Doug's ability to build these using
only emacs - h
Marc Hadfield wrote:
In the SpanNear (or for that matter PhraseQuery), one can set a slop
value where 0 (zero) means one following after the other.
How can one differentiate between Terms at the **same** position vs. one
after the other?
The following queries only match "x" and "y" at the sa
Marc Hadfield wrote:
I'll give span queries a try as they can handle the 0-increment issue.
Note that PhraseQuery can now handle this too.
Doug
Marc Hadfield wrote:
I actually mention your option in my email:
In principle I could store the full text in two fields with the second
field containing the types without incrementing the token index.
Then, do a SpanQuery for "Johnson" and "name" with a distance of 0.
The resulting match w
Marc Hadfield wrote:
I would prefer not to mix the full text and "types" in the same field, as
it would make the term positions inconsistent, which I depend on for
other queries.
Why not store them in the same field using positionIncrement=0 for the
types? Then they won't change positions of n
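A sketch of the position-increment bookkeeping involved (illustrative, not Lucene internals): a token whose increment is 0 lands at the same position as the previous token, so overlaid type tokens leave every later position unchanged and other position-dependent queries still work.

```java
// Position-increment bookkeeping: token i's position is the running sum of
// increments, starting from -1 as Lucene does, so a first increment of 1
// yields position 0. An increment of 0 overlays the previous token.
public class Positions {
    public static int[] positions(int[] increments) {
        int[] pos = new int[increments.length];
        int p = -1;
        for (int i = 0; i < increments.length; i++) {
            p += increments[i];
            pos[i] = p;
        }
        return pos;
    }
}
```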
Peter Kim wrote:
I noticed one way to get around this is to use IndexReader.isDeleted()
to check if it's deleted or not. The problem with that is I only have
access to a MultiSearcher in my HitCollector which doesn't give me
access to the underlying IndexReader. I don't want to have to open an
In
Eric Louvard wrote:
My problem is that IndexWriter.optimize() takes 20 minutes. OK, it is not
a lot of time, but I can't afford to block the system for such a long time
:-(.
If you're worried about blocking, queue changes to the index and have a
separate thread which processes the queue, adding a
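The queue-and-worker pattern suggested here can be sketched with stdlib primitives (names hypothetical; the worker below just records changes where a real one would apply them through an IndexWriter): callers enqueue changes and return immediately, and a single background thread drains the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Callers enqueue index changes without blocking; one worker thread applies
// them in order. stop() enqueues a sentinel and waits for the worker to finish.
public class UpdateQueue {
    private static final String STOP = "__STOP__";
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    public final List<String> applied = new ArrayList<>();
    private final Thread worker = new Thread(() -> {
        try {
            while (true) {
                String change = queue.take();
                if (change.equals(STOP)) return;
                synchronized (applied) { applied.add(change); }  // stand-in for IndexWriter work
            }
        } catch (InterruptedException e) { /* shut down */ }
    });

    public void start() { worker.start(); }
    public void submit(String change) throws InterruptedException { queue.put(change); }
    public void stop() throws InterruptedException { queue.put(STOP); worker.join(); }
}
```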
Palmer, Andrew MMI Woking wrote:
I am looking at changing the value BufferedIndexOutput.BUFFER_SIZE from
1024 to maybe 8192. Has anyone done anything similar, and did they get
any performance improvements?
I doubt this will speed things much.
Generally I am looking to reduce the time it ta
Dawid Weiss wrote:
I have a very technical question. I need to alter document scores (or in
fact document boosts) for an existing index, but for each query. In
other words, I'd like to have pseudo-queries of the form:
1. civil war PREFER:shorter
2. civil war PREFER:longer
for these two
Jeff Rodenburg wrote:
My suggestion to you: pick up a copy of Lucene in Action. [ ...]
The authors lurk on this list.
They're pretty chatty for lurkers.
http://en.wikipedia.org/wiki/Lurker
But good advice nonetheless!
Cheers,
Doug
---
Tony Schwartz wrote:
What about the TermInfosReader class? It appears to read the entire term set
for the
segment into 3 arrays. Am I seeing double on this one?
p.s. I am looking at the current sources.
see TermInfosReader.ensureIndexIsRead();
The index only has 1/128 of the terms, by def
Chris D wrote:
Well in my case field order is important, but the order of the
individual fields isn't. So I can speed up getFields to roughly O(1)
by implementing Document as follows.
Have you actually found getFields to be a performance bottleneck in your
application? I'd be surprised if it
Fredrik wrote:
Opening the index with Luke, I can see the following:
Number of fields: 17
Number of documents: 1165726
Number of terms: 6721726
The size of the index is approx. 5.3 GB.
Lucene version is 1.4.3.
The index contains Norwegian terms, but lots of inline HTML, etc
is probably increasin
Tony Schwartz wrote:
I think you're jumping into the conversation too late. What you have said here
does not
address the problem at hand. That is, in TermInfosReader, all terms in the
segment get
loaded into three very large arrays.
That's not true. Only 1/128th of the terms are loaded by
Ali Rouhi wrote:
I can think of 3 reasons why search methods returning Hits objects
are not exposed in Searchable:
1) Someone forgot to declare Hits Serializable
2) There is a fundamental reason the forms of search which return Hits
objects cannot be called remotely, some non optimal form of se
Tony,
If your improvements are of general utility, please contribute them.
Even if they are not, post them as-is and perhaps someone will take the
time to make them more reusable.
Cheers,
Doug
Tony Schwartz wrote:
I think there are a few things that should be added to lucene to really give
Lokesh Bajaj wrote:
For a very large index where we might want to delete/replace some documents,
this would require a lot of memory (for 100 million documents, this would need
381 MB of memory). Is there any reason why this was implemented this way?
In practice this has not been an issue. A
The method Similarity.queryNorm() normalizes query term weights. To
disable this you could define it to return 1.0 in your own Similarity
implementation.
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html#queryNorm(float)
Doug
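A sketch of the two behaviors side by side (the 1/sqrt formula matches DefaultSimilarity's documented queryNorm; treat it as an assumption if your version differs): the default divides out the query's overall weight magnitude, while returning 1.0 leaves raw term weights unnormalized.

```java
// Default query norm versus a disabled one. In a real application the
// disabled variant would live in a Similarity subclass overriding queryNorm.
public class QueryNormDemo {
    public static double defaultQueryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }
    public static double disabledQueryNorm(double sumOfSquaredWeights) {
        return 1.0;   // leaves term weights untouched
    }
}
```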
Robichaud, Jean-Philippe wrote:
Ok,
Fred Toth wrote:
I'm thinking we need something like "HTMLTokenizer" which bridges the
gap between StandardAnalyzer and an external HTML parser. Since so
many of us are dealing with HTML, I would think this would be generally
useful for many problems. It could work this way:
Given this input:
H
Sebastian Marius Kirsch wrote:
I took up your suggestion to use a ParallelReader for adding more
fields to existing documents. I now have two indexes with the same
number of documents, but different fields.
Does search work using the ParallelReader?
One field is duplicated
(the id field.)
Wh
Tansley, Robert wrote:
What if we're trying to index multiple languages in the same site? Is
it best to have:
1/ one index for all languages
2/ one index for all languages, with an extra language field so searches
can be constrained to a particular language
3/ separate indices for each language
Matt Quail wrote:
I have a similar problem, for which ParallelReader looks like a good
solution -- except for the problem of creating a set of indices with
matching document numbers.
I have wondered about this as well. Are there any *sure fire* ways of
creating (and updating) two indices so
Chris Lamprecht wrote:
I've done exactly what you describe, using N threads where N is the
number of processors on the machine, plus one more thread that writes
to the file system index (since that is I/O-bound anyway). Since most
of the CPU time is tokenizing/stemming/etc, the method works well.
Steven J. Owens wrote:
A friend just asked me for advice about synchronizing lucene
indexes across a very large number of servers. I haven't really
delved that deeply into this sort of stuff, but I've seen a variety of
comments here about similar topics. Are there are any well-known
approach
Scott Smith wrote:
Any other solutions or comments?
Use a different IndexReader for searching than you use for deletions?
Doug
Robichaud, Jean-Philippe wrote:
How cool, I did not know that... that may help me... If I understand you
correctly, I can create a boolean query where each "clause" uses a different
similarity?
Yes. That would look something like:
BooleanQuery booleanQuery = new BooleanQuery();
TermQuery clause1
Robichaud, Jean-Philippe wrote:
Again, I can change
the similarity of the reader at run-time and issue specific queries, summing
the score myself, but that is pretty inefficient.
You can also specify a Similarity implementation per Query node in a
complex query, e.g.:
BooleanQuery query = new Boo
Chuck Williams wrote:
I found this to be a problem as well and created
alternative classes, DistributedMultiFieldQueryParser and
MaxDisjunctionQuery, which are available here:
http://issues.apache.org/bugzilla/show_bug.cgi?id=32674
You might check these out and see if they provide the ranking y
Morus Walter wrote:
Alternatively, it should be possible to write a query that does such scoring
directly (without the document start anchor) by the same means a proximity
query uses. A proximity query uses positional information, so it should be
possible to use that information for scoring based on docum
Yonik Seeley wrote:
I don't think at this point anything structural has been proposed as
different between 1.9 and 2.0.
Are any of Paul Elschot's query and scorer changes being considered for 2.0?
1.9 and 2.0 will be what's in the SVN trunk. Many of Paul's changes
have already been committed. Ar
George Aroush wrote:
I would like to see a source release of 1.9, a packaged source release as
ZIP/TAR. Is that possible?
There is no 1.9 release. It is a *planned* release at this point. When
a release is actually made, then you will be able to download it.
Doug
--
Peter Veentjer - Anchor Men wrote:
I have question about field boosting.
If I have 2 (or more) fields with the same field name in a single
document, and I boost one of those, then only that one will be boosted?
Or will all fields with the same name be boosted? I guess only one field
is boosted, bu
Roy Klein wrote:
I think this is a better way of asking my original questions:
"Why was this designed this way?"
In order to optimize updates.
"Can it be changed to optimize updates?"
Updates are fastest when additions and deletions are separately batched.
That is the design.
Doug
-
Yonik Seeley wrote:
There are times, however, when it would be nice for
deletes to be able to be concurrent with adds.
It would also be nice if good coffee was free.
Q: can docids change after an add() (with merging segments going on
behind the scenes) or is optimize() the only call that ends up
ch
Paul Libbrecht wrote:
I am currently evaluating the need for an elaborate query data-structure
(to be exchanged over XML-RPC) as opposed to working with plain strings.
I'd opt for both. For example:
"java based" -coffee site:apache.org
d