One way to make an alphabetic sort very fast is to presort your docs before
adding them to the index. If you do this, you can then just sort by index
order. We are using this for a large index (1 million+ docs) and it works
very well; it even seems slightly faster than relevance sorting.
We only display 10 hits at a time, so we don't need to iterate through all
the hits.
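
As a rough illustration of the presort idea, here is a minimal sketch in plain
Java. It does not touch the Lucene API at all: the titles stand in for
whatever field the docs are sorted on, and the class and method names
(PresortSketch, presort, page) are made up for this example. It assumes a
case-insensitive ordering is what "alphabetic" means here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: names are hypothetical, and no Lucene classes are involved.
final class PresortSketch {

    // Returns a sorted copy (case-insensitive). Adding docs to the index in
    // this order is what would make index order match alphabetical order.
    static List<String> presort(List<String> titles) {
        List<String> sorted = new ArrayList<>(titles);
        sorted.sort(String.CASE_INSENSITIVE_ORDER);
        return sorted;
    }

    // Returns one page of hits in index order without walking the other hits.
    static List<String> page(List<String> hitsInIndexOrder, int pageNumber, int pageSize) {
        int from = Math.min(pageNumber * pageSize, hitsInIndexOrder.size());
        int to = Math.min(from + pageSize, hitsInIndexOrder.size());
        return hitsInIndexOrder.subList(from, to);
    }
}
```

With a page size of 10, page(hits, 0, 10) would give the first screen of
results and only those entries are ever read.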
It feels like there should be a way to pull a document out of one index and
stick it into another, bringing all the unstored fields along with it.
On Friday 07 July 2006 12:52, Erick Erickson wrote:
> Did you
> When you say you keep your documents ordered alphabetically, it's confusing
> to me. Are you saying that you pre-sort all your documents then insert them
> one after another so that automatically-generated internal Lucene ID maps
> exactly to the alphabetical ordering? That is, for any document I
All,
I sent this the other day, but didn't get any responses. I'm hoping that it
was just missed, so I'm trying again.
There has to be a better way to insert a document into an index than
reindexing everything.
--Jason
On Wednesday 05 July 2006 5:06 pm, Jason Calabre
All,
For performance reasons we keep our index of over a million documents ordered
alphabetically. This way, for an alpha sort, we can just use the index order.
This works very well, but I'm now looking for a way to insert a single
document into the index at the correct position.
Is there any s
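
To make "the correct position" concrete, here is a minimal sketch in plain
Java that only locates where a new key belongs in an already sorted list. The
names (InsertPositionSketch, insertPosition) are hypothetical, and it assumes
case-insensitive ordering. It says nothing about whether Lucene can actually
place a document at that position, which is the open question in this thread.

```java
import java.util.Collections;
import java.util.List;

// Sketch only: finds the slot, does not modify any index.
final class InsertPositionSketch {

    // Returns the position at which newKey would have to be inserted to keep
    // sortedKeys sorted. If an equal key exists, its position is returned.
    static int insertPosition(List<String> sortedKeys, String newKey) {
        int found = Collections.binarySearch(sortedKeys, newKey, String.CASE_INSENSITIVE_ORDER);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return found >= 0 ? found : -found - 1;
    }
}
```

Every document after the returned position would have to shift by one to keep
the ordering, which is why a plain append does not preserve it.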
I think the best way to tokenize/stem is to use the analyzer directly. For
example:
TokenStream ts = analyzer.tokenStream(field, new StringReader(text));
Token token = null;
while ((token = ts.next()) != null) {
Term newTerm = new Term(field, token.termTe
I just wrote some simple code to test this.
For my test I used 3 queries:
- A 3 term boolean
- A single term query with over 5000 hits
- A single term query with 0 hits
For each query I ran 4 tests of 10,000 searches:
1) using hits.length to get the counts and the standard si
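
For anyone wanting to repeat this kind of comparison, a timing loop along
these lines may be enough. This is a sketch in plain Java, not the code used
for the test above: the names (SearchTimingSketch, timeNanos) are made up, and
the Runnable is a placeholder for whatever search-and-count variant is being
measured.

```java
// Sketch only: the Runnable stands in for one search call.
final class SearchTimingSketch {

    // Runs the given search the requested number of times and returns the
    // total elapsed wall-clock time in nanoseconds.
    static long timeNanos(int iterations, Runnable search) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            search.run();
        }
        return System.nanoTime() - start;
    }
}
```

Running each variant once before timing it would help keep JIT warm-up out of
the numbers.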
Maybe I'm missing something simple, but I don't see how this will work.
It looks like this filter will just filter out documents that don't have a
guid field, but in my case every document has a guid.
In a single index there are no duplicates. Duplicates are only a problem when
I search multip
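
To show the kind of cross-index duplicate removal being asked about, here is
a minimal sketch in plain Java that works on hits after they have been merged
from several indexes. It is not a Lucene filter: the Hit class, its fields,
and the dedupe method are all hypothetical, and it assumes that keeping the
first hit seen for each guid is acceptable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: Hit is a made-up holder, not a Lucene class.
final class GuidDedupSketch {

    static final class Hit {
        final String guid;        // shared identifier across indexes
        final String sourceIndex; // which index the hit came from

        Hit(String guid, String sourceIndex) {
            this.guid = guid;
            this.sourceIndex = sourceIndex;
        }
    }

    // Keeps the first hit for each guid and preserves the incoming order.
    static List<Hit> dedupe(List<Hit> mergedHits) {
        Map<String, Hit> firstByGuid = new LinkedHashMap<>();
        for (Hit hit : mergedHits) {
            firstByGuid.putIfAbsent(hit.guid, hit);
        }
        return new ArrayList<>(firstByGuid.values());
    }
}
```

One drawback of removing duplicates after the search is that the total hit
count is only known once all merged hits have been seen.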
All,
In the project I'm working on we have a separate index for each database.
There are 12 databases now, but in the future there may be as many as 20.
They all have their own release cycle so I don't want to merge the indexes.
The databases all have some overlap between them. We manage thi