Hi all:
I want to first erase the original index and then create a new index for
appending. I use the following Python code with the PyLucene port.
def store(doc):
    directory = PyLucene.FSDirectory.getDirectory("index", True)
    # create=True erases the old index and starts a fresh one
    writer = PyLucene.IndexWriter(directory, PyLucene.StandardAnalyzer(), True)
    writer.addDocument(doc)
    writer.close()
David <[EMAIL PROTECTED]> wrote on 14/01/2007 20:08:05:
> Thanks. How does Lucene give each document an ID when the document is
> added? Is the document ID unchanged until the document is deleted?
>
Not exactly.
When the first doc is added, it is assigned id 0.
The next one is assigned id 1, etc.
When a document is deleted its id is not reused right away, but once the
deletion is merged away (for instance by optimize()) the ids of all later
documents shift down, so an id is only stable until deleted docs are
squeezed out.
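A minimal Java sketch of that numbering (the field name "name" and the
RAMDirectory setup are made up for illustration, not from the thread):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

public class DocIdDemo {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        for (int i = 0; i < 3; i++) {
            Document doc = new Document();
            doc.add(new Field("name", "doc" + i,
                    Field.Store.YES, Field.Index.UN_TOKENIZED));
            writer.addDocument(doc);  // receives internal id 0, then 1, then 2
        }
        writer.close();

        IndexReader reader = IndexReader.open(dir);
        for (int id = 0; id < reader.maxDoc(); id++) {
            System.out.println(id + " -> " + reader.document(id).get("name"));
        }
        reader.close();
    }
}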
: I'm wondering what will happen if I perform indexing and have 10
: people searching at the same time? Can I retrieve results while I
: index, and the other way around?
From the FAQ
http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-6c56b0449d114826586940dcc6fe5158267
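The short version, as I understand it (a hedged summary, not a quote from
the FAQ): one IndexWriter can add documents while any number of readers
search concurrently, but each searcher sees the index as of when its
reader was opened, so you reopen the reader to pick up new documents. A
rough Java sketch (the field names and setup are my own):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;

public class SearchWhileIndexing {
    static Document doc(String body) {
        Document d = new Document();
        d.add(new Field("body", body, Field.Store.YES, Field.Index.TOKENIZED));
        return d;
    }

    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        writer.addDocument(doc("searching lucene"));
        writer.close();

        // A searcher sees the index as of when its reader was opened.
        IndexSearcher searcher = new IndexSearcher(IndexReader.open(dir));

        // Meanwhile, more documents are added (create=false appends).
        writer = new IndexWriter(dir, new StandardAnalyzer(), false);
        writer.addDocument(doc("indexing lucene"));
        writer.close();

        TermQuery q = new TermQuery(new Term("body", "lucene"));
        System.out.println(searcher.search(q).length());  // still 1: old snapshot
        searcher.close();

        searcher = new IndexSearcher(IndexReader.open(dir));  // reopen
        System.out.println(searcher.search(q).length());  // now 2
        searcher.close();
    }
}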
Thanks. How does Lucene give each document an ID when the document is added?
Is the document ID unchanged until the document is deleted?
2007/1/12, Otis Gospodnetic <[EMAIL PROTECTED]>:
David, please look at the Javadoc for IndexReader. I believe the API is
reader.document(int), where reader is an open IndexReader.
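For reference, a minimal sketch of that call; the index path is an
assumption, and note that only fields stored with Field.Store.YES come
back from reader.document(int):

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.FSDirectory;

public class GetDocById {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(FSDirectory.getDirectory("index", false));
        Document doc = reader.document(0);  // stored fields of internal doc id 0
        System.out.println(doc);
        reader.close();
    }
}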
Hi,
I'm wondering what will happen if I perform indexing and have 10
people searching at the same time? Can I retrieve results while I
index, and the other way around?
Thanks.
regards,
Wooi Meng
On Saturday 13 January 2007 at 16:48, Melange wrote:
> Nicolas Lalevée-2 wrote:
> > On Saturday 13 January 2007 at 10:49, Melange wrote:
> >> Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how
> >> to best map the forum document model (topics and their messages) to the
> >> Lucene document model.
>
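One possible mapping, sketched in Java: index one Lucene Document per
forum message and carry the topic along as fields, so hits can be grouped
back into topics. All field names here are assumptions for illustration,
not an established schema:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class MessageToDocument {
    // hypothetical mapping: one Document per phpBB message
    public static Document toDocument(String topicId, String topicTitle,
                                      String author, String body) {
        Document doc = new Document();
        doc.add(new Field("topic_id", topicId,
                Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("topic_title", topicTitle,
                Field.Store.YES, Field.Index.TOKENIZED));
        doc.add(new Field("author", author,
                Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("body", body,
                Field.Store.YES, Field.Index.TOKENIZED));
        return doc;
    }
}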
On 14 Jan 2007, at 17:46, Erick Erickson wrote:
Map size: 10,000,000 pairs.
Looking up 1,000,000 user ids and setting them in a bitset.
Total time to set all the bits: 1.016 seconds. Running inside of Eclipse
on a 2700 MHz AMD with 1 GB memory (and I used up almost all this memory,
but made no special effort to tune anything).
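A rough reconstruction of that experiment; the user-id-to-doc-id layout
and the lookup pattern are guesses, not Erick's actual code:

import java.util.BitSet;
import java.util.HashMap;

public class MapLookupBench {
    public static void main(String[] args) {
        int pairs = 10000000;  // 10,000,000 pairs, as in the post
        HashMap<Integer, Integer> userToDoc = new HashMap<Integer, Integer>(pairs * 2);
        for (int i = 0; i < pairs; i++) {
            userToDoc.put(i, i);  // pretend user id i maps to doc id i
        }
        BitSet bits = new BitSet(pairs);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {  // 1,000,000 lookups
            Integer docId = userToDoc.get(i * 10);
            if (docId != null) {
                bits.set(docId.intValue());
            }
        }
        System.out.println("set " + bits.cardinality() + " bits in "
                + (System.currentTimeMillis() - start) + " ms");
    }
}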
I just love it when I get so wrapped up in a particular approach that
alternatives don't occur to me. So I wondered what would happen if I just
got stupid simple and tried solving what I think is your problem without
involving Lucene. So, I wrote a little program to fill up a HashMap with
pairs.
On 14 Jan 2007, at 3:54, Erick Erickson wrote:
3> I doubt it really will make a performance difference, but you could use
TermDocs.seek rather than get a new TermDocs for each term from the reader.
(and if this *does* make a difference, please let me know)
It seems it does. I have just tried it.
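For anyone trying the same thing, a hedged sketch of the reuse idea: one
TermDocs is created once and then seek()ed to each term, instead of asking
the reader for a new TermDocs per term. The field name "userid" and the
id list are assumptions for illustration:

import java.io.IOException;
import java.util.BitSet;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.store.FSDirectory;

public class TermDocsReuse {
    public static void main(String[] args) throws IOException {
        IndexReader reader = IndexReader.open(FSDirectory.getDirectory("index", false));
        String[] userIds = { "u1", "u2", "u3" };  // hypothetical terms
        BitSet bits = new BitSet(reader.maxDoc());
        TermDocs termDocs = reader.termDocs();  // created once
        for (int i = 0; i < userIds.length; i++) {
            termDocs.seek(new Term("userid", userIds[i]));  // repositioned per term
            while (termDocs.next()) {
                bits.set(termDocs.doc());
            }
        }
        termDocs.close();
        reader.close();
    }
}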
On 14 Jan 2007, at 8:51, Doron Cohen wrote:
I think that one effective way to control docid changes, assuming a
delete/update rate significantly lower than the add rate, is to modify
Lucene such that deleted docs are only 'squeezed out' when calling
optimize(). This would involve delicate changes to Lucene's segment
merging.
On 14 Jan 2007, at 10:58, karl wettin wrote:
In the original post you mention 2-10 million documents. How much
is that in bytes?
On my development machine I have 1.5 million documents and those are
weighing in at ~950MB. I suspect that for production we will add more
fields, so it would grow.
On 14 Jan 2007, at 7:10, Chris Hostetter wrote:
if you're talking about multiple identical servers used for load
balancing, then there is no reason why those indexes wouldn't be kept in
sync (the merge model is deterministic, so if you apply the same
operations to every server in the same order, the resulting indexes and
docids stay identical).
On 14 Jan 2007, at 02:14, Kay Roepke wrote:
If I were you, I would make a filter that navigates an in-heap object
graph of all users and their connections using breadth-first search (or
perhaps even A*).
I would essentially have the same problem with an in-memory graph: I
cannot be sure of the Lucene docids staying stable.
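For concreteness, a hedged sketch of the suggested filter against the
Lucene 2.x Filter API. The graph representation and the "userid" field
are assumptions, not anything established in the thread:

import java.io.IOException;
import java.util.BitSet;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.search.Filter;

public class ReachableUsersFilter extends Filter {
    private final Map<String, List<String>> graph;  // userid -> neighbour userids
    private final String start;

    public ReachableUsersFilter(Map<String, List<String>> graph, String start) {
        this.graph = graph;
        this.start = start;
    }

    public BitSet bits(IndexReader reader) throws IOException {
        // breadth-first search over the in-heap user graph
        HashSet<String> reachable = new HashSet<String>();
        LinkedList<String> queue = new LinkedList<String>();
        reachable.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            List<String> neighbours = graph.get(queue.removeFirst());
            if (neighbours == null) continue;
            for (String n : neighbours) {
                if (reachable.add(n)) {
                    queue.add(n);
                }
            }
        }
        // turn the reachable userids into document bits
        BitSet bits = new BitSet(reader.maxDoc());
        TermDocs termDocs = reader.termDocs();
        for (String user : reachable) {
            termDocs.seek(new Term("userid", user));
            while (termDocs.next()) {
                bits.set(termDocs.doc());
            }
        }
        termDocs.close();
        return bits;
    }
}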