Hi,
I guess (though I'm not quite sure) you are looking for a way to
incrementally index (and update an existing index); there is a lot of
information available on that.
What I would suggest is deleting the stale documents from the current
index using deleteDocuments
(http://lucene.apache.org/java/2_3_1/api/o
Have you looked at Nutch or Hadoop? They are subprojects of Lucene,
developed specifically to support large-scale, distributed indexing.
Nutch is probably more mature whereas Hadoop supports clustering out of
the box...
ND
-Original Message-
From: Rajesh parab [mailto:[EMAIL PROTECTED]
Hi,
We are currently using Lucene 2.0 for full-text
searches within our enterprise application, which can
be deployed in clustered environment. We generate
Lucene index for data stored inside relational
database.
As Lucene 2.0 did not have solid NFS support and as we
wanted Lucene based searches
Thanks !!
-Original Message-
From: Donna L Gresh [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 02, 2008 11:52 AM
To: java-user@lucene.apache.org
Subject: Re: Adding attribute to index
This is "fast and loose" code (from my head; check the syntax). I
*highly* recommend you get a copy
Dominique Béjean wrote:
http://today.java.net/pub/a/today/2005/08/09/didyoumean.html
-Original Message-
From: Marjan Celikik [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 3, 2008 15:12
To: java-user@lucene.apache.org
Subject: Error tolerant text search with Lucene?
Hi everyone,
I know that there are packages
Is there any reliable implementation for parsing email mailbox files (mbox
format), especially large (>50MB) archives? Even after searching the Lucene
mailing list archives and googling around, I couldn't find one. I took a look
at the Apache James project, which seems to offer some support, but couldn't
fin
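For the mbox question above, a minimal sketch of a streaming splitter, assuming the common mbox convention in which every message starts with a separator line beginning with "From " and body lines that would look like a separator are escaped as ">From ". The class name MboxSplitter and its method are hypothetical, not part of Lucene or Apache James. Only one message is held in memory at a time, so archive size should matter little.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not a Lucene or James API.
final class MboxSplitter {
    private MboxSplitter() {}

    /**
     * Reads an mbox stream line by line and hands each raw message
     * (headers and body, without the "From " separator line) to the handler.
     * Escaped body lines (">From ...") are passed through unchanged.
     * Returns the number of messages found.
     */
    static int split(Reader in, Consumer<String> handler) {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder current = null; // stays null until the first separator
        int count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("From ")) {
                    // A separator closes the previous message, if any.
                    if (current != null) {
                        handler.accept(current.toString());
                        count++;
                    }
                    current = new StringBuilder();
                } else if (current != null) {
                    current.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Flush the last message in the file.
        if (current != null) {
            handler.accept(current.toString());
            count++;
        }
        return count;
    }
}
```

Each string passed to the handler could then be turned into one Lucene Document. Opening the file with a single-byte charset such as ISO-8859-1 is one way to avoid decoding errors on mixed-encoding archives, though that choice is an assumption rather than a requirement of the format.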
Could you explain your use case? Because to say that you want to
score documents that don't have all the terms with a *phrase query*
is contradictory. The point of a phrase query is exactly that all
the terms are there and within some proximity.
Best
Erick
On Thu, Apr 3, 2008 at 12:17 P
Apparently tp.nextPosition() is needed :(
Any ideas?
-John
On Thu, Apr 3, 2008 at 8:20 AM, John Wang <[EMAIL PROTECTED]> wrote:
> I am loading both from disk.
> But I found the culprit:
>
> My code:
>
> while (tp.next())
>
> {
>
> //assert tp.doc() < maxDoc;
>
> tp.
Hi!
I'm using Lucene proximity searches, but I've seen that Lucene only scores
documents which contain all the terms in the phrase. I also need to score
documents even if they don't contain all those terms. Is it possible with
Lucene PhraseQueries or SpanNearQuery? If not, could you tell me a way to
I am loading both from disk.
But I found the culprit:
My code:
while (tp.next())
{
//assert tp.doc() < maxDoc;
tp.nextPosition(); <-- this call is the problem
tp.getPayload(payloadBuffer, 0);
byter.load(_array, tp.doc(), payloadBuffe
Hi everyone,
I know that there are packages that support the "Did you mean ... ?"
search feature with Lucene, which try to find the best-suited
corrected query. However, so far I haven't encountered the opposite
search feature: given a correct query, find all documents which contain
misspel
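One way to picture the "opposite" feature described above is to expand a correct query term into every indexed term within a small edit distance. The sketch below uses plain Levenshtein distance over an in-memory term list; the class and method names are hypothetical and no Lucene API is involved. Lucene's FuzzyQuery is built on the same distance measure, so it may already cover part of this use case.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration; names are not from Lucene.
final class EditDistance {
    private EditDistance() {}

    /** Classic Levenshtein distance (insert, delete, substitute), two-row DP. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the terms within maxEdits of the query, in input order. */
    static List<String> variants(String query, Collection<String> terms, int maxEdits) {
        List<String> result = new ArrayList<>();
        for (String term : terms) {
            if (levenshtein(query, term) <= maxEdits) {
                result.add(term);
            }
        }
        return result;
    }
}
```

Note that a transposition such as "serach" for "search" counts as two edits under this measure, so the threshold has to be chosen with that in mind. Scanning every term is linear in the vocabulary size, which is the main cost to watch on a large index.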
As your index grows, the payload method will get slower,
because payloads are read from hard disk. The FieldCache is held in
memory, which is much faster.
Unless you are using solid-state disks, you are better off with the
FieldCache for faster searches.
--
Chris Lu
--
You could try something like this, which I use when I put my own documents
together:
public Document getDocument(){
Document doc = new Document();
doc.add(new Field("db_key", this.getDb_key(), Field.Store.YES,
Field.Index.UN_TOKENIZED));
doc.add(new Field("ac
On Thursday 03 April 2008 16:24:15, Илья Казначеев wrote:
> - Is there a way to set weights for different fields? Let's say content
> has a weight of 1, title has a weight of 5 and picture captions have a
> weight of 0.5. If not, can I do that by hand?
Already found field.setBoo
Sorry, gmail was screwy and accidentally sent the msg.
Anyway,
I have a large index, about 30M docs.
I have a date field (day granularity) with about 1000 distinct values; every
doc has the date field filled in.
So out of curiosity I indexed the date field two ways:
1) using "date" as a field, and set the d
Hi:
I also faced the same problem in the past.
But in my case the index size was not an issue, so I maintained two folders,
"newindex" and "oldindex", and swapped them at every update.
-Bhavin pandya
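The two-folder approach described above can be sketched roughly as follows, assuming that searches always read from the "live" folder while a rebuild writes into the other one, and that the roles flip once the rebuild finishes. The class IndexFolders is hypothetical and only tracks which folder is live; building the index and reopening searchers are left out.

```java
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical bookkeeping class; it does not touch Lucene or the disk.
final class IndexFolders {
    private final Path first;
    private final Path second;
    private final AtomicReference<Path> live;

    IndexFolders(Path first, Path second) {
        this.first = first;
        this.second = second;
        this.live = new AtomicReference<>(first); // first folder starts as live
    }

    /** Folder that searches should currently read from. */
    Path live() {
        return live.get();
    }

    /** Folder that the next rebuild should write into. */
    Path standby() {
        return live.get().equals(first) ? second : first;
    }

    /** Flips the roles after a rebuild and returns the new live folder. */
    Path swap() {
        return live.updateAndGet(p -> p.equals(first) ? second : first);
    }
}
```

A typical cycle would rebuild into standby(), call swap(), and then open a fresh IndexSearcher on live(). The cost is roughly double the disk space, which matches the remark that index size was not a concern in this setup.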
- Original Message -
From: "021336" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, April 01, 2008 9:44 PM
Su
One interpretation of the query with ~5 is that your text has 5 words
and ~5 would imply a word in any position can match. Could it be this?
- Original Message -
From: "Ivan Vasilev" <[EMAIL PROTECTED]>
To: "LUCENE MAIL LIST"
Sent: Thursday, April 03, 2008 6:03 AM
Subject: PhraseQuery
Hello.
We're designing a CMS in Java, and I'm trying to implement a site search
function using Lucene.
The basic concept is:
- Site features numerous objects that we'd like to throw into index: pages,
various text blocks on those pages, descriptions and keyword lists of those
pages, sta
ds, as well as, in
between words – then THE ORDER of the searched words does not matter.
Best Regards,
Ivan
Hi Guys,
I ran the following test: I created 2 files. File1.txt with content:
“apple 2 3 4 pear”
And File2.txt with content:
“pear 2 3 4 apple”
I ran the following search tests:
1. Using Luke Search tab.
1.1. When searching for:
content:"pear apple"~3
Then the File1.txt is returned.
1.2.