On 29 Jun 2007, at 05:08, Daniel Noll wrote:
I just wanted to put the question out in case someone has solved the
exact same problem already.
I've posted some experiments in LUCENE-879. The first patch replaces
deleted documents with a new dummy document. The second patch contains
some merge…
On 28 Jun 2007, at 15:37, Emmanuel Bernard wrote:
I don't really like the idea actually: I'm much more comfortable with
having my data in a relational DB :)
If you don't mind, please expand on that a bit.
I think Lucene is suited pretty well for object storage if you also
need it as an index.
Hi all.
Is there currently any way to delete documents from the middle of a text index
without a risk of the document IDs changing later? I'm aware that they
probably won't change unless we optimise or unless the user adds more data,
but unfortunately adding more data is now a potential occurrence…
* Erick Erickson <[EMAIL PROTECTED]>:
> I guess I don't understand the problem. Can you build the documents
> from within a loop or not? If you can, it's simple...
>
> open indexwriter
> while (build a document)
>     write to index
>
> close/optimize.
>
> Or are you saying that you can't build from within a loop?
You might try my query parser, Qsol: http://myhardshadow.com/qsol.php
There is a find/replace feature that will do what you want. FindReplace
takes the find string, the replace string, a boolean for case sensitivity,
and a boolean to indicate that the replacement will act as an operator
(allows for correct de…
Karl, you might want to have a look at Zoe (the email app from several years
ago that uses Lucene as its storage).
Also, there is DbDirectory for Lucene, which should have XA support. Andi will
know.
Otis
Simpy -- http://www.simpy.com
Chris is spot-on. Your data set is so small that I wouldn't worry about
speed unless and until you have proof that it's a problem. The complexity
you'll introduce by having multiple indexes just won't be worth it.
In your case, following Chris's advice and de-normalizing the data would
be the first…
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to and your question is "hidden" in that thread and gets less
attention.
On Jun 28, 2007, at 1:29 PM, pratik shinghal wrote:
I'm using Lucene (org.apache.lucene) and I want the Java code for
parsing a single-character string.
My code is:
QueryParser qp = new QueryParser("", analyser);
String str = " track 9";
Query que = qp.parse(str);
System.out.println(que);
What do you get if you do a System.out.println(que.toString())?
And what analyzer are you using?
Erick
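For comparison, a plain whitespace split of the string from the question already yields the two tokens the poster wants, which is roughly what a whitespace-based analyzer would produce. This is a stdlib-only illustration of that tokenization, not Lucene itself:

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizeDemo {
    // Split on runs of whitespace, dropping empty tokens -- roughly what a
    // whitespace-style analyzer does to the input before query parsing.
    static List<String> whitespaceTokens(String s) {
        List<String> out = new ArrayList<>();
        for (String t : s.trim().split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(whitespaceTokens(" track 9")); // [track, 9]
    }
}
```

An analyzer that strips short tokens or digits would instead drop the "9", which is why knowing the analyzer matters here.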
On 6/28/07, pratik shinghal <[EMAIL PROTECTED]> wrote:
I'm using Lucene (org.apache.lucene) and I want the Java code for parsing
a single-character string.
My code is:
QueryParser qp = new…
I guess I don't understand the problem. Can you build the documents
from within a loop or not? If you can, it's simple...
open IndexWriter
while (build a document)
    write to index
close/optimize.
Or are you saying that you can't build from within a loop?
Best
Erick
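Erick's pseudocode might look like the following in Lucene-2.x-era Java. This is a sketch only: `Repository` and `Record` are invented stand-ins for whatever produces the documents, and the field names are illustrative.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class IndexLoop {
    // Invented stand-ins for "build a document" -- not a real Lucene API.
    interface Repository { Record next(); }
    static class Record { String id; String text; }

    public static void index(Repository repo, String indexDir) throws Exception {
        // open IndexWriter -- once, outside the loop
        IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
        try {
            Record rec;
            while ((rec = repo.next()) != null) {   // while (build a document)
                Document doc = new Document();
                doc.add(new Field("id", rec.id, Field.Store.YES, Field.Index.UN_TOKENIZED));
                doc.add(new Field("body", rec.text, Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);            // write to index
            }
            writer.optimize();                      // close/optimize
        } finally {
            writer.close();
        }
    }
}
```

The key point is that the writer is opened once and closed once, rather than per document.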
On 6/28/07, Kai Weber <[EMAIL PROTECTED]> wrote:
Yes, opening/closing will be very costly. But I *believe*, although I
haven't tried it, that IndexModifier (2.1) will work for you.
But do NOT take my word for it as I haven't tried to do what you're doing.
But it should be easy to write a short test or two to prove that you can
find recently-inserted…
: Are you opening the IndexSearcher every time you query? This is a
: costly operation.
just repeating the above line because it's important. also...
: > The code I use is
: > File indexFile = new File(fileName);
: > FSDirectory dir = FSDirectory.getDirectory(…
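The fix for the cost being repeated is to hold one IndexSearcher open for the life of the application instead of constructing it per query. A minimal sketch against the Lucene 2.x API (the field name "body" is illustrative):

```java
// Sketch: assumes a Lucene 2.x classpath. The point is only that the
// IndexSearcher is created once and then shared across all queries.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;

public class SearchService {
    private final IndexSearcher searcher;   // opened once, reused

    public SearchService(String indexDir) throws Exception {
        searcher = new IndexSearcher(indexDir);   // the costly step: do it once
    }

    public int count(String queryText) throws Exception {
        QueryParser qp = new QueryParser("body", new StandardAnalyzer());
        Hits hits = searcher.search(qp.parse(queryText));
        return hits.length();
    }
}
```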
What you should do is denormalize the 1:m relationships.
Don't try to mimic the database. If you need to, you can keep the original
two indexes and create a third one.
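Denormalizing here means copying the parent side's fields onto each child document, so a single index answers both kinds of query. A sketch against the Lucene 2.x API, with invented field names:

```java
// Sketch (Lucene 2.x API assumed; field names are invented): instead of a
// parent index plus a child index joined at query time, each child document
// carries a copy of its parent's fields.
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class DenormExample {
    static Document childDoc(String parentId, String parentTitle,
                             String detailText) {
        Document doc = new Document();
        // Parent fields, repeated on every child document:
        doc.add(new Field("parentId", parentId, Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("parentTitle", parentTitle, Field.Store.YES, Field.Index.TOKENIZED));
        // Child-specific detail:
        doc.add(new Field("detail", detailText, Field.Store.NO, Field.Index.TOKENIZED));
        return doc;
    }
}
```

The cost is index size (the parent fields are stored m times); the gain is that no cross-index join is needed at search time.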
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo:
Basically you need to separate your web app from your searching for a
scalable solution. Searching is a different concern, and you can develop more
kinds of search when new requirements come in.
Technorati's way is very similar to one of DBSight's configurations. One
machine is dedicated to indexing…
I'm using Lucene (org.apache.lucene) and I want the Java code for parsing
a single-character string.
My code is:
QueryParser qp = new QueryParser("", analyser);
String str = " track 9";
Query que = qp.parse(str);
System.out.println(que);
I want the answer as: track, 9
but I'm getting…
Hadoop is not designed for this type of scenario.
Have a look at Solr (http://lucene.apache.org/solr); this is pretty
much one of its main use cases. I think it will do what you need to
do and will more than likely work with minimal configuration on
your existing index (but don't hold…
Are you opening the IndexSearcher every time you query? This is a
costly operation.
-Grant
On Jun 28, 2007, at 12:03 PM, Nott wrote:
I have an index in one file that has a size of about 18 GB.
When I run some queries in Luke the response comes in < 40 ms, but the same
queries when I use IndexSearcher…
Hello,
In my application I have to add documents to the index as follows:
1. build the document to add from a repository
2. obtain an IndexWriter
3. add the document to the index
4. write and optimize the index, close the writer
5. go to 1 until no documents are left
I must work with legacy code which does the doc…
I have an index in one file that has a size of about 18 GB.
When I run some queries in Luke the response comes in < 40 ms, but the same
queries when I use IndexSearcher take 300-600 ms.
Any suggestions?
The code I use is
File indexFile = new File(fileName);
On Jun 28, 2007, at 9:06 AM, Samuel LEMOINE wrote:
Grant Ingersoll wrote:
On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote:
Thanks for the resources about payloads, I'll have a look at them.
About the positions/offsets in .tvf, please tell me if I've understood
correctly:
The .tvd…
stop writing
scp index to another computer
play with it
scp indexModified to the server
mv indexModified indexCurrent
all done. mv is atomic.
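The same atomic-rename trick can be expressed from Java with `Files.move` and `ATOMIC_MOVE`. A sketch: the directory names follow the recipe above, and it assumes both paths are on one filesystem and the target name is free (e.g. the old index was moved aside first):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicSwap {
    // Rename indexModified over to indexCurrent in one atomic step, the
    // Java equivalent of the mv above. Assumes the target does not exist
    // and both paths live on the same filesystem.
    static void swap(Path modified, Path current) throws IOException {
        Files.move(modified, current, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Demo with throwaway directories standing in for the index dirs.
        Path tmp = Files.createTempDirectory("swapdemo");
        Path modified = Files.createDirectory(tmp.resolve("indexModified"));
        Files.write(modified.resolve("segments"), new byte[]{1});
        Path current = tmp.resolve("indexCurrent");
        swap(modified, current);
        System.out.println(Files.exists(current.resolve("segments"))); // prints true
    }
}
```

Readers holding the old directory open keep their files (on POSIX systems the inodes stay alive), which is what makes the swap safe while searches are running.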
Jens Grivolla wrote:
> Hi,
>
> I have a Lucene index with a few million entries, and I will need to
> add batches of a few hundred thousand or a few million…
Hi,
I have a Lucene index with a few million entries, and I will need to
add batches of a few hundred thousand or a few million additional
entries. Unfortunately, I absolutely need to have all indexed entries
available when inserting a new one, even within one batch, in order to
do some duplicate…
Hi Erickson,
thanks for your reply.
Of course you are right that it's a bit insane to mimic a database schema
with indices, but that's how it is. The primary index is already in use; the
extended requirements came later.
The index isn't really that big: the primary one has 2-3 MB of data. I don't…
Samuel LEMOINE wrote:
> I'm acutely interested in this issue too, as I'm working on a
> distributed architecture for Lucene. I'm only at the very beginning of
> my study, so I can't help you much, but Hadoop might fit
> your requirements. It's a sub-project of Lucene aiming to parallelize…
Chun Wei Ho wrote:
Hi,
We are currently running a Tomcat web application serving searches
over our Lucene index (10GB) on a single server machine (Dual 3GHz
CPU, 4GB RAM). Due to performance issues and to scale up to handle
more traffic/search requests, we are getting another server machine.
Server One handles the website.
Server Two is a light version of Tomcat which handles the Lucene search.
In front, a lighttpd uses server two for /search, and server one
for everything else.
You can add Lucene servers with round-robin in lighttpd with this scheme.
Be careful with fault tolerance and index…
I do have an off-the-wall question: why have two indexes? There
are, of course, good reasons, but they're things like size and speed.
Where I'm going here is that Lucene does NOT require that all
documents have the same fields. So it's perfectly reasonable to index
heterogeneous data (or differing…
Hi,
We are currently running a Tomcat web application serving searches
over our Lucene index (10GB) on a single server machine (Dual 3GHz
CPU, 4GB RAM). Due to performance issues and to scale up to handle
more traffic/search requests, we are getting another server machine.
We are looking at two…
Hi folks!
I know there is a MultiSearcher for searching over multiple indices, but my
requirement is a bit special.
I have two indices whose documents have a 1:m relationship. Most queries
will only use the primary index, but some will have to look for detailed
information in the secondary index (…
Hibernate Search (formerly known as Hibernate Lucene) is not designed
to use Lucene as the primary and only backend. It is designed to
complement a database.
I don't really like the idea actually: I'm much more comfortable with
having my data in a relational DB :)
So this product will not help…
Grant Ingersoll wrote:
On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote:
Thanks for the resources about payloads, I'll have a look at them.
About the positions/offsets in .tvf, please tell me if I've understood
correctly:
The .tvd provides the needed information concerning the occurrences
of…
Hello,
I would like it if you could help me find some documentation about how
to import the Lucene source into the Eclipse IDE. I'm a new user of this API,
and I would like to learn how to use it, as it seems powerful.
Thank you for your answers.
I would be very grateful if somebody has any tutorial…
On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote:
Thanks for the resources about payloads, I'll have a look at them.
About the positions/offsets in .tvf, please tell me if I've understood
correctly:
The .tvd provides the needed information concerning the
occurrences of each term in documents, a…
Grant Ingersoll wrote:
On Jun 27, 2007, at 8:51 AM, Samuel LEMOINE wrote:
Hi everyone!
I'm working on bibliographic research into Lucene as an intern at
Lingway (which uses Lucene in its main product), and I'm currently
studying Lucene's file system.
There are several things I don't…