date:20070628

Re: Can I delete without shuffling document IDs?

2007-06-28 Thread karl wettin

On 29 Jun 2007, at 05:08, Daniel Noll wrote: I just wanted to put the question out in case someone has solved the exact same problem already. I've posted some experiments in LUCENE-879. The patch replaces deleted documents with a new dummy document. The second patch contains some merge

Re: Lucene as primary object storage

2007-06-28 Thread karl wettin

On 28 Jun 2007, at 15:37, Emmanuel Bernard wrote: I don't really like the idea actually: I'm much more comfortable with having my data in a relational DB :) If you don't mind, please develop that a bit further. I think Lucene is suited pretty well for object storage if you also need it as an index.

Can I delete without shuffling document IDs?

2007-06-28 Thread Daniel Noll

Hi all. Is there currently any way to delete documents from the middle of a text index without a risk of the document IDs changing later? I'm aware that they probably won't change unless we optimise or unless the user adds more data, but unfortunately adding more data is now a potential occurr

Re: Adding Documents to index in a batch process

2007-06-28 Thread Kai Weber

* Erick Erickson <[EMAIL PROTECTED]>: > I guess I don't understand the problem. Can you build the documents > from within a loop or not? If you can, it's simple... > > open indexwriter > while (build a document) >write to index > > close/optimize. > > Or are you saying that you can't build f

Re: Rewrite one phrase to another in search query

2007-06-28 Thread Mark Miller

You might try my Query Parser, Qsol. http://myhardshadow.com/qsol.php There is a find/replace feature that will do what you want. FindReplace takes the find string, the replace string, boolean for case sensitive, boolean to indicate the replacement will act as an operator (allows for correct de

Re: Lucene as primary object storage

2007-06-28 Thread Otis Gospodnetic

Karl, you might want to have a look at Zoe (the email app from several years ago that uses Lucene as its storage). Also, there is DbDirectory for Lucene, which should have XA support. Andi will know. Otis

Re: Searching over multiple indexes with 1:m relationship

2007-06-28 Thread Erick Erickson

Chris is spot-on. Your data set is so small that I wouldn't worry about speed unless and until you have proof that it's a problem. The complexity you'll introduce by having multiple indexes just won't be worth it. In your case, following Chris's advice and de-normalizing the data would be the fir

Re: LUCENE on Eclipse

2007-06-28 Thread Chris Hostetter

When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less atten

Re: queryparser

2007-06-28 Thread Erik Hatcher

On Jun 28, 2007, at 1:29 PM, pratik shinghal wrote: I'm using Lucene (org.apache.lucene) and I want the Java code for parsing a single character string. My code is: QueryParser qp = new QueryParser("",analyser); String str = " track 9"; Query que = qp.parse(str); System.out.println(que);

Re: queryparser

2007-06-28 Thread Erick Erickson

What do you get if you do a System.out.println(que.toString())? And what analyzer are you using? Erick On 6/28/07, pratik shinghal <[EMAIL PROTECTED]> wrote: I'm using Lucene (org.apache.lucene) and I want the Java code for parsing a single character string. My code is: QueryParser qp = new

Re: Adding Documents to index in a batch process

2007-06-28 Thread Erick Erickson

I guess I don't understand the problem. Can you build the documents from within a loop or not? If you can, it's simple... open indexwriter while (build a document) write to index close/optimize. Or are you saying that you can't build from within a loop? Best Erick On 6/28/07, Kai Weber <[E
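The loop Erick describes (open the writer once, add every document, then optimize and close once) can be sketched as below. This is only an illustration of the control flow: `DocSink`, `BatchIndexer` and `RecordingSink` are hypothetical stand-ins invented for this sketch and are not Lucene's IndexWriter API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an index writer; NOT Lucene's IndexWriter API.
interface DocSink extends AutoCloseable {
    void add(String doc);
    void optimize();
    @Override
    void close();
}

class BatchIndexer {
    /**
     * Adds every document to an already-open sink, then optimizes and
     * closes the sink exactly once, after the loop has finished.
     */
    static int indexAll(Iterator<String> docs, DocSink sink) {
        int count = 0;
        try {
            while (docs.hasNext()) {
                sink.add(docs.next()); // "write to index"
                count++;
            }
            sink.optimize();           // once, not per document
        } finally {
            sink.close();              // always release the sink
        }
        return count;
    }
}

// Test double that records the order of calls it receives.
class RecordingSink implements DocSink {
    final List<String> events = new ArrayList<>();

    public void add(String doc) { events.add("add:" + doc); }
    public void optimize() { events.add("optimize"); }
    public void close() { events.add("close"); }
}
```

The point of the shape is that the expensive steps (opening, optimizing, closing) sit outside the loop, so their cost is paid once per batch rather than once per document.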

Re: inserting millions of entries

2007-06-28 Thread Erick Erickson

Yes, opening/closing will be very costly. But I *believe*, although I haven't tried it, that IndexModifier (2.1) will work for you. But do NOT take my word for it as I haven't tried to do what you're doing. But it should be easy to write a short test or two to prove that you can find recently-ins

Re: Luke faster + Index Searcher is slow

2007-06-28 Thread Chris Hostetter

: Are you opening the IndexSearcher every time you query? This is a : costly operation. just repeating the above line because it's important. also... : > The code i use is : > File indexFile = new File(fileName); : >FSDirectory dir = FSDirectory.getDirecto

Re: Searching over multiple indexes with 1:m relationship

2007-06-28 Thread Chris Lu

What you should do is denormalize the 1:m relationships. Don't try to mimic the database. If you need to, you can keep the original 2 indexes and create a third one. -- Chris Lu
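One possible reading of "denormalize the 1:m relationships" is to emit one flat document per detail row, each carrying a copy of the parent's fields. The sketch below shows that idea with plain maps; the class name `Denormalizer` and the `detail.` field prefix are assumptions made for illustration and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Denormalizer {
    /**
     * Flattens one parent record and its detail rows into flat documents:
     * one document per detail row, each repeating the parent's fields.
     * A parent without detail rows yields a single document of its own.
     */
    static List<Map<String, String>> flatten(Map<String, String> parent,
                                             List<Map<String, String>> details) {
        List<Map<String, String>> docs = new ArrayList<>();
        if (details.isEmpty()) {
            docs.add(new LinkedHashMap<>(parent));
            return docs;
        }
        for (Map<String, String> detail : details) {
            Map<String, String> doc = new LinkedHashMap<>(parent); // copy parent fields
            for (Map.Entry<String, String> e : detail.entrySet()) {
                // Prefix detail fields so they cannot clash with parent fields.
                doc.put("detail." + e.getKey(), e.getValue());
            }
            docs.add(doc);
        }
        return docs;
    }
}
```

The trade-off is duplicated parent data in exchange for being able to answer a query from a single index, which is a small cost for an index of a few megabytes.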

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Chris Lu

Basically you need to separate your web app from your searching, for a scalable solution. Searching is a different concern. You can develop more kinds of search when new requirements come in. Technorati's way is very similar to one of DBSight's configurations. One machine is dedicated to indexing,

queryparser

2007-06-28 Thread pratik shinghal

I'm using Lucene (org.apache.lucene) and I want the Java code for parsing a single character string. My code is: QueryParser qp = new QueryParser("",analyser); String str = " track 9"; Query que = qp.parse(str); System.out.println(que); and I want the answer as: track, 9 but I'm gett

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Grant Ingersoll

Hadoop is not designed for this type of scenario. Have a look at Solr (http://lucene.apache.org/solr), this is pretty much one of its main use cases. I think it will do what you need to do and will more than likely work with a minimum of configuration on your existing index (but don't hold

Re: Luke faster + Index Searcher is slow

2007-06-28 Thread Grant Ingersoll

Are you opening the IndexSearcher every time you query? This is a costly operation. -Grant On Jun 28, 2007, at 12:03 PM, Nott wrote: I have an index in one file that holds about 18 GB of data. When I run some queries in Luke the response comes back in < 40 ms, but the same when I use Inde
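Grant's point, that opening a searcher per query is costly, suggests opening it once and sharing it across queries. A minimal sketch of that pattern follows, assuming the expensive object can safely be shared between callers; `SharedResource` is a hypothetical helper written for this illustration and is not part of Lucene.

```java
import java.util.function.Supplier;

// Hypothetical helper: opens an expensive resource once and reuses it.
class SharedResource<T> {
    private final Supplier<T> opener;
    private T instance;
    private int opens;

    SharedResource(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the shared instance, opening it on first use only. */
    synchronized T get() {
        if (instance == null) {
            instance = opener.get(); // the costly step runs here, once
            opens++;
        }
        return instance;
    }

    /** Number of times the opener has actually been invoked. */
    synchronized int openCount() {
        return opens;
    }
}
```

In an application, the supplier would wrap whatever code currently opens the searcher inside the query method, and each query would call `get()` instead. The sketch does not cover reopening after the index changes.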

Adding Documents to index in a batch process

2007-06-28 Thread Kai Weber

Hello, In my application I have to add documents to the index as follows: 1. build the document to add from a repository 2. obtain an IndexWriter 3. add document to index 4. write and optimize index, close writer 5. goto 1 until no documents left I must work with legacy code which does the doc

Luke faster + Index Searcher is slow

2007-06-28 Thread Nott

I have an index in one file that holds about 18 GB of data. When I run some queries in Luke the response comes back in < 40 ms, but the same queries through IndexSearcher take 300-600 ms. Any suggestions? The code I use is: File indexFile = new File(fileName);

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Grant Ingersoll

On Jun 28, 2007, at 9:06 AM, Samuel LEMOINE wrote: Grant Ingersoll wrote: On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote: Thanks for the resources about payloads, I'll have a look over it. About the positions/offsets in .tvf, please tell me if I've understood correctly: The . (quote) Fi

Re: inserting millions of entries

2007-06-28 Thread Mathieu Lecarme

Stop writing; scp the index to another computer; play with it; scp indexModified to the server; mv indexModified indexCurrent. All done. mv is atomic. Jens Grivolla wrote: > Hi, > > I have a Lucene index with a few million entries, and I will need to > add batches of a few hundred thousand or a few mil
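The final `mv indexModified indexCurrent` step can be sketched in Java with `Files.move` and `StandardCopyOption.ATOMIC_MOVE`. The helper name `IndexSwap.publish` and the backup directory are assumptions added for illustration. Note that when a live directory already exists this sketch needs two renames, so each rename is atomic but there is a brief moment with no `current` directory; it also assumes all paths are on the same filesystem and that the backup path does not exist yet.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class IndexSwap {
    /**
     * Publishes the directory `modified` under the name `current`.
     * An existing `current` is first renamed to `backup` (assumed absent).
     * Throws AtomicMoveNotSupportedException if the filesystem cannot
     * perform the rename atomically, e.g. across filesystems.
     */
    static void publish(Path modified, Path current, Path backup) throws IOException {
        if (Files.exists(current)) {
            // Step 1: move the live index out of the way.
            Files.move(current, backup, StandardCopyOption.ATOMIC_MOVE);
        }
        // Step 2: rename the rebuilt index into place.
        Files.move(modified, current, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Searchers that already hold the old index open are not handled here; they would need to be reopened against the new directory.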

inserting millions of entries

2007-06-28 Thread Jens Grivolla

Hi, I have a Lucene index with a few million entries, and I will need to add batches of a few hundred thousand or a few million additional entries. Unfortunately, I absolutely need to have all indexed entries available when inserting a new one, even within one batch, in order to do some duplicat

AW: Searching over multiple indexes with 1:m relationship

2007-06-28 Thread Michael Böckling

Hi Erickson, thanks for your reply. Of course you are right that it's a bit insane to mimic a database schema with indices, but that's how it is. The primary index is already in use, the extended requirements came later. The index isn't really that big, the primary one has 2-3 MB of data, I don't

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Mathieu Lecarme

Samuel LEMOINE wrote: > I'm acutely interested in this issue too, as I'm working on > distributed architecture of Lucene. I'm only at the very beginning of > my study so I can't help you much, but Hadoop maybe could fit > your requirements. It's a sub-project of Lucene aiming to paralle

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Samuel LEMOINE

Chun Wei Ho wrote: Hi, We are currently running a Tomcat web application serving searches over our Lucene index (10GB) on a single server machine (Dual 3GHz CPU, 4GB RAM). Due to performance issues and to scale up to handle more traffic/search requests, we are getting another server machine.

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Mathieu Lecarme

Server One handles the website. Server Two is a light version of Tomcat which handles Lucene search. In front, a lighttpd which uses Server Two for /search, and Server One for everything else. You can add Lucene servers with round robin in lighttpd with this scheme. Careful with fault tolerance and index

Re: Searching over multiple indexes with 1:m relationship

2007-06-28 Thread Erick Erickson

I do have an off-the-wall question.. Why have two indexes? There are, of course, good reasons, but they're things like size and speed. Where I'm going here is that Lucene does NOT require that all documents have the same fields. So it's perfectly reasonable to index heterogeneous data (or differi

Scaling up to several machines with Lucene

2007-06-28 Thread Chun Wei Ho

Hi, We are currently running a Tomcat web application serving searches over our Lucene index (10GB) on a single server machine (Dual 3GHz CPU, 4GB RAM). Due to performance issues and to scale up to handle more traffic/search requests, we are getting another server machine. We are looking at two

Searching over multiple indexes with 1:m relationship

2007-06-28 Thread Michael Böckling

Hi folks! I know there is a MultiSearcher for searching over multiple indices, but my requirement is a bit special. I have two indices whose documents have a 1:m relationship. Most queries will only use the primary index, but some will have to look for detailed information in the secondary index (

Re: Lucene as primary object storage

2007-06-28 Thread Emmanuel Bernard

Hibernate Search (formerly known as Hibernate Lucene) is not designed to use Lucene as the primary and only backend. It is designed to complement a database. I don't really like the idea actually: I'm much more comfortable with having my data in a relational DB :) So this product will not help f

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Samuel LEMOINE

Grant Ingersoll wrote: On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote: Thanks for the resources about payloads, I'll have a look over it. About the positions/offsets in .tvf, please tell me if I've understood correctly: The .tvd provides the needed information concerning the occurrences of

LUCENE on Eclipse

2007-06-28 Thread spilirit

Hello, I would like your help finding some documentation about how to import the Lucene source into the Eclipse IDE. I'm a new user of this API, and I would like to learn how to use it as it seems powerful... Thank you for your answers. I would be very grateful if somebody has any tutorial

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Grant Ingersoll

On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote: Thanks for the resources about payloads, I'll have a look over it. About the positions/offsets in .tvf, please tell me if I've understood correctly: The .tvd provides the needed information concerning the occurrences of each term in documents, a

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Samuel LEMOINE

Grant Ingersoll wrote: On Jun 27, 2007, at 8:51 AM, Samuel LEMOINE wrote: Hi everyone! I'm doing bibliographical research on Lucene as an intern at Lingway (which uses Lucene in its main product), and I'm currently studying Lucene's file system. There are several things I don't c

35 matches
