Re: Can you boost multiple terms using brackets ?

2010-01-20 Thread Otis Gospodnetic
Yes, I believe it is the same. I bet the Explain explanation would help confirm this. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message > From: Paul Taylor > To: java-user@lucene.apache.org > Sent: Wed, January 20, 2010 1:03:14 PM > Subject: Can yo

Re: Lucene as a primary datastore

2010-01-20 Thread Otis Gospodnetic
Guido, No, you should absolutely not need to constantly rebuild the index. If you find you have to do that, you'll know you are doing something wrong. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message > From: Guido Bartolucci > To: java-user@lucen

Solr Analysis Webinar Jan 28, 2010

2010-01-20 Thread Jay Hill
My colleague at Lucid Imagination, Tom Hill, will be presenting a free webinar focused on analysis in Lucene/Solr. If you're interested, please sign up and join us. Here is the official notice: We'd like to invite you to a free webinar our company is offering next Thursday, 28 January, at 2PM Eas

Re: Lucene as a primary datastore

2010-01-20 Thread Erick Erickson
It depends (tm). From what I've seen on this list, *if* the index gets corrupted, you'll see some exceptions somewhere. They may be head-scratchers, but you'll get exceptions. But when I've seen this kind of thing reported, it's been because of coding errors. Manually unlocking the IndexWriter and

Re: Lucene as a primary datastore

2010-01-20 Thread Jacob Rhoden
In the same way that you should take regular exports/dumps of your MySQL databases, you could have the same strategy with Lucene. As long as you have code that can export your data that runs daily, and code that can rebuild your index from that data, in the event of a problem the most you will

Re: Lucene as a primary datastore

2010-01-20 Thread Guido Bartolucci
Thanks for the response. I understand all of what you wrote, but what I care about and what I had a little trouble describing exactly in my previous question is: - Are all problems with Lucene obvious (e.g., you get an exception and you know your data is now bad) or are there subtle corruptions th

Re: Lucene as a primary datastore

2010-01-20 Thread Chris Lu
I have 3 concerns about using Lucene as a primary database. 1) Lucene is stable when it's stable. But you will have Java exceptions. What would you do when FileNotFoundException or "Lucene 2.9.1 'read past EOF' IOException under system load" happens? For me, I don't think the data is safe this way. Or, y

Can you boost multiple terms using brackets ?

2010-01-20 Thread Paul Taylor
Hi, is title:(return panther)^3 alias:(return panther) the same as title:return^3 title:panther^3 alias:(return panther)? Thanks, Paul

Re: Lucene as a primary datastore

2010-01-20 Thread Karl Wettin
On 20 Jan 2010 at 04:58, Guido Bartolucci wrote: Am I just ignorant and scared of Lucene and too trusting of Oracle and MySQL? Since all your comparisons are with relational databases I feel obligated to say what has been said so many times on this list: Lucene is an index and not a relatio

AW: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Sertic Mirko, Bedag
Yeah! You made my day! I will post my new IndexDeletionPolicy, perhaps someone else can use it too! Regards Mirko -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, 20 January 2010 17:38 To: java-user@lucene.apache.org Subject: Re:

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Michael McCandless
You only have to create the deletion policy (merging uses it). Mike On Wed, Jan 20, 2010 at 11:27 AM, Sertic Mirko, Bedag wrote: > Ok, so does the merging go thru the IndexDeletionPolicy, or do I have to deal > with the MergePolicy to take care of merging? > > Regards > Mirko > > -Ursprüngl

AW: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Sertic Mirko, Bedag
Ok, so does the merging go thru the IndexDeletionPolicy, or do I have to deal with the MergePolicy to take care of merging? Regards Mirko -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, 20 January 2010 17:12 To: java-user@lucene.a

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Michael McCandless
Yes, normal merging will cause this problem as well. Generally you should always use IndexReader.reopen -- it gives much better reopen speed, less resources used, less GC, etc. Mike On Wed, Jan 20, 2010 at 10:49 AM, Sertic Mirko, Bedag wrote: > Mike > > Thank you so much for your feedback! > >

Re: TooManyClauses and maxClauseCount question

2010-01-20 Thread Benjamin Heilbrunn
Isn't maxClauseCount just a "best practice" limit to ensure that performance doesn't decrease silently if big queries occur? Performance and memory consumption should depend on how many clauses are really used / number of matching documents. I think that there is no (significant) difference in memor

AW: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Sertic Mirko, Bedag
Mike Thank you so much for your feedback! Will the new IndexDeletionPolicy also be considered when segments are merged? Does merging also affect the NFS problem? Should I use IndexReader.reopen() or just create a new IndexReader? Thanks in advance Mirko -Original Message- From:

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Michael McCandless
Sounds great! Mike On Wed, Jan 20, 2010 at 10:25 AM, Shai Erera wrote: > Ok, I haven't reached that part in LIA2 yet :-). > > This module is useful for a single node as well, when one IndexSearcher is > shared between several threads. The communication part is just an extension > of that case. >

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Shai Erera
Ok, I haven't reached that part in LIA2 yet :-). This module is useful for a single node as well, when one IndexSearcher is shared between several threads. The communication part is just an extension of that case. I'll review the SearcherManager in LIA2, and compare to our code. If it'll make sen

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Michael McCandless
I think this would be useful! The 2nd edition Lucene in Action sources also have something similar, a SearcherManager class that handles multiple threads doing searching while a reopen (normal or NRT) and warming is taking place. (NOTE: I'm one of the authors on Lucene in Action 2nd edition!). B

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Michael McCandless
Right, it's only machine A that needs the deletion policy. All read-only machines just reopen on their schedule (or you can use some communication means as Shai describes to have lower latency reopen after the writer commits). Also realize that doing searching over NFS does not usually give very g

TooManyClauses and maxClauseCount question

2010-01-20 Thread john smith
Hi, I'm getting a TooManyClauses exception while performing a wildcard query. I'm thinking about changing the max clause count limit (BooleanQuery.setMaxClauseCount() method). My question refers to memory consumption in case of increasing the maxClauseCount parameter. Does Lucene do it in a smart way (
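
For readers of this thread, here is a minimal sketch of why a wildcard query can hit the clause limit. It assumes the wildcard is rewritten into one clause per matching indexed term and that the limit defaults to 1024. The names `expand_wildcard` and `TooManyClauses` below are hypothetical stand-ins written in plain Python, not the Lucene API, and Python's `fnmatch` only approximates Lucene's `*` and `?` wildcards.

```python
import fnmatch


class TooManyClauses(Exception):
    """Hypothetical stand-in for the exception named in the message."""


def expand_wildcard(pattern, indexed_terms, max_clause_count=1024):
    """Expand a wildcard pattern into the list of matching terms.

    Assumption: each matching term becomes one clause, and the
    expansion fails once the clause count exceeds the limit.
    """
    # Only terms that actually match are collected; the limit itself
    # is just a threshold and allocates nothing in this sketch.
    matches = [t for t in indexed_terms if fnmatch.fnmatchcase(t, pattern)]
    if len(matches) > max_clause_count:
        raise TooManyClauses(
            "%d clauses exceed the limit of %d" % (len(matches), max_clause_count)
        )
    return matches
```

In this sketch, raising `max_clause_count` changes nothing for queries that match few terms; the cost follows the number of terms the pattern really expands to.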

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Shai Erera
We've worked around that problem by doing two things: 1) We notify all nodes in the cluster when the index has committed (we use JMS for that). 2) On each node there is a daemon which waits on this JMS queue, and once the index has committed it reopens an IR, w/o checking isCurrent(). I think that
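
The notify-then-reopen pattern described in this message can be sketched roughly as follows. This is an illustration only: a standard-library `queue.Queue` stands in for the JMS queue, and `Node`, `reopen` and `drain_commit_notifications` are hypothetical names, not part of Lucene or of the poster's code.

```python
import queue


class Node:
    """Hypothetical search node that tracks which commit it has open."""

    def __init__(self, name):
        self.name = name
        self.open_generation = 0

    def reopen(self, generation):
        # Stand-in for reopening the reader on the newly committed index.
        self.open_generation = generation


def drain_commit_notifications(node, notifications):
    """Reopen the node once per pending commit notification.

    The node does not poll the index to check whether it is current;
    it reacts only to messages sent after a commit.
    Returns the number of reopens performed.
    """
    reopened = 0
    while True:
        try:
            generation = notifications.get_nowait()
        except queue.Empty:
            break
        node.reopen(generation)
        reopened += 1
    return reopened
```

A real daemon would block on the queue in its own thread rather than drain it on demand; the non-blocking loop here just keeps the example short.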

AW: Lucene as a primary datastore

2010-01-20 Thread Sertic Mirko, Bedag
Hi, did you ever think about a content repository, like Jackrabbit or Alfresco? Alfresco also generates a Lucene index for documents stored in its repository. The content repository itself is backed by a database or a filesystem... Regards Mirko -Original Message- From: Darren Hart

RE: Lucene as a primary datastore

2010-01-20 Thread Darren Hartford
My two cents is no, not to use Lucene as a primary datastore. Although there are some datastores that look similar to Lucene and define themselves as primary datastores (the 'nosql' style datastores), I would put Lucene beside the likes of RRD and other specifically purposed information stores th

AW: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Sertic Mirko, Bedag
Hi Mike Thank you for your feedback! So I would need the following setup: a) Machine A with custom IndexDeletionPolicy and single IndexReader instance b) Machine B with custom IndexDeletionPolicy and single IndexReader instance c) Machine A and B periodically check if the index needs to be reope

Re: Lucene as a primary datastore

2010-01-20 Thread Erick Erickson
My preference is to put the effort into preserving the original source on the theory that I'm sure no information is lost that way. So the suitability of Lucene to store it varies depending upon the source IMO. If it's raw text, then storing all the raw text in an un-indexed field in Lucene might

Re: NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Michael McCandless
Right, you just need to make a custom IndexDeletionPolicy. NFS makes no effort to protect deletion of still-open files. A simple approach is one that only deletes a commit if it's more than XXX minutes/hours old, such that XXX is set higher than the frequency that IndexReaders are guaranteed to h
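
The expiration rule described in this message (delete a commit only once it is older than some threshold) can be sketched as plain selection logic. This is not Lucene's IndexDeletionPolicy interface: `Commit` and `commits_to_delete` are hypothetical names, and always keeping the newest commit is an added assumption so that the index never ends up without a commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    # Hypothetical stand-in for a commit point; not the Lucene API.
    name: str
    timestamp: float  # seconds since the epoch when the commit was made


def commits_to_delete(commits, now, max_age_seconds):
    """Return the commits that are old enough to delete.

    The newest commit is always kept, whatever its age.
    """
    if not commits:
        return []
    ordered = sorted(commits, key=lambda c: c.timestamp)
    older = ordered[:-1]  # everything except the newest commit
    return [c for c in older if now - c.timestamp > max_age_seconds]
```

For example, with commits made at 0, 600 and 1200 seconds and a threshold of 1000 seconds, a check at 1500 seconds selects only the first commit for deletion. The threshold plays the role of the XXX value in the message and has to be chosen with the readers' reopen schedule in mind.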

NFS, Stale File Handle Problem and my thoughts....

2010-01-20 Thread Sertic Mirko, Bedag
Hi all, We are using Lucene 2.4.1 on Debian Linux with 2 boxes. The index is stored on a common NFS share. Every box has a single IndexReader instance, and one box has an IndexWriter instance, adding new documents or deleting existing documents at a given point in time. After adding or deletin

Re: incremental document field update

2010-01-20 Thread Michael McCandless
On Tue, Jan 19, 2010 at 10:45 PM, Babak Farhang wrote: >> I see -- so your file format allows you to append to the same file >> without affecting prior readers?  We never do that in Lucene today >> (all files are "write once"). > > Yes. For the most part it only appends. The exception is when the

Re: Lucene as a primary datastore

2010-01-20 Thread Chris Harris
I don't do a lot of work with straight Lucene right now, but I do use Solr, and from time to time the Lucene index inside my master Solr server gets corrupted; in particular, some of the Lucene segment files that are still in use somehow get deleted, resulting in Lucene throwing FileNotFoundExcepti

Re: Lucene as a primary datastore

2010-01-20 Thread Király Péter
Hi, I have been using Lucene for the same purpose for years. I import XML files with records, and in Lucene there is a special field which stores the original XML (this is used for display with XSLT); the other fields are for searching. There is a web form where the users can modify the data. If us