Re: NO_NORMS and TOKENIZED?

2007-02-20 Thread Chris Hostetter
: > en if i wanted to be able to use : > an option on field foo for some docs and not on others i'd have to : > have : > foo_optOFF and foo_optON ... then anytime i wanted to search on : > "foo" i'd : > have to use a booleanquery without a coord factor across both. : : I'm trying to think of an ex

Re: Using Lucene - Design Question

2007-02-20 Thread Venkat Seeth
You could also use Hadoop RPC or ICE (www.zeroc.com) I'm on that path now. --Venkat --- shai deljo <[EMAIL PROTECTED]> wrote: > I considered getting Lucene in action but figured > I'll wait for the > DVD to come out ;). > Seriously though, they write about RemoteSearchable > and use RMI, Is >

Re: Using Lucene - Design Question

2007-02-20 Thread orion
If you'd like to try using Terracotta, we (Terracotta) would be glad to help you out. If you want more info, you can email me directly (orion at terracotta.org) or you can use our web forums (http://forums.terracotta.org) or our user mailing list (http://lists.terracotta.org/) Cheers, Orion s

Re: Using Lucene - Design Question

2007-02-20 Thread shai deljo
I considered getting Lucene in action but figured I'll wait for the DVD to come out ;). Seriously though, they write about RemoteSearchable and use RMI, Is this the recommended solution? does it scale well? Thanks On 2/20/07, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: Well, there is also a Rem

Re: Running Lucene as a stateless session bean

2007-02-20 Thread Doron Cohen
Is that perhaps Lucene 1.4.3? (Current release is 2.1.0, I am not aware of 1.3, such old version is not even in the releases archives). The static parse() was deprecated at 1.9 and removed at 2.0, so it must be Lucene 1.9 or older. Anyhow, at least from Lucene point of view (I am not familiar wit

Re: Running Lucene as a stateless session bean

2007-02-20 Thread markharw00d
Be careful with your use of GATE and multiple threads. I recently had some trouble with their Factory.delete.. methods which ended up requiring a change to the core and this was applied to the 4.0 trunk. A 3.1 patch has not been released so you'll need to be using the latest from SVN (now requi

Re: Using Lucene - Design Question

2007-02-20 Thread Otis Gospodnetic
Well, there is also a Remote cousin there. That will let you distribute your indices over N servers (sounds like you'll need multiple). You should really take a stroll through Lucene's javadoc, it's incredibly nice now in winter time. Or ... clears throat you could get a book ;) Otis . .

Running Lucene as a stateless session bean

2007-02-20 Thread Walker, Keith 1
I'm using an EJB to process documents using Lucene 1.3. Things are working fine now, but I wanted to double check that this will work with multiple instances of the EJB. I know this is not conforming to the EJB spec concerning file I/O, but ignoring that for now, my question is about thread safe

Re: Using Lucene - Design Question

2007-02-20 Thread shai deljo
Hi, Thanks for the reply. * Regarding hardware I'll use something similar to: Core 2 Duo - 2.66GHz, 2x300 GB disk drives, 4 GB RAM running on one of the Linux distributions. * Regarding response time I'm looking to be ~300 milliseconds for at least 80% of queries and ~500 milliseconds for 95% of q

Re: Search for a term in all fields

2007-02-20 Thread Chris Hostetter
The information Erick gave you when you asked this question yesterday is all very accurate -- the one addition i would make is that you don't need SpanNear queries to take advantage of positionIncrementGap -- PhraseQueries do that too. Consolidating your fields into a single "all" field, or constr

Re: Neuter fields in query

2007-02-20 Thread Mark Miller
Thanks for the info Erick. Unfortunately, I need to be able to do this with any old Query that is supplied to me...no access to the 'raw' query :( I also need the process to be relatively cheap which is why I don't want to deconstruct and reconstruct. - Mark Erick Erickson wrote: Goll, Mar

Re: Neuter fields in query

2007-02-20 Thread Erick Erickson
Goll, Mark, it sounds like we're working on very similar projects . I solved something similar by storing the relevant parts of the original query away ahead of time in a format suitable for using with the MemoryIndex. It was lots easier to handle that way in my app, since I got in unfielded d

Re: Lucene 2.1 and Luke

2007-02-20 Thread Erick Erickson
Thanks all, this gets me "far enough" for a while. And Andrzej, I can't thank you enough for writing and maintaining this tool, it's saved me endless grief over the past year! Best Erick On 2/20/07, Koji Sekiguchi <[EMAIL PROTECTED]> wrote: I thought this is a solution... http://issues.apache

Re: Lucene 2.1 and Luke

2007-02-20 Thread Mark Miller
This method gets you a ways, but some stuff still does not seem to work. Document reconstruction for one, at least for me. Arthur Smith wrote: Luke works with Lucene 2.1, but you have to use the new lucene jar explicitly - http://www.getopt.org/luke/ |java -classpath luke.jar;lucene.jar org.g

Neuter fields in query

2007-02-20 Thread Mark Miller
I need to make a query specified with fields to be field care free when used against a memory index (every part of the query will match all fields). I am not holding my breath, but any ideas? Is the only way to deconstruct and reconstruct the query? - Mark

RE: Lucene 2.1 and Luke

2007-02-20 Thread Koji Sekiguchi
I thought this is a solution... http://issues.apache.org/jira/browse/SOLR-88 Koji > -Original Message- > From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] > Sent: Tuesday, February 20, 2007 11:13 PM > To: java-user@lucene.apache.org > Subject: Re: Lucene 2.1 and Luke > > > Erick Erickson

Re: Lucene 2.1 and Luke

2007-02-20 Thread Arthur Smith
Luke works with Lucene 2.1, but you have to use the new lucene jar explicitly - http://www.getopt.org/luke/ |java -classpath luke.jar;lucene.jar org.getopt.luke.Luke |somebody will probably put together a 2.1 package soon. Arthur Erick Erickson wrote: Is there a Lucene 2.1 compatible v

Re: Lucene 2.1 and Luke

2007-02-20 Thread Andrzej Bialecki
Erick Erickson wrote: Is there a Lucene 2.1 compatible version of Luke hanging around that I've overlooked or do I need to compile it myself? The change notes for Lucene 2.1 indicate index format changes, and, sure enough, when I tried to open a new index with my current version of Luke, it do

Lucene 2.1 and Luke

2007-02-20 Thread Erick Erickson
Is there a Lucene 2.1 compatible version of Luke hanging around that I've overlooked or do I need to compile it myself? The change notes for Lucene 2.1 indicate index format changes, and, sure enough, when I tried to open a new index with my current version of Luke, it doesn't work (something abo

Re: ranking / scoring by field which contains a given rank?

2007-02-20 Thread Erick Erickson
I'm puzzled why you don't index a salesrank when you build your index and use Lucene's built-in sorting to sort them at query time. This probably means that I didn't read your e-mail carefully enough, but... If salesrank is something that you can pre-calculate and put in your index, this should fi

Re: Search in all fields

2007-02-20 Thread Erick Erickson
Yes, you've got it right. But this is the classic trade-off, space for speed, and what you choose to do reflects your particular situation. Doubling the index size isn't to be done lightly, but how big is your index? If it's 500M, doubling it to 1G isn't a problem, Lucene handles this quite nicely. I'm

ranking / scoring by field which contains a given rank?

2007-02-20 Thread Dennis Berger
Hi List, is it possible to sort or rank by a specific field which contains only integer numbers? I have a few million products with a specific salesrank. If somebody searches "palm", he will get thousands of items, pocket adapter, pens everything else but not the best-selling items. Which are no

Re: Search for a term in all fields

2007-02-20 Thread karl wettin
20 feb 2007 kl. 13.29 skrev Kainth, Sachin: How do I search for a term in all fields of a document? You create a boolean query with a term query for each field available in the document you are searching for.
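A minimal sketch of the same idea expressed in query-parser syntax rather than through the query classes: one `field:term` clause per field, separated by spaces, which the standard query parser treats as optional (OR) clauses under its default operator. The class name `AllFieldsQuery`, the method `build`, and the field names are hypothetical, chosen only for illustration, and the sketch assumes the term contains no characters that need escaping in the query syntax.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Lucene: it only builds a query string.
public class AllFieldsQuery {

    // Produces "field1:term field2:term ...", one clause per field.
    // Assumption: the term needs no escaping for the query parser.
    static String build(List<String> fields, String term) {
        return fields.stream()
                .map(field -> field + ":" + term)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Field names here are made up for the example.
        List<String> fields = Arrays.asList("title", "body", "author");
        System.out.println(build(fields, "palm"));
        // prints: title:palm body:palm author:palm
    }
}
```

The resulting string would then be handed to a query parser. Building the BooleanQuery directly, with one TermQuery per field as described above, avoids the escaping question entirely.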

Re: Using Lucene - Design Question

2007-02-20 Thread Otis Gospodnetic
Hi Shai, Nobody will be able to give you the precise answer, obviously. The best way is to try. You didn't say what response time is desirable nor what kind of hardware you will be using. I wouldn't bother with the Berkeley DB-backed Lucene index for now, just use the regular one (maybe use no

Re: 2.1 lock file name

2007-02-20 Thread Michael McCandless
On Tue, 20 Feb 2007 10:36:55 +0100, "jm" <[EMAIL PROTECTED]> said: > I updated my code to use 2.1 (IndexWriter deleting docs etc), and when > using native locks I still get a lock like this: > lucene-2361bf484af61abc81e6e7f412ad43af-n-write.lock > and when using SimpleFSLockFactory: > lucene-2361

Using Lucene - Design Question

2007-02-20 Thread shai deljo
Hi, I have no experience with Lucene and I'm trying to collect some information in order to determine what solution is best for me. I need to index ~50M documents (starting with 10M), the size of each document is ~2k-~5k and I'll index a couple of fields per document. I expect ~20 queries per seco

RE: Search in all fields

2007-02-20 Thread Kainth, Sachin
Hi Erick, I'm not sure I fully understand this. Would I be right in saying that with this solution I would still need to create an "all" field which contains duplicates of data already indexed. Because if this is so then we still have the problem of doubling the index size. -Original Messa

2.1 lock file name

2007-02-20 Thread jm
Hi, I updated my code to use 2.1 (IndexWriter deleting docs etc), and when using native locks I still get a lock like this: lucene-2361bf484af61abc81e6e7f412ad43af-n-write.lock and when using SimpleFSLockFactory: lucene-2361bf484af61abc81e6e7f412ad43af-write.lock From the changes.txt: 9. LUCEN