Re: Range queries in Lucene - numerical or lexicographical

2007-08-12 Thread Chris Hostetter
: Subject: Re: Range queries in Lucene - numerical or lexicographical : : Thanks. Probably this should be mentioned on the documentation page. It does say, right above the "date" example: "Sorting is done lexicographically." (Admittedly I'm not sure why the word "Sorting" is used in that sen
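
For reference, that lexicographic ordering is exactly why the page's "date" example uses fixed-width date strings. A minimal sketch, assuming a Lucene 2.x-era DateTools and an illustrative "modified" field:

import java.util.Date;
import org.apache.lucene.document.DateTools;

public class DateRangeDemo {
    public static void main(String[] args) {
        // Resolution.DAY produces fixed-width strings such as "20070812",
        // so lexicographic term order matches chronological order.
        String today = DateTools.dateToString(new Date(), DateTools.Resolution.DAY);
        System.out.println(today);

        // A QueryParser range over such terms, e.g. modified:[20020101 TO 20030101],
        // works precisely because the encoded strings sort lexicographically.
    }
}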

Index file size limitation of 2GB

2007-08-12 Thread rohit saini
Hi all, I have a bulk of data to be indexed, and the index file size may cross 2GB. The Lucene FAQ says there will be problems if an index file grows past 2GB, and it suggests making an index subdirectory in this case. I have tried to do so and made an index subdirectory in the main index directory wh
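
One approach in the spirit of the FAQ's suggestion (a sketch only; the sub-directory names and the 2.x-era API are assumptions) is to build several independent sub-indexes and search them together with MultiSearcher:

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Searchable;

public class SubIndexSearch {
    public static void main(String[] args) throws Exception {
        // Each sub-directory under the main index dir is a complete, independent index
        Searchable[] parts = new Searchable[] {
            new IndexSearcher("index/part-0"),
            new IndexSearcher("index/part-1"),
        };
        MultiSearcher searcher = new MultiSearcher(parts);
        // searcher.search(query) now spans all sub-indexes
        searcher.close();
    }
}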

Re: Range queries in Lucene - numerical or lexicographical

2007-08-12 Thread Mohammad Norouzi
Thanks Erick, but unfortunately NumberTools works only with the long primitive type. I am wondering why you didn't include methods for double and float. On 8/13/07, Nilesh Bansal <[EMAIL PROTECTED]> wrote: > > Thanks. Probably this should be mentioned on the documentation page. > > -Nilesh > > On 8/12
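
For what it's worth, a common workaround (not part of NumberTools itself; the helper class below is hypothetical) is to map a double onto a long whose signed ordering matches the double's numeric ordering, then reuse NumberTools.longToString:

import org.apache.lucene.document.NumberTools;

public class DoubleTools {
    /** Convert a double to a long that sorts in the same order. */
    public static long doubleToSortableLong(double val) {
        long bits = Double.doubleToLongBits(val);
        // Negative doubles have the sign bit set but reversed magnitude order;
        // flipping the remaining 63 bits fixes the ordering.
        if (bits < 0) bits ^= 0x7fffffffffffffffL;
        return bits;
    }

    public static String doubleToString(double val) {
        return NumberTools.longToString(doubleToSortableLong(val));
    }

    public static void main(String[] args) {
        // Encoded strings now sort the way the doubles do
        System.out.println(doubleToString(-2.5).compareTo(doubleToString(1.5)) < 0); // true
    }
}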

Re: Range queries in Lucene - numerical or lexicographical

2007-08-12 Thread Nilesh Bansal
Thanks. Probably this should be mentioned on the documentation page. -Nilesh On 8/12/07, Erick Erickson <[EMAIL PROTECTED]> wrote: > As has been discussed several times, Lucene is a string-only engine, and > has no native understanding of numerical values. You have to normalize > them for string

performance on filtering against thousands of different publications

2007-08-12 Thread Cedric Ho
Hi all, My problem is as follows: our documents each come from a different publication, and we currently have > 5000 different publication sources. Our clients can arbitrarily choose a subset of the publications when performing a search. It is not uncommon that a search will have to match hundr
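
One commonly suggested approach (a sketch under assumptions: the Lucene 2.x Filter API and a hypothetical "pub" field holding the publication id) is a custom Filter that builds a BitSet from the term postings and is cached across searches:

import java.io.IOException;
import java.util.BitSet;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;

public class PublicationFilter extends Filter {
    private final String[] pubIds;

    public PublicationFilter(String[] pubIds) {
        this.pubIds = pubIds;
    }

    public BitSet bits(IndexReader reader) throws IOException {
        BitSet result = new BitSet(reader.maxDoc());
        TermDocs td = reader.termDocs();
        for (int i = 0; i < pubIds.length; i++) {
            // Turn on the bit of every document belonging to this publication
            td.seek(new Term("pub", pubIds[i]));
            while (td.next()) {
                result.set(td.doc());
            }
        }
        td.close();
        return result;
    }
}
// Usage (hypothetical): wrap it so the BitSet is reused across searches
// Filter cached = new CachingWrapperFilter(new PublicationFilter(ids));
// searcher.search(query, cached);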

Nested concept fields

2007-08-12 Thread Jeff French
I'm trying to index concepts within a document and search them within the context of a multivalued field. I'm not even sure it's possible with the QueryParser or QsolParser syntax. Does anyone know whether it is possible? If not, is it conceptually possible using the Query API? What I'd like
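
One technique that addresses the multivalued-field part of this question (a sketch only; the "concept" field, gap size, and slop are assumptions) is to leave a large position-increment gap between values and use a SpanNearQuery whose slop stays below that gap, so a match cannot straddle two values:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class ConceptSearch {
    /** Analyzer that leaves a big positional gap between values of a multivalued field. */
    public static Analyzer gapAnalyzer() {
        return new StandardAnalyzer() {
            public int getPositionIncrementGap(String fieldName) {
                return 1000; // assumed gap; it just needs to exceed any slop used below
            }
        };
    }

    /** Require both concept terms to occur within the same field value. */
    public static SpanNearQuery sameValueQuery(String a, String b) {
        SpanQuery[] clauses = new SpanQuery[] {
            new SpanTermQuery(new Term("concept", a)),
            new SpanTermQuery(new Term("concept", b)),
        };
        return new SpanNearQuery(clauses, 100, false); // slop well below the gap
    }
}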

RE: High CPU usage duing index and search

2007-08-12 Thread Chew Yee Chuang
Hi testn, I have tested Filter; it is pretty fast, but it still takes a lot of CPU resources. Maybe that is due to the number of filters I run. Thank you eChuang, Chew -Original Message- From: testn [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 07, 2007 10:37 PM To: java-user@lucene.apac
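
If the filters are rebuilt on every search, most of the CPU likely goes into recomputing their BitSets. A small sketch (2.x-era API; the cache map and "category" field are assumptions) that keeps one CachingWrapperFilter per distinct criterion and reuses it across searches:

import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.TermQuery;

public class FilterCache {
    private final Map<String, Filter> cache = new HashMap<String, Filter>();

    public synchronized Filter forCategory(String category) {
        Filter f = cache.get(category);
        if (f == null) {
            // CachingWrapperFilter remembers the computed bits per IndexReader,
            // so reusing the same instance avoids recomputation on each search.
            f = new CachingWrapperFilter(
                    new QueryFilter(new TermQuery(new Term("category", category))));
            cache.put(category, f);
        }
        return f; // pass this to searcher.search(query, filter)
    }
}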

Re: Indexing correctly?

2007-08-12 Thread Erick Erickson
Where are your source files and index? If they're somewhere out there on the network, you may be having some slowdown because of network latency (the part about "/mount/." leads me to ask this one). If this is the case, you might get an improvement if all the files are local... Best Erick On

Re: Range queries in Lucene - numerical or lexicographical

2007-08-12 Thread Erick Erickson
As has been discussed several times, Lucene is a string-only engine, and has no native understanding of numerical values. You have to normalize them for string searches. See NumberTools. Best Erick On 8/11/07, Nilesh Bansal <[EMAIL PROTECTED]> wrote: > > Hi all, > > Lucene query parser syntax page
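
A minimal sketch of that normalization (2.x-era API; the "price" field is illustrative): encode longs with NumberTools at index time and build the range query from the same encoding, so the string comparison matches numeric order:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumberTools;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.RangeQuery;

public class NumericRange {
    /** Add a numeric value as an untokenized, sortable term. */
    public static void addPrice(Document doc, long price) {
        doc.add(new Field("price", NumberTools.longToString(price),
                Field.Store.YES, Field.Index.UN_TOKENIZED));
    }

    /** Build an inclusive range query over the encoded terms. */
    public static RangeQuery priceRange(long low, long high) {
        return new RangeQuery(
                new Term("price", NumberTools.longToString(low)),
                new Term("price", NumberTools.longToString(high)),
                true);
    }
}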

Re: Amount of RAM needed to support a growing lucene index?

2007-08-12 Thread eks dev
300k documents is something I would consider very small. Anything under 10 million documents IMHO is small for Lucene (meaning, on commodity hardware with 1G RAM you should get well under one-second response times). The number of words is not all that important; much more important is the number of uniqu

Re: Amount of RAM needed to support a growing lucene index?

2007-08-12 Thread karl wettin
On 12 Aug 2007, at 14:01, lucene user wrote: Do you know if 290k articles and 234 million words is a large lucene index or a medium one? Do people build them this big all the time? If the calculator in my head works, you have 300k documents at 4k of text each. I say your corpus is borderline sm

Re: Amount of RAM needed to support a growing lucene index?

2007-08-12 Thread lucene user
Thanks, Karl. Do you know if 290k articles and 234 million words is a large lucene index or a medium one? Do people build them this big all the time? Thanks! On 8/12/07, karl wettin <[EMAIL PROTECTED]> wrote: > > > On 12 Aug 2007, at 09:03, lucene user wrote: > > > If I have an index with 111k artic

Re: Amount of RAM needed to support a growing lucene index?

2007-08-12 Thread karl wettin
On 12 Aug 2007, at 09:03, lucene user wrote: If I have an index with 111k articles and 90 million words indexed, how much RAM should I have to get really fast access speeds? If I have an index with 290k articles and 234 million words indexed, how much RAM should I have to get really fast acce

Amount of RAM needed to support a growing lucene index?

2007-08-12 Thread lucene user
Hi, Folks - Two quick questions - need to size a server to run our new index. If I have an index with 111k articles and 90 million words indexed, how much RAM should I have to get really fast access speeds? If I have an index with 290k articles and 234 million words indexed, how much RAM should