Re: Problem when try to make a bench of indexing (a dictionary with 120.000 words)

2008-04-10 Thread Toke Eskildsen
On Thu, 2008-04-10 at 15:42 -0300, Leandro wrote: > machine1: Windows XP SP2, Celeron 2.66GHz and 256MB If that is a physical machine (as opposed to virtual), then the amount of RAM is not at all well balanced against the processor speed. > [...] java.lang.OutOfMemoryError: Java heap space How muc

Re: Use of Lucene for DB Search

2008-04-10 Thread Shalin Shekhar Mangar
Hi Prashant, It would help to take a look at DataImportHandler (in development) in Solr. Solr is built to be used in web applications, and DataImportHandler is built to import data from databases. http://lucene.apache.org/solr/ http://wiki.apache.org/solr/DataImportHandler On Thu, Ap

about NullPointerException in DocumentsWriter$ThreadState.init(DocumentsWriter.java:751)

2008-04-10 Thread kai.hu
I got a problem yesterday: java.lang.NullPointerException at org.apache.lucene.index.DocumentsWriter$ThreadState.init(DocumentsWriter.java:751) at org.apache.lucene.index.DocumentsWriter.getThreadState(DocumentsWriter.java:2391) at org.apache.lucene.index.DocumentsWriter.updateDocument(Documen

RE: Use of Lucene for DB Search

2008-04-10 Thread John Griffin
Prashant, In addition to the other suggestions, take a look at Hibernate Search at http://www.hibernate.org/410.html. It is specifically designed for full text search in DBs. John G _ From: Prashant Saraf [mailto:[EMAIL PROTECTED] Sent: Thursday, April 10, 2008 7:57 AM To: j

Lucene index on relational data

2008-04-10 Thread Rajesh parab
Hi, We are using Lucene 2.0 to index data stored inside a relational database. Like any relational database, our database has quite a few one-to-one and one-to-many relationships. For example, let’s say an Object A has a one-to-many relationship with Object X and Object Y. As we need to de-normalize r

Re: How to improve performance of large numbers of successive searches?

2008-04-10 Thread Antony Bowesman
Chris McGee wrote: These tips have significantly improved the time to build the directory and search it. However, I have noticed that when I perform term queries using a searcher many times in rapid succession and iterate over all of the hits it can take a significant time. To perform 1000 te

Re: Problem when try to make a bench of indexing (a dictionary with 120.000 words)

2008-04-10 Thread Leandro
> > If the 16M means you're only giving the process that much memory, it > surprises > me that it runs at all. Especially since you're putting it all in a > RAMdir. > Sorry, that 16M is dictonarySizeInBytes(). I would imagine that it is the same size as the index... Well when I start to use a Dictonary

Re: Problem when try to make a bench of indexing (a dictionary with 120.000 words)

2008-04-10 Thread Erick Erickson
If the 16M means you're only giving the process that much memory, it surprises me that it runs at all. Especially since you're putting it all in a RAMdir. Or is that 16M referring to something else? Best Erick On Thu, Apr 10, 2008 at 2:42 PM, Leandro <[EMAIL PROTECTED]> wrote: > Hello, > > *Sam

Re: How to improve performance of large numbers of successive searches?

2008-04-10 Thread Erick Erickson
From this <<< iterate over all of the hits>>> I infer that you're using a Hits object. This is a no-no when getting more than 100 or so objects. In a nutshell, the query gets re-executed every 100 fetches. So your 2,000 hits are executing the query 20 times. The Hits object is optimized for retur

Problem when try to make a bench of indexing (a dictionary with 120.000 words)

2008-04-10 Thread Leandro
Hello, *Sample code:* SpellChecker spell; RAMDirectory dram = new RAMDirectory(); Dicionario dic = new Dicionario(); //one implementation of spell.Dictionary spell = new SpellChecker(dram); spell.indexDictionary(dic); //indexing... *Then I got the:* machine1: Windows XP SP2, Celeron 2.66GHz and 256M

How to improve performance of large numbers of successive searches?

2008-04-10 Thread Chris McGee
Hello, I am building fairly large directories (200-500 MB of disk space) using lucene-java. Sometimes it can take upwards of 10-15 mins to create the documents and write them to disk using my current configuration. I have upgraded to the latest 2.3.1 version and followed many of the recommenda

RE: Why Lucene has to rewrite queries prior to actual searching?

2008-04-10 Thread Chris Hostetter
: term from the query, this part of Lucene should be smarter to know how to : handle wildcards or even regex, so if "foo*" is received from the query, it : will start with retrieving the TermInfo for just "foo", and then will : continue and add up more and more TermInfo structures to its cache (or :

Re: WildCardQuery and TooManyClauses

2008-04-10 Thread Joe K
Donna, so this doesn't work, because search internally calls MultiTermQuery.rewrite, which throws the TooManyClauses exception anyway, even if maxnumhits is set to 200! So I am lost again... Chose On Thu, Apr 10, 2008 at 3:02 PM, Donna L Gresh <[EMAIL PROTECTED]> wrote: > Doesn't the following d

Re: Use of Lucene for DB Search

2008-04-10 Thread Chris Lu
Without changing your existing code, you can use the free version of DBSight to create a Lucene index on your database data and provide search on it. It'll take you less time to get it going than reading all the manuals or marketing materials. -- Chris Lu - Instant Scalable Full

RE: Use of Lucene for DB Search

2008-04-10 Thread Prashant Saraf
Thanks I will check this out Thanks and Regards प्रशांत सराफ (Prashant Saraf) SE-II Cross Country Infotech Ext : 72543 www.crosscountry.in -Original Message- From: Erick Erickson [mailto:[EMAIL PROTECTED] Sent: Thursday, April 10, 2008 7:35 PM To: java-user@lucene.apache.org Subje

Re: Use of Lucene for DB Search

2008-04-10 Thread Leandro
> Hi, > > We are planning to provide search functionality in a web-based > application. Can we use Lucene to search data from databases like > Oracle and MS-SQL? > Yes, you can. > > > > > Thanks and Regards >प्रशांत सराफ > (Prashant Saraf) > SE-II > Cross Country Inf

Re: Use of Lucene for DB Search

2008-04-10 Thread Erick Erickson
As always, "It depends". What is it you really want to do? Lucene excels at searching text, NOT at doing RDBMS sorts of operations. So if you have large amounts of text in columns in your DB, you can certainly put them in a Lucene index and search. But as soon as you find yourself trying to make Luc

Re: Use of Lucene for DB Search

2008-04-10 Thread Mathieu Lecarme
Have a look at Compass. M. Prashant Saraf wrote: Hi, We are planning to provide search functionality in a web-based application. Can we use Lucene to search data from databases like Oracle and MS-SQL? Thanks and Regards प्रशांत सराफ (Prashant Saraf) S

Use of Lucene for DB Search

2008-04-10 Thread Prashant Saraf
Hi, We are planning to provide search functionality in a web-based application. Can we use Lucene to search data from databases like Oracle and MS-SQL? Thanks and Regards प्रशांत सराफ (Prashant Saraf) SE-II Cross Country Infotech Ext : 72543 www.crosscountry.i

Re: WildCardQuery and TooManyClauses

2008-04-10 Thread Joe K
Hi Donna, thanks for the reply! I haven't tried it yet, but you are probably right that this should work for me. The filter parameter and the fact that TopDocs doesn't have a getter for scoreDocs were confusing to me. Thanks a lot, Chose On Thu, Apr 10, 2008 at 3:02 PM, Donna L Gresh <[EMAIL PRO

Re: WildCardQuery and TooManyClauses

2008-04-10 Thread Donna L Gresh
Doesn't the following do what you want with maxnumhits = 200? TopDocs td; td = indexSearcher.search(query, filter, maxnumhits); where filter can be null. Donna L. Gresh Services Research, Mathematical Sciences Department IBM T.J. Watson Research Center (914) 945-247

WildCardQuery and TooManyClauses

2008-04-10 Thread Joe K
Hello everybody, I know a ton has been written about this issue, but I'm still not clear enough about it. I have these facts: 1. my query is always 1 letter and *, e.g. M* 2. I always want to get max 200 results, no more! 3. I don't want to fix this issue by setting maxClauseCount I jus