Best implementation for address searching

2010-10-19 Thread Jasper de Barbanson
I'm currently building a Geocoder. The purpose of a Geocoder is to find the coordinates belonging to any given input address. I have a rather simple version based on Lucene working; however, I have a feeling it can be a lot better. Also, new functionality will be added, which is difficult

Re: how to index large number of files?

2010-10-19 Thread Johnbin Wang
You can start a fixedThreadPool to index all these files in a multithreaded way. Each thread executes an index task that indexes a part of the files. In the index task, when indexing 1 files, you need to call the indexWrite.commit() method to flush all the index add operations to disk fil
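
The approach described above (a fixed thread pool, each task indexing a slice of the files, with a commit at regular intervals) might be sketched as follows. This is a minimal illustration, not code from the thread: `DocSink`, `BatchedIndexer`, and the `commitEvery` parameter are hypothetical names, and `DocSink` is only a stand-in for the add and commit calls of a real index writer. It assumes the sink is safe to share between threads, which Lucene documents for `IndexWriter`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the two index-writer operations used here.
// A real adapter would wrap the writer's add and commit calls.
interface DocSink {
    void add(String path);
    void commit();
}

final class BatchedIndexer {
    // Splits the paths into one slice per thread, adds every path to the
    // sink, and commits after every commitEvery additions plus once at the end.
    static int indexAll(List<String> paths, DocSink sink, int threads, int commitEvery) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger added = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            int chunk = (paths.size() + threads - 1) / threads; // ceiling division
            for (int start = 0; start < paths.size(); start += chunk) {
                List<String> slice = paths.subList(start, Math.min(start + chunk, paths.size()));
                pending.add(pool.submit(() -> {
                    for (String path : slice) {
                        sink.add(path);
                        // The shared counter yields each value exactly once,
                        // so exactly one thread commits per batch boundary.
                        if (added.incrementAndGet() % commitEvery == 0) {
                            sink.commit();
                        }
                    }
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for the slice and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("an indexing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        sink.commit(); // flush whatever is left after the last full batch
        return added.get();
    }
}
```

Calling `BatchedIndexer.indexAll(paths, sink, 4, 10000)` would use four threads and commit roughly once per 10,000 files; the right interval depends on document size and available heap.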

Re: how to index large number of files?

2010-10-19 Thread Sahin Buyrukbilen
Thank you Johnbin, do you know which parameter I have to play with? On Wed, Oct 20, 2010 at 12:59 AM, Johnbin Wang wrote: > I think you can write index file once every 10,000 files or less have been > read. > > On Wed, Oct 20, 2010 at 12:11 PM, Sahin Buyrukbilen < > sahin.buyrukbi...@gmail.com> w

Re: how to index large number of files?

2010-10-19 Thread Johnbin Wang
I think you can write index file once every 10,000 files or less have been read. On Wed, Oct 20, 2010 at 12:11 PM, Sahin Buyrukbilen < sahin.buyrukbi...@gmail.com> wrote: > Hi all, > > I have to index about 4.5 million txt files. When I run my indexing > application through Eclipse, I get this

how to index large number of files?

2010-10-19 Thread Sahin Buyrukbilen
Hi all, I have to index about 4.5 million txt files. When I run my indexing application through Eclipse, I get this error : "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space" eclipse -vmargs -Xmx2000m -Xss8192k eclipse -vmargs -Xms40M -Xmx2G I tried running Eclipse wit
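
One thing worth checking with an error like the one above is which JVM the heap options actually reach. Options passed after `eclipse -vmargs` configure the JVM that runs the Eclipse IDE itself, while an application launched from Eclipse normally runs in a separate JVM whose options come from the launch configuration's VM arguments. A small check such as the following, run from inside the indexing application, reports the heap limit that is really in effect. `HeapCheck` is a hypothetical helper name, not something from the thread.

```java
final class HeapCheck {
    // Returns the maximum heap the current JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }
}
```

Printing `HeapCheck.maxHeapMiB()` at the start of the indexing run shows whether a setting such as `-Xmx2G` took effect for the application process.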

using ParallelReader to update a document

2010-10-19 Thread Nilesh Vijaywargiay
I am trying to find a workaround for updating fields and in turn the documents in the original index. I am using ParallelReader and providing it two indexes, the second index being the first to be seen by ParallelReader. The second index has the same number of documents as the first index [in this case,

Re: combining MultiFieldQueryParser with FuzzyQuery

2010-10-19 Thread Ian Lea
I don't think you can do that directly with MultiFieldQueryParser, but as Erick said in a similar thread a short while ago "You can create your own BooleanQuery and just add clauses as you need to". FuzzyQuery fq1 = new FuzzyQuery(whatever ...); FuzzyQuery fq2 = new FuzzyQuery(whatever-else ...);