
Re: indexing api wrt Analyzer

2008-03-13 Thread John Wang

Excellent! Exactly what I was looking for! Thanks Grant!

-John

On Thu, Mar 13, 2008 at 5:39 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> There is an addDocument method that takes an Analyzer and overrides
> the one used at construction of the IndexWriter. See
>
> http://lucene.apache.org/j

Re: indexing api wrt Analyzer

2008-03-13 Thread Grant Ingersoll

There is an addDocument method that takes an Analyzer and overrides the one used at construction of the IndexWriter. See http://lucene.apache.org/java/2_3_1/api/core/org/apache/lucene/index/IndexWriter.html#addDocument(org.apache.lucene.document.Document,%20org.apache.lucene.analysis.Analyzer)
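As a rough illustration of the approach described above, the application can keep a small lookup from language to analyzer and hand the result to addDocument(Document, Analyzer) for each document, while one IndexWriter stays open for the whole batch. The sketch below is only that lookup. AnalyzerChooser, register and forLanguage are hypothetical names, not part of Lucene, and the type parameter A stands in for org.apache.lucene.analysis.Analyzer so that the example compiles without Lucene on the classpath.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper (not part of Lucene): maps a language code to the
 * analyzer that should be used for documents in that language.
 * A is generic so this sketch needs no Lucene jar; in real use A would be
 * org.apache.lucene.analysis.Analyzer.
 */
class AnalyzerChooser<A> {
    private final Map<String, A> byLanguage = new HashMap<String, A>();
    private final A fallback;

    /** fallback is returned for unknown or missing language codes. */
    AnalyzerChooser(A fallback) {
        if (fallback == null) {
            throw new IllegalArgumentException("fallback must not be null");
        }
        this.fallback = fallback;
    }

    /** Associates a language code (case-insensitive, trimmed) with an analyzer. */
    AnalyzerChooser<A> register(String language, A analyzer) {
        byLanguage.put(normalize(language), analyzer);
        return this;
    }

    /** Returns the analyzer registered for the language, or the fallback. */
    A forLanguage(String language) {
        if (language == null) {
            return fallback;
        }
        A analyzer = byLanguage.get(normalize(language));
        return analyzer != null ? analyzer : fallback;
    }

    private static String normalize(String language) {
        return language.trim().toLowerCase(Locale.ROOT);
    }

    // Intended use with Lucene (assumes writer, doc and lang already exist):
    //   writer.addDocument(doc, chooser.forLanguage(lang));
}
```

With a lookup like this, the per-document choice happens in application code, so the writer does not have to be closed and reopened when the language changes.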

Re: indexing api wrt Analyzer

2008-03-13 Thread John Wang

Hi Grant:

For our corpus, we don't rely much on idf in the scoring calculation, so I don't see that being much of a problem. As for performance: instantiating one IndexWriter for a batch of, say, 1000 docs (i.e. iterating over the 1000 docs and calling addDocument), compared with instantiating and clo

Re: indexing api wrt Analyzer

2008-03-13 Thread Grant Ingersoll

On Mar 13, 2008, at 11:03 AM, John Wang wrote:
> Yes, but usually it's a good idea to add documents in batch and not having to reinstantiate the writer for every document and then closing it. It would be nice if one can specify to the writer which analyzer to use. PerfieldAnalyzer wouldn't

Re: indexing api wrt Analyzer

2008-03-13 Thread Grant Ingersoll

On Mar 13, 2008, at 11:03 AM, John Wang wrote:
> Yes, but usually it's a good idea to add documents in batch and not having to reinstantiate the writer for every document and then closing it.

Why does what I suggested require instantiating a new writer for every document? It uses the anal

Re: indexing api wrt Analyzer

2008-03-13 Thread John Wang

Yes, but usually it's a good idea to add documents in batch and not having to reinstantiate the writer for every document and then closing it. It would be nice if one can specify to the writer which analyzer to use. PerfieldAnalyzer wouldn't work because different analyzers may apply on the same

Re: indexing api wrt Analyzer

2008-03-13 Thread Grant Ingersoll

On IndexWriter, you can pass in the Analyzer when you add a Document, so your application can identify the language, choose the analyzer for the given doc, and then add the document. See:

public void addDocument(Document doc, Analyzer analyzer)

On Mar 12, 2008, at 8:40 PM, John Wang wrote:

Re: indexing api wrt Analyzer

2008-03-12 Thread Daniel Noll

On Thursday 13 March 2008 15:21:19 Asgeir Frimannsson wrote:
> >I was hoping to have IndexWriter take an AnalyzerFactory, where the
> > AnalyzerFactory produces Analyzer depending on some criteria of the
> > document, e.g. language.
> With PerFieldAnalyzerWrapper, you can specify which analyze

Re: indexing api wrt Analyzer

2008-03-12 Thread Asgeir Frimannsson

On Thu, Mar 13, 2008 at 10:40 AM, John Wang <[EMAIL PROTECTED]> wrote:
> Hi all:
>
> Maybe this has been asked before:
>
> I am building an index consists of multiple languages, (stored as a
> field), and I have different analyzers depending on the language of the
> language to be indexed. B

