Excellent!
Exactly what I was looking for!
Thanks Grant!
-John
On Thu, Mar 13, 2008 at 5:39 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> There is an addDocument method that takes an Analyzer and overrides
> the one used at construction of the IndexWriter. See
>
> http://lucene.apache.org/j
There is an addDocument method that takes an Analyzer and overrides
the one used at construction of the IndexWriter. See
http://lucene.apache.org/java/2_3_1/api/core/org/apache/lucene/index/IndexWriter.html#addDocument(org.apache.lucene.document.Document,%20org.apache.lucene.analysis.Analyzer)
Hi Grant:
For our corpus, scoring does not rely much on idf,
so I don't expect that to be a problem.
About performance: instantiating one IndexWriter for a batch of, say, 1000
docs (i.e. iterating over the 1000 docs and calling addDocument), compared with
instantiating and clo
On Mar 13, 2008, at 11:03 AM, John Wang wrote:
Yes, but usually it's a good idea to add documents in batches, rather than
having to reinstantiate the writer for every document and then close it.
Why does what I suggested require instantiating a new writer for every
document? It uses the anal
Yes, but usually it's a good idea to add documents in batches, rather than having
to reinstantiate the writer for every document and then close it.
It would be nice if one could specify to the writer which analyzer to use.
PerFieldAnalyzerWrapper wouldn't work because different analyzers may apply to the
same
On IndexWriter, you can pass in the Analyzer when you add a Document,
so your application can identify the language, choose the analyzer
for the given doc, and then add the document with that analyzer.
See:
public void addDocument(Document doc, Analyzer analyzer)
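A rough sketch of that flow, in plain Java with no Lucene dependency. Every type here (NamedAnalyzer, AnalyzerFactory, BatchWriter) is a hypothetical stand-in invented for illustration, not a Lucene class. The only part taken from Lucene is the shape of the call, addDocument(doc, analyzer). It assumes each document carries its language in a "language" field and that one writer stays open for the whole batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only: none of these types are Lucene classes.
class PerDocumentAnalyzerSketch {

    /** Stand-in for an analyzer; only its name matters in this sketch. */
    static final class NamedAnalyzer {
        final String name;

        NamedAnalyzer(String name) {
            this.name = name;
        }
    }

    /** Stand-in for the AnalyzerFactory idea: maps a language code to an analyzer. */
    static final class AnalyzerFactory {
        private final Map<String, NamedAnalyzer> byLanguage = new HashMap<String, NamedAnalyzer>();
        private final NamedAnalyzer fallback;

        AnalyzerFactory(NamedAnalyzer fallback) {
            this.fallback = fallback;
        }

        void register(String language, NamedAnalyzer analyzer) {
            byLanguage.put(language, analyzer);
        }

        /** Returns the registered analyzer, or the fallback for unknown or missing languages. */
        NamedAnalyzer forLanguage(String language) {
            NamedAnalyzer analyzer = byLanguage.get(language);
            return analyzer != null ? analyzer : fallback;
        }
    }

    /** Stand-in for a writer that is created once and reused for the whole batch. */
    static final class BatchWriter {
        final List<String> log = new ArrayList<String>();

        // Mirrors the shape of addDocument(Document doc, Analyzer analyzer).
        void addDocument(Map<String, String> doc, NamedAnalyzer analyzer) {
            log.add(doc.get("id") + ":" + analyzer.name);
        }
    }

    /** Adds every document through one writer, choosing the analyzer per document. */
    static List<String> indexBatch(List<Map<String, String>> docs, AnalyzerFactory factory) {
        BatchWriter writer = new BatchWriter();
        for (Map<String, String> doc : docs) {
            writer.addDocument(doc, factory.forLanguage(doc.get("language")));
        }
        return writer.log;
    }
}
```

The point of the sketch is that the per-document choice lives in the application, so the writer does not need to be reopened when the language changes.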
On Mar 12, 2008, at 8:40 PM, John Wang wrote:
On Thursday 13 March 2008 15:21:19 Asgeir Frimannsson wrote:
> > I was hoping to have IndexWriter take an AnalyzerFactory, where the
> > AnalyzerFactory produces an Analyzer depending on some criteria of the
> > document, e.g. language.
> With PerFieldAnalyzerWrapper, you can specify which analyze
On Thu, Mar 13, 2008 at 10:40 AM, John Wang <[EMAIL PROTECTED]> wrote:
> Hi all:
>
> Maybe this has been asked before:
>
> I am building an index that consists of documents in multiple languages (the
> language is stored as a field), and I have different analyzers depending on the
> language of the document to be indexed. B