RE: Analysis

Peter Kim Tue, 01 Nov 2005 08:33:27 -0800

Ok... just got confused because you mentioned XML. Unless you're
actually indexing the raw XML in some of your fields, the fact that
you're indexing XML documents as your source content is irrelevant to
your choice of Analyzer.


The choice of analyzer really depends on your specific project
requirements and on what level of querying functionality your client
needs. For example, I started off using the StandardAnalyzer because it
incorporates some very useful and sophisticated functionality. But I
found that it was removing many stop words that my client wanted to be
able to query on, so I will end up writing my own custom analyzer class,
based primarily on the StandardAnalyzer but with a modified stop word
list.
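
To make the stop word point concrete, here is a rough plain-Java sketch
of the idea. It deliberately does not use Lucene classes: the class name
StopWordSketch, the default stop list, and the tokenizing rule are all
made up for illustration and are much cruder than what the
StandardAnalyzer does. As far as I recall, the StandardAnalyzer also has
a constructor that accepts your own String[] of stop words, but check
the Javadoc for the version you are on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a stand-in for an analyzer with a configurable stop list.
class StopWordSketch {

    // Hypothetical default stop list (NOT Lucene's actual list).
    static final List<String> DEFAULT_STOP_WORDS =
            Arrays.asList("a", "an", "and", "the", "of", "to", "not", "will");

    // Start from the defaults and take out the words the client must be able to query.
    static Set<String> customStopWords(List<String> defaults, List<String> keep) {
        Set<String> stop = new HashSet<>(defaults);
        stop.removeAll(keep);
        return stop;
    }

    // Lowercase the text, split on anything that is not a letter or digit,
    // and drop every token found in the stop set.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> defaults = new HashSet<>(DEFAULT_STOP_WORDS);
        Set<String> custom = customStopWords(DEFAULT_STOP_WORDS, Arrays.asList("not", "will"));
        String text = "The client will not accept the terms";

        System.out.println(analyze(text, defaults)); // [client, accept, terms]
        System.out.println(analyze(text, custom));   // [client, will, not, accept, terms]
    }
}
```

With the default list, "will" and "not" never reach the index, so a
query on them cannot match. Trimming the list keeps them searchable.
Whatever list you settle on, use the same analyzer at query time as at
index time, otherwise the query terms will not line up with what was
indexed.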



> -----Original Message-----
> From: Malcolm [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, November 01, 2005 11:19 AM
> To: java-user@lucene.apache.org
> Subject: Re: Analysis
> 
> Hi,
> I'm just asking for opinions on Analyzers for the indexing. 
> For example Otis in his article uses the WhitespaceAnalyzer 
> and the Sandbox program uses the StandardAnalyzer. I am just 
> gauging opinions on the subject with regard to XML.
> I'm using a mix of the Sandbox XMLDocumentHandlerSAX and a 
> bit extra. I originally started using Digester but found that 
> I preferred the Sandbox implementation.
> Thanks,
> Malcolm Clark 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 
> 

