I'm not sure I understand your question correctly, but I think you
want to pick your Analyzer based on what is in your content (e.g.
language, use of special symbols, etc.), not based on what the format
of your content is (i.e. XML).
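
To make that concrete, here is a rough sketch in plain Java of how the
same extracted text comes out under two tokenization strategies. The
class AnalyzerSketch and both method names are made up for this
example; they are simplified stand-ins and do not reproduce what
Lucene's WhitespaceAnalyzer or StandardAnalyzer actually emit. The
input is assumed to be text already pulled out of the XML by your SAX
handler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: not part of Lucene or the Sandbox.
class AnalyzerSketch {

    // Splits on whitespace only; case and punctuation are preserved.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Lowercases, then splits on any run of characters that are
    // neither letters nor digits, so punctuation is discarded.
    static List<String> normalizedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (String t : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample content with a part number and punctuation.
        String text = "Part-No. X200, see Lucene FAQ";
        System.out.println(whitespaceTokens(text));
        System.out.println(normalizedTokens(text));
    }
}
```

With the whitespace-only version, a query has to match case and
attached punctuation exactly, which may be what you want for things
like part numbers. The normalizing version is more forgiving for
ordinary prose. Either way, the choice follows from the content, and
the XML markup does not enter into it.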
Malcolm wrote:
Hi,
I'm just asking for opinions on Analyzers for indexing. For
example, Otis in his article uses the WhitespaceAnalyzer and the
Sandbox program uses the StandardAnalyzer. I am just gauging opinions
on the subject with regard to XML.
I'm using a mix of the Sandbox XMLDocumentHandlerSAX and a bit extra.
I originally started using Digester but found that I preferred the
Sandbox implementation.
Thanks,
Malcolm Clark
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
-------------------------------------------------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
337 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886