Re: Indexing XML document

2007-12-11 Thread Otis Gospodnetic

Liaqat,

Out of curiosity - what are you using to analyze and index Urdu? AraMorph or something else?

Thanks,
Otis

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
From: Liaqat Ali <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, Decembe

RE: Indexing XML document

2007-12-05 Thread Seneviratne_Yasoja

The example from Grant's earlier reply uses UTF-8: http://wiki.apache.org/lucene-java/IndexingOtherLanguages

I tried the Urdu text from your email: after first converting it to UTF-8, Lucene seemed to index and search it OK, and SAX parsed it without problems as well.

-Original Message-
From: Liaqat Ali [ma
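
The "convert it to UTF-8 first" step mentioned above could look roughly like the sketch below. The message does not say which encoding the original text was in, so the source charset is a parameter here, and the class name Utf8Transcoder is made up for illustration. If the XML file carries an encoding declaration, that declaration would also need to match the new encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: re-encodes raw bytes as UTF-8.
final class Utf8Transcoder {

    // Decode the bytes using the charset they were actually written in
    // (an assumption the caller must supply), then encode them as UTF-8.
    static byte[] toUtf8(byte[] input, Charset sourceCharset) {
        String decoded = new String(input, sourceCharset);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The word "Urdu" in Urdu script, written with Unicode escapes so the
        // source file itself does not depend on any particular encoding.
        String urdu = "\u0627\u0631\u062F\u0648";
        byte[] utf16 = urdu.getBytes(StandardCharsets.UTF_16LE);
        byte[] utf8 = toUtf8(utf16, StandardCharsets.UTF_16LE);
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(urdu));
    }
}
```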

Re: Indexing XML document

2007-12-04 Thread Grant Ingersoll

You are on the right path, just extract your content using SAX and then you can add Fields to Lucene for each document. As long as the values are strings, it should be the same as any indexing task. The key of course will be using an Analyzer that understands how to tokenize/stem Urdu.
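
A minimal sketch of the SAX extraction step described above, using only the JDK's built-in parser. The XML layout (a docs root containing doc elements with title and body children) and the class name XmlFieldExtractor are assumptions made for illustration; the thread does not show the actual file. Each collected name/value pair is what one would then add to a Lucene Document as a Field. That last step is left out because the Field constructors differ between Lucene versions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: turns each <doc> element into a map of field name -> text.
final class XmlFieldExtractor {

    static final class DocHandler extends DefaultHandler {
        final List<Map<String, String>> docs = new ArrayList<>();
        private Map<String, String> current;               // fields of the <doc> being read
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("doc".equals(qName)) {
                current = new LinkedHashMap<>();
            }
            text.setLength(0);                              // start collecting fresh text
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text node in several chunks, so append.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("doc".equals(qName)) {
                docs.add(current);                          // one finished document
                current = null;
            } else if (current != null) {
                // Child of <doc>: its text is a plain String, ready to become a field value.
                current.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    // Parses UTF-8 encoded XML and returns one map per <doc> element.
    static List<Map<String, String>> extract(byte[] utf8Xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DocHandler handler = new DocHandler();
        InputSource source = new InputSource(new ByteArrayInputStream(utf8Xml));
        source.setEncoding("UTF-8");
        parser.parse(source, handler);
        return handler.docs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<docs><doc><title>\u0627\u0631\u062F\u0648</title>"
                   + "<body>sample text</body></doc></docs>";
        for (Map<String, String> doc : extract(xml.getBytes(StandardCharsets.UTF_8))) {
            // In a real indexer, each entry would be added to a Lucene Document here.
            System.out.println(doc.keySet());
        }
    }
}
```

Whether the indexed Urdu is searchable in a useful way still depends on the Analyzer chosen at indexing and query time, which this sketch does not address.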
