You don't need a Lucene parser (there is no such thing). Instead, use a Java
XML parser (such as dom4j). I personally prefer DOM, because it lets you use
XPath to extract exactly what you need. SAX is an alternative to DOM. However,
SAX is not a W3C recommendation, and it lacks many of the extraction methods
available in DOM.
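
To illustrate the DOM plus XPath route, here is a minimal sketch that uses only the JDK's built-in javax.xml classes rather than dom4j. The class name XPathExtractor, the method extract, and the sample XML are made up for the example; the strings it returns are what you would then add as field values when you build your Lucene index.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the text of every node matched by an XPath expression.
class XPathExtractor {
    static List<String> extract(String xml, String expression) throws Exception {
        // Parse the XML string into an in-memory DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Evaluate the XPath expression against the tree and collect the matches.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent().trim());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document, only for illustration.
        String xml = "<doc><title>Hello</title><body>World of XML</body></doc>";
        System.out.println(extract(xml, "/doc/title"));
    }
}
```

For files on disk you would pass a java.io.File to DocumentBuilder.parse instead of wrapping a string.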
Hi Karthik,
Sounds like you know what you have to do. The only problem I saw with your
statement is the part about parsing it with Lucene. You can read the files
from disk (basic I/O) and use a SAX parser to extract the information you want
to search against and then build your index from that informati
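
A rough sketch of the SAX extraction step, assuming you simply want all of the character data in a file as one searchable string. It uses only the JDK's SAX classes; the class name SaxTextExtractor and the method extract are hypothetical, and the Lucene indexing itself is left out.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers all character data seen while parsing.
class SaxTextExtractor extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so append each chunk.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Separate the text of adjacent elements with a space.
        text.append(' ');
    }

    static String extract(String xml) throws Exception {
        SaxTextExtractor handler = new SaxTextExtractor();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        // Collapse runs of whitespace into single spaces.
        return handler.text.toString().trim().replaceAll("\\s+", " ");
    }
}
```

To read from disk, SAXParser.parse also accepts a java.io.File. The returned string is the text you would then put into your index.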
Hello,
I have a list of XML files in a directory. I have to parse these XML files
using Apache Lucene and index them. Once indexing is done, I want to be able
to search the text inside the XML files. How can I achieve this? I am able to search
text files in a similar way, can someone help me with xml lucene sea