+1, great solution, Jukka!

Cheers,
Chris
On Jun 21, 2012, at 8:08 AM, Jukka Zitting wrote:

> Hi,
>
> On Thu, Jun 21, 2012 at 4:35 AM, 122jxgcn <ywpar...@gmail.com> wrote:
>> Hi, I'm currently working on Tika to properly process a custom file type
>> (*.hwp files). I have a binary executable which converts an hwp file into
>> an xml file. I'm not sure how I can include this binary so that when Tika
>> encounters an hwp file, it can automatically convert it to an xml file
>> using the binary and pass the document to XMLParser.
>
> The best approach would be for you to write a custom Parser class for
> this file type. That class would call your executable to convert the
> file to XML and would then invoke the standard XMLParser on the
> result.
>
> BR,
>
> Jukka Zitting

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
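
For reference, the two steps Jukka describes above (run the external converter, then hand the resulting XML to a parser) could be sketched roughly as follows. This is only a minimal sketch under several assumptions: it uses plain JDK classes (ProcessBuilder and SAX) rather than Tika's own classes, the class name HwpToXmlSketch and its methods are hypothetical, and the converter is assumed to take an input path and an output path as its two arguments, which the original post does not state. In a real Tika setup this logic would presumably live inside the parse method of the custom Parser, with the converted stream passed to XMLParser instead of the bare SAX handler used here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper illustrating "convert with an external binary, then parse the XML".
class HwpToXmlSketch {

    // Assumed command-line shape: <converter> <input.hwp> <output.xml>.
    static List<String> buildCommand(String converter, Path input, Path output) {
        return List.of(converter, input.toString(), output.toString());
    }

    // Runs the converter and returns the path of the temporary XML file it wrote.
    static Path convert(String converter, Path input)
            throws IOException, InterruptedException {
        Path output = Files.createTempFile("hwp-", ".xml");
        Process process = new ProcessBuilder(buildCommand(converter, input, output))
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // avoid blocking on unread output
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            Files.deleteIfExists(output);
            throw new IOException("converter exited with code " + exitCode);
        }
        return output;
    }

    // Stand-in for the XML parsing step: collects all character data from the document.
    static String extractText(InputStream xml) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(xml, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: HwpToXmlSketch <converter> <file.hwp>");
            return;
        }
        Path xml = convert(args[0], Path.of(args[1]));
        try (InputStream in = Files.newInputStream(xml)) {
            System.out.println(extractText(in));
        } finally {
            Files.deleteIfExists(xml); // clean up the temporary XML file
        }
    }
}
```

Writing the converter output to a temporary file keeps the sketch independent of whether the binary can write to stdout; if it can, the process output stream could be fed to the parser directly instead.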