Hi Tim,

every now and then I toy with the idea of providing an EMF parser, like the WMF parser, to render images inside slideshows. Of course, this could be used to extract other content too. The simplest way would be to adapt the FreeHep library, but it's GPL licensed ... :(
So for extracting embedded content, I guess it's not so difficult to parse the EMF(+) records generically and handle only the interesting ones. This limited functionality should go into scratchpad or the example classes. If it is not a huge chunk of code, it could live in the Extractor class - otherwise I would like to see it in Tika ...

Andi
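P.S. To make the "generic parse, handle only the interesting ones" idea a bit more concrete, here is a rough sketch, not a proposal for the actual POI API. It assumes the record layout as I read it in MS-EMF: every record starts with a little-endian 4-byte type and a 4-byte size that includes this 8-byte header, EMR_COMMENT is type 70 and carries a 4-byte data size before its payload, and EMR_EOF is type 14. The class and method names (EmfRecordWalker, commentPayloads, isEmfPlus) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: walks EMF records and keeps only comment payloads.
final class EmfRecordWalker {
    static final int EMR_EOF = 14;           // assumed from MS-EMF
    static final int EMR_COMMENT = 70;       // assumed from MS-EMF
    static final int EMF_PLUS_ID = 0x2B464D45; // "EMF+" read as a little-endian int

    /** Returns the data payload of every EMR_COMMENT record, skipping all other records. */
    static List<byte[]> commentPayloads(byte[] emf) {
        ByteBuffer buf = ByteBuffer.wrap(emf).order(ByteOrder.LITTLE_ENDIAN);
        List<byte[]> out = new ArrayList<>();
        int pos = 0;
        while (pos + 8 <= emf.length) {
            int type = buf.getInt(pos);
            long size = buf.getInt(pos + 4) & 0xFFFFFFFFL; // unsigned record size
            if (size < 8 || size > emf.length - pos) {
                break; // malformed or truncated record, stop instead of guessing
            }
            if (type == EMR_COMMENT && size >= 12) {
                long dataSize = buf.getInt(pos + 8) & 0xFFFFFFFFL;
                if (dataSize <= size - 12) {
                    byte[] data = new byte[(int) dataSize];
                    System.arraycopy(emf, pos + 12, data, 0, data.length);
                    out.add(data);
                }
            }
            if (type == EMR_EOF) {
                break; // last record of the metafile
            }
            pos += (int) size; // jump over everything we do not care about
        }
        return out;
    }

    /** True if a comment payload starts with the EMF+ identifier. */
    static boolean isEmfPlus(byte[] payload) {
        return payload.length >= 4
            && ByteBuffer.wrap(payload).order(ByteOrder.LITTLE_ENDIAN).getInt(0) == EMF_PLUS_ID;
    }
}
```

The point is that unknown record types cost nothing: the size field is enough to skip them, so only the few record types that can carry embedded content need real handling. Payloads for which isEmfPlus returns true would then need a second, similar loop over the nested EMF+ records.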
