[
https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1048:
-
Attachment: TIKA-1048.patch
Patch w/ fix ... I added an optional boolean to TextContentHan
+1...
Cheers,
Chris
On 12/20/12 4:23 AM, "Michael McCandless" wrote:
>Hi Oleg,
>
>UIMA could be useful for extracting text from XML (I'm not familiar
>enough with it...), but I think we should still fix Tika's own XML
>extraction.
>
>Mike McCandless
>
>http://blog.mikemccandless.com
>
>On Thu,
Hi Oleg,
UIMA could be useful for extracting text from XML (I'm not familiar
enough with it...), but I think we should still fix Tika's own XML
extraction.
Mike McCandless
http://blog.mikemccandless.com
On Thu, Dec 20, 2012 at 6:14 AM, Oleg Tikhonov wrote:
> Hi Mike,
>
> Maybe consider using
Hi Mike,
Maybe consider using UIMA ("the rule engine")?
BR,
Oleg
On Thu, Dec 20, 2012 at 1:05 PM, Michael McCandless (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> Michael McCandless update
[
https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1048:
-
Attachment: TIKA-1048.patch
Patch w/ failing test ... I'm not sure where/how to best fix t