[jira] [Commented] (TIKA-2208) Catch missing libraires

David Pilato (JIRA) Sun, 18 Dec 2016 06:17:32 -0800

    [ https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15758897#comment-15758897 ]


David Pilato commented on TIKA-2208:
------------------------------------

I added the missing libraries to my Gradle build:

{code}
  compile "com.github.virtuald:curvesapi:1.04"
  compile "com.bbn.poi.visio:ooxml-visio-schemas:2011.1"
{code}


This now causes a JAR hell issue, because the same class is available in two JARs:

{code}
Caused by: java.lang.IllegalStateException: jar hell!
class: com.microsoft.schemas.office.visio.x2012.main.CellType$Factory
jar1: /Users/dpilato/.gradle/caches/modules-2/files-2.1/org.apache.poi/poi-ooxml-schemas/3.15/de4a50ca39de48a19606b35644ecadb2f733c479/poi-ooxml-schemas-3.15.jar
jar2: /Users/dpilato/.gradle/caches/modules-2/files-2.1/com.bbn.poi.visio/ooxml-visio-schemas/2011.1/5c395aefc5c1a33f517c243843c909c1f4d6b3f0/ooxml-visio-schemas-2011.1.jar
{code}
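
For reference, the kind of check that produces this error can be sketched in a few lines of Java. This is only a minimal illustration, not the actual check that raised the exception above: {{DuplicateClassFinder}}, {{classNames}} and {{findDuplicates}} are hypothetical names, and the sketch only assumes the standard {{java.util.jar.JarFile}} API.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Tika or any other library.
final class DuplicateClassFinder {

    // Lists the fully qualified names of all classes packaged in one jar.
    static Set<String> classNames(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".class"))
                    .map(name -> name.substring(0, name.length() - ".class".length())
                                     .replace('/', '.'))
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    // Returns every class name that appears in two or more jars,
    // mapped to the jars that contain it (in the iteration order of the input).
    static Map<String, List<String>> findDuplicates(Map<String, Set<String>> classesByJar) {
        Map<String, List<String>> jarsByClass = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : classesByJar.entrySet()) {
            for (String className : entry.getValue()) {
                jarsByClass.computeIfAbsent(className, k -> new ArrayList<>())
                           .add(entry.getKey());
            }
        }
        // Keep only the classes that are provided by more than one jar.
        jarsByClass.values().removeIf(jars -> jars.size() < 2);
        return jarsByClass;
    }
}
```

Feeding the class names of the two jars listed above into {{findDuplicates}} would be expected to report {{CellType$Factory}} against both of them.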


> Catch missing libraires
> -----------------------
>
>                 Key: TIKA-2208
>                 URL: https://issues.apache.org/jira/browse/TIKA-2208
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: David Pilato
>
> Hi there
> We have decided to remove support for some formats when using Tika to extract 
> text and metadata.
> We defined our list of Parsers:
> {code:java}
>     private static final Parser[] PARSERS = new Parser[] {
>         // documents
>         new org.apache.tika.parser.html.HtmlParser(),
>         new org.apache.tika.parser.rtf.RTFParser(),
>         new org.apache.tika.parser.pdf.PDFParser(),
>         new org.apache.tika.parser.txt.TXTParser(),
>         new org.apache.tika.parser.microsoft.OfficeParser(),
>         new org.apache.tika.parser.microsoft.OldExcelParser(),
>         new org.apache.tika.parser.microsoft.ooxml.OOXMLParser(),
>         new org.apache.tika.parser.odf.OpenDocumentParser(),
>         new org.apache.tika.parser.iwork.IWorkPackageParser(),
>         new org.apache.tika.parser.xml.DcXMLParser(),
>         new org.apache.tika.parser.epub.EpubParser(),
>     };
>     private static final AutoDetectParser PARSER_INSTANCE = new AutoDetectParser(PARSERS);
>     private static final Tika TIKA_INSTANCE = new Tika(PARSER_INSTANCE.getDetector(), PARSER_INSTANCE);
> {code}
> But when an MS Office Word document embeds an unsupported document 
> (like a Visio schema), a {{NoClassDefFoundError}} is raised.
> Would it be possible to catch that case and throw a {{TikaException}} 
> instead, so that the failure surfaces as an Exception rather than an Error?
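
Until something like this exists in Tika itself, the requested behaviour can be approximated on the caller side. The following is a minimal sketch under stated assumptions, not Tika's implementation: {{ExtractionException}} is a hypothetical stand-in for {{TikaException}} so that the example compiles without Tika on the classpath, and {{SafeExtraction.call}} is a hypothetical wrapper around whatever extraction call is being made.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for org.apache.tika.exception.TikaException,
// used here only to keep the sketch self-contained.
class ExtractionException extends Exception {
    ExtractionException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical wrapper; the name and shape are assumptions for illustration.
final class SafeExtraction {

    // Runs the extraction and converts a missing-class Error into a checked
    // exception, so callers can handle it like any other parse failure.
    static <T> T call(Callable<T> extraction) throws ExtractionException {
        try {
            return extraction.call();
        } catch (NoClassDefFoundError e) {
            // Raised when a parser needs an optional library that is not on the classpath.
            throw new ExtractionException("Missing optional library: " + e.getMessage(), e);
        } catch (ExtractionException e) {
            // Already the expected type; pass it through unchanged.
            throw e;
        } catch (Exception e) {
            throw new ExtractionException("Extraction failed", e);
        }
    }
}
```

A caller would wrap the parse in the helper, for example {{SafeExtraction.call(() -> TIKA_INSTANCE.parseToString(stream))}}, and handle {{ExtractionException}} in one place. Only {{NoClassDefFoundError}} is converted; other Errors such as {{OutOfMemoryError}} are deliberately left to propagate.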



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
