[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch > Upgrade to PDFBox 2.0.0 when available > -

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: (was: TIKA-1285.patch) > Upgrade to PDFBox 2.0.0 when available > --

[jira] [Created] (TIKA-1286) Adding MS Visio VSDX to mime-types detection

2014-04-29 Thread Pascal Essiembre (JIRA)
Pascal Essiembre created TIKA-1286: -- Summary: Adding MS Visio VSDX to mime-types detection Key: TIKA-1286 URL: https://issues.apache.org/jira/browse/TIKA-1286 Project: Tika Issue Type: Impro

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Attachment: TIKA-1285.patch AdobeFontMetricParser (small change) PDF2XHTML (removal of xobject

[jira] [Comment Edited] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985028#comment-13985028 ] Jeremy Anderson edited comment on TIKA-1285 at 4/30/14 1:15 AM: -

[jira] [Updated] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-1285: -- Description: This issue is to track fixes required when upgrading the PDFBox dependency to 2.0.

[jira] [Created] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2014-04-29 Thread Jeremy Anderson (JIRA)
Jeremy Anderson created TIKA-1285: - Summary: Upgrade to PDFBox 2.0.0 when available Key: TIKA-1285 URL: https://issues.apache.org/jira/browse/TIKA-1285 Project: Tika Issue Type: Improvement

[jira] [Comment Edited] (TIKA-1268) Extract images from PDF documents

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984984#comment-13984984 ] Jeremy Anderson edited comment on TIKA-1268 at 4/29/14 11:59 PM:

[jira] [Commented] (TIKA-1268) Extract images from PDF documents

2014-04-29 Thread Jeremy Anderson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984984#comment-13984984 ] Jeremy Anderson commented on TIKA-1268: --- This fix will break when PDFBox 2.0.0 is rel

buildbot failure in ASF Buildbot on tika-trunk

2014-04-29 Thread buildbot
The Buildbot has detected a new failure on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/2 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: portunus_ubuntu Build Reason: scheduler Build Source Stamp:

[jira] [Commented] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-29 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984630#comment-13984630 ] Oleg Tikhonov commented on TIKA-1276: - Rupert, Acceptable and valuable points, indeed.

[jira] [Commented] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-29 Thread Rupert Westenthaler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984580#comment-13984580 ] Rupert Westenthaler commented on TIKA-1276: --- [~olegt] for sure we could also embe

[jira] [Commented] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-29 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984393#comment-13984393 ] Oleg Tikhonov commented on TIKA-1276: - Applied the patch ... Could compile and build al

Re: [jira] [Commented] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-29 Thread Oleg Tikhonov
No problem. Will test it. On Tue, Apr 29, 2014 at 3:43 PM, Rupert Westenthaler (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984251#comment-13984251] > > Rupert Westenthaler comm

[jira] [Commented] (TIKA-1284) TikaException for Microsoft Powerpoint Document [ ppt ]

2014-04-29 Thread Chetan Laddha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984276#comment-13984276 ] Chetan Laddha commented on TIKA-1284: - Thanks Nick. I have reported it on POI: https:

[jira] [Commented] (TIKA-1284) TikaException for Microsoft Powerpoint Document [ ppt ]

2014-04-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984263#comment-13984263 ] Nick Burch commented on TIKA-1284: -- This will need to be reported upstream to the Apache P

[jira] [Updated] (TIKA-1284) TikaException for Microsoft Powerpoint Document [ ppt ]

2014-04-29 Thread Chetan Laddha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Laddha updated TIKA-1284: Summary: TikaException for Microsoft Powerpoint Document [ ppt ] (was: Parser Issue) > TikaExcept

[jira] [Updated] (TIKA-1284) Parser Issue

2014-04-29 Thread Chetan Laddha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Laddha updated TIKA-1284: Attachment: problem1.ppt This ppt is not getting extracted with "tika-app-1.5.jar" > Parser Issue >

[jira] [Updated] (TIKA-1284) Parser Issue

2014-04-29 Thread Chetan Laddha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Laddha updated TIKA-1284: Attachment: Problem2.ppt This PPT is also not getting extracted with "tika-app-1.5.jar" > Parser Is

[jira] [Updated] (TIKA-1284) Parser Issue

2014-04-29 Thread Chetan Laddha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Laddha updated TIKA-1284: Environment: (was: Windows7) > Parser Issue > > > Key: TIKA-1284 >

[jira] [Updated] (TIKA-1284) Parser Issue

2014-04-29 Thread Chetan Laddha (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Laddha updated TIKA-1284: Summary: Parser Issue (was: Fu) > Parser Issue > > > Key: TIKA-1284 >

[jira] [Created] (TIKA-1284) Fu

2014-04-29 Thread Chetan Laddha (JIRA)
Chetan Laddha created TIKA-1284: --- Summary: Fu Key: TIKA-1284 URL: https://issues.apache.org/jira/browse/TIKA-1284 Project: Tika Issue Type: Bug Components: parser Affects Versions

[jira] [Commented] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-29 Thread Rupert Westenthaler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984251#comment-13984251 ] Rupert Westenthaler commented on TIKA-1276: --- Personally I am happy with having Ti

[jira] [Updated] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-29 Thread Rupert Westenthaler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rupert Westenthaler updated TIKA-1276: -- Attachment: TIKA-1276_20140428_3_rwesten.diff > Now it complains on the org.apache.comm