David Pilato created TIKA-2579:
--
Summary: Update to PDFBox 2.0.9
Key: TIKA-2579
URL: https://issues.apache.org/jira/browse/TIKA-2579
Project: Tika
Issue Type: Improvement
Components: p
[
https://issues.apache.org/jira/browse/TIKA-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2579:
--
Summary: Update to PDFBox 2.0.9 when available (was: Update to PDFBox 2.0.9)
> Update to PDFBox 2.0.9 w
[
https://issues.apache.org/jira/browse/TIKA-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371527#comment-16371527
]
Tim Allison commented on TIKA-2579:
---
+1 Thank you for opening this. I've been away from
[
https://issues.apache.org/jira/browse/TIKA-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-2579:
-
Assignee: Tim Allison
> Update to PDFBox 2.0.9 when available
> --
Hello Apache Supporters and Enthusiasts
This is your FINAL reminder that the Call for Papers (CFP) for the
Apache EU Roadshow is closing soon. Our Apache EU Roadshow will focus on
Cloud, IoT, Apache Tomcat, and Apache HTTP Server, and will run from
13-14 June 2018 in Berlin.
Note that the CFP deadline has
[
https://issues.apache.org/jira/browse/TIKA-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371955#comment-16371955
]
ASF GitHub Bot commented on TIKA-2580:
--
ewanmellor opened a new pull request #220: Fix
Ewan Mellor created TIKA-2581:
-
Summary: testOCROutputsHOCR fails with Tesseract 4.0
Key: TIKA-2581
URL: https://issues.apache.org/jira/browse/TIKA-2581
Project: Tika
Issue Type: Bug
Co
[
https://issues.apache.org/jira/browse/TIKA-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ewan Mellor updated TIKA-2581:
--
Description:
TesseractOCRParserTest.testOCROutputsHOCR fails with Tesseract 4.0.
With 3.x, the output is
[
https://issues.apache.org/jira/browse/TIKA-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371946#comment-16371946
]
ASF GitHub Bot commented on TIKA-2570:
--
ewanmellor opened a new pull request #219: Fix
[
https://issues.apache.org/jira/browse/TIKA-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371999#comment-16371999
]
ASF GitHub Bot commented on TIKA-2581:
--
ewanmellor opened a new pull request #221: Fix
Ewan Mellor created TIKA-2582:
-
Summary: Tesseract 4.0 includes a FF character by default,
breaking parsers
Key: TIKA-2582
URL: https://issues.apache.org/jira/browse/TIKA-2582
Project: Tika
Issu
[
https://issues.apache.org/jira/browse/TIKA-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ewan Mellor updated TIKA-2582:
--
Description:
Tesseract 4.0 includes a change to use form feed characters to separate pages
by default in
Ewan Mellor created TIKA-2580:
-
Summary: SafeContentHandler documentation is incorrect about
replacement character
Key: TIKA-2580
URL: https://issues.apache.org/jira/browse/TIKA-2580
Project: Tika
Ewan Mellor created TIKA-2583:
-
Summary: Tika readme should mention builds.apache.org
Key: TIKA-2583
URL: https://issues.apache.org/jira/browse/TIKA-2583
Project: Tika
Issue Type: Bug
C
[
https://issues.apache.org/jira/browse/TIKA-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372029#comment-16372029
]
ASF GitHub Bot commented on TIKA-2582:
--
ewanmellor opened a new pull request #222: Fix
[
https://issues.apache.org/jira/browse/TIKA-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372031#comment-16372031
]
ASF GitHub Bot commented on TIKA-2583:
--
ewanmellor opened a new pull request #223: Fix
Ewan Mellor created TIKA-2584:
-
Summary: Tika should have a way to pass arbitrary Tesseract options
Key: TIKA-2584
URL: https://issues.apache.org/jira/browse/TIKA-2584
Project: Tika
Issue Type: I
[
https://issues.apache.org/jira/browse/TIKA-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372048#comment-16372048
]
ASF GitHub Bot commented on TIKA-2581:
--
Gagravarr commented on issue #221: Fix for TIK
[
https://issues.apache.org/jira/browse/TIKA-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372052#comment-16372052
]
Nick Burch commented on TIKA-2583:
--
ASF policy is that "users" should only be directed to
[
https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2563.
---
Resolution: Fixed
Assignee: Tim Allison
Fix Version/s: 2.0.0
1.18
>
Nick Burch created TIKA-2585:
Summary: TikaInputStream support for resetting via a factory of
InputStreams
Key: TIKA-2585
URL: https://issues.apache.org/jira/browse/TIKA-2585
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch updated TIKA-2585:
-
Description:
As raised in the 2.0 breaking changes thread, currently the only way that Tika
has of handlin
[
https://issues.apache.org/jira/browse/TIKA-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372063#comment-16372063
]
Nick Burch commented on TIKA-2585:
--
I can't immediately see a common / well known class/in
[
https://issues.apache.org/jira/browse/TIKA-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372064#comment-16372064
]
ASF GitHub Bot commented on TIKA-2570:
--
tballison closed pull request #219: Fix for TI
[
https://issues.apache.org/jira/browse/TIKA-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2570.
---
Resolution: Fixed
Fix Version/s: 2.0.0
1.18
> Tika 1.17 uses vulnerable Jacks
[
https://issues.apache.org/jira/browse/TIKA-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372074#comment-16372074
]
Ewan Mellor commented on TIKA-2583:
---
I wasn't trying to tell users where to find builds,
[
https://issues.apache.org/jira/browse/TIKA-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372075#comment-16372075
]
ASF GitHub Bot commented on TIKA-2581:
--
ewanmellor commented on issue #221: Fix for TI
[
https://issues.apache.org/jira/browse/TIKA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372077#comment-16372077
]
Hudson commented on TIKA-2563:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1436 (See
[h
[
https://issues.apache.org/jira/browse/TIKA-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372078#comment-16372078
]
ASF GitHub Bot commented on TIKA-2584:
--
ewanmellor opened a new pull request #224: Fix
Ewan Mellor created TIKA-2586:
-
Summary: PDFParser documentation has incorrect DPI default
Key: TIKA-2586
URL: https://issues.apache.org/jira/browse/TIKA-2586
Project: Tika
Issue Type: Improvemen
[
https://issues.apache.org/jira/browse/TIKA-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372152#comment-16372152
]
Hudson commented on TIKA-2570:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1437 (See
[h
[
https://issues.apache.org/jira/browse/TIKA-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2586.
---
Resolution: Fixed
Thank you!
> PDFParser documentation has incorrect DPI default
> ---