Trouble with Parsing a PDF of a Drawing

2020-02-25 Thread Kegan Huntley
Hello Developer, I'm having difficulty parsing a PDF that contains a design document of an object. The PDF I want to parse looks similar to the image below, which is an example I pulled from Google Images. I'm using a Python system and I believe changing the "sortByPositio

[jira] [Commented] (TIKA-3039) Remove mvn dockerfile:build goal from tika-server

2020-02-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044961#comment-17044961 ] ASF GitHub Bot commented on TIKA-3039: -- tballison commented on issue #311: TIKA-3039

[jira] [Commented] (TIKA-3039) Remove mvn dockerfile:build goal from tika-server

2020-02-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044960#comment-17044960 ] ASF GitHub Bot commented on TIKA-3039: -- tballison commented on pull request #311: TIK

[jira] [Commented] (TIKA-3039) Remove mvn dockerfile:build goal from tika-server

2020-02-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044951#comment-17044951 ] ASF GitHub Bot commented on TIKA-3039: -- epugh commented on issue #311: TIKA-3039 Remo

[jira] [Commented] (TIKA-3036) broken build: "group id is too large" on a Mac

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044883#comment-17044883 ] Hudson commented on TIKA-3036: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #311 (See

[jira] [Commented] (TIKA-3038) Miredot license key expired

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044882#comment-17044882 ] Hudson commented on TIKA-3038: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #311 (See

[jira] [Commented] (TIKA-3036) broken build: "group id is too large" on a Mac

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044763#comment-17044763 ] Hudson commented on TIKA-3036: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1778 (See [

[jira] [Commented] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout

2020-02-25 Thread Kenneth William Krugler (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044722#comment-17044722 ] Kenneth William Krugler commented on TIKA-3035: --- FWIW,  [picocli|[https://pi

[jira] [Commented] (TIKA-3038) Miredot license key expired

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044675#comment-17044675 ] Hudson commented on TIKA-3038: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1777 (See [

[jira] [Commented] (TIKA-3038) Miredot license key expired

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044593#comment-17044593 ] Tim Allison commented on TIKA-3038: --- W00t! Thank you! > Miredot license key expired >

[jira] [Comment Edited] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044573#comment-17044573 ] Tim Allison edited comment on TIKA-3035 at 2/25/20 3:29 PM: Al

[jira] [Commented] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044573#comment-17044573 ] Tim Allison commented on TIKA-3035: --- Alright, IIUC, I shouldn't have accepted the initia

[jira] [Commented] (TIKA-3038) Miredot license key expired

2020-02-25 Thread Tyler Bui-Palsulich (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044572#comment-17044572 ] Tyler Bui-Palsulich commented on TIKA-3038: --- I reached out to the Miredot team o

[jira] [Commented] (TIKA-3037) Tika Docs should highlight Tika-Server

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044569#comment-17044569 ] Tim Allison commented on TIKA-3037: --- As with the getting started, yes, thank you, I'll t

[jira] [Commented] (TIKA-3037) Tika Docs should highlight Tika-Server

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044566#comment-17044566 ] Tim Allison commented on TIKA-3037: --- I just moved /TIKA/TikaJAXRS to https://cwiki.apac

[jira] [Commented] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout

2020-02-25 Thread David Eric Pugh (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044533#comment-17044533 ] David Eric Pugh commented on TIKA-3035: --- Tried it with tika-app-1.23.jar and worked

[jira] [Commented] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout

2020-02-25 Thread David Eric Pugh (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044406#comment-17044406 ] David Eric Pugh commented on TIKA-3035: --- Here is my command: java -cp tika-app-1.23

pushing branch_1x to Apache snapshots?

2020-02-25 Thread Tim Allison
Hi All, What do we have to do to push the 1.x branch to Apache snapshots? Thank you! Cheers, Tim https://issues.apache.org/jira/browse/TIKA-3006?focusedCommentId=17044386&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-1704438

[jira] [Commented] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044386#comment-17044386 ] Tim Allison commented on TIKA-3006: --- Yay! And it built for you! :P Turns out we do pus

[jira] [Comment Edited] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044325#comment-17044325 ] Tim Allison edited comment on TIKA-3006 at 2/25/20 12:20 PM: -

[jira] [Comment Edited] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044358#comment-17044358 ] David Pilato edited comment on TIKA-3006 at 2/25/20 11:42 AM: --

[jira] [Comment Edited] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044358#comment-17044358 ] David Pilato edited comment on TIKA-3006 at 2/25/20 11:41 AM: --

[jira] [Commented] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044358#comment-17044358 ] David Pilato commented on TIKA-3006: All good for me. I noticed that new meta data ar

[jira] [Commented] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044349#comment-17044349 ] David Pilato commented on TIKA-3006: Well. It does not work as I'd love it seeing work

[jira] [Commented] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044325#comment-17044325 ] Tim Allison commented on TIKA-3006: --- https://builds.apache.org/job/tika-branch-1x/310/or

[jira] [Commented] (TIKA-3047) Upgrade to POI 4.1.2

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044226#comment-17044226 ] Hudson commented on TIKA-3047: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #310 (See

[jira] [Commented] (TIKA-3050) Add xmp extraction to psd files

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044225#comment-17044225 ] Hudson commented on TIKA-3050: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #310 (See

[jira] [Commented] (TIKA-2952) Vulnerable "metadata-extractor 2.11.0" is present in tika 1.22.

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044228#comment-17044228 ] Hudson commented on TIKA-2952: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #310 (See

[jira] [Commented] (TIKA-3056) General upgrades for 1.24

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044229#comment-17044229 ] Hudson commented on TIKA-3056: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #310 (See

[jira] [Commented] (TIKA-3033) Upgrade to PDFBox 2.0.19 when available

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044227#comment-17044227 ] Hudson commented on TIKA-3033: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #310 (See

[jira] [Commented] (TIKA-3056) General upgrades for 1.24

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044223#comment-17044223 ] Hudson commented on TIKA-3056: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1776 (See [

[jira] [Commented] (TIKA-3047) Upgrade to POI 4.1.2

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044220#comment-17044220 ] Hudson commented on TIKA-3047: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1776 (See [

[jira] [Commented] (TIKA-3033) Upgrade to PDFBox 2.0.19 when available

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044221#comment-17044221 ] Hudson commented on TIKA-3033: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1776 (See [

[jira] [Commented] (TIKA-2952) Vulnerable "metadata-extractor 2.11.0" is present in tika 1.22.

2020-02-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044222#comment-17044222 ] Hudson commented on TIKA-2952: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1776 (See [