Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 2.2.1 mvn clean install -U *OK* OS and arch: Linux oleg-vb 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux Java version: openjdk version "1.8.0_312" OpenJDK Runtime Environment (build 1.8.0_312-8

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Dave Meikle
On Mon, 20 Dec 2021 at 15:59, Tim Allison wrote: > A candidate for the Tika 2.2.1 release is available at: > https://dist.apache.org/repos/dist/dev/tika/2.2.1 > > The release candidate is a zip archive of the sources in: > https://github.com/apache/tika/tree/2.2.1-rc3/ > > The SHA-512 checksum of

Re: [VOTE] Release Apache Tika 1.28 Candidate #3

2021-12-20 Thread David Meikle
On Mon, 20 Dec 2021 at 16:31, Tim Allison wrote: > > The SHA-512 checksum of the archive is > > f8487f58aeec011c993ac46d8e99f8bed64333ccfa57edf8ff9773653204fa2a4e27cb1102e53c181ae7a1e98f892da4c1766f473ce5ee83c1b9229c4f8e5aec. > > In addition, a staged maven repository is available here: > > https

[jira] [Commented] (TIKA-3629) Keywords are not extracted anymore from PDF documents

2021-12-20 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462833#comment-17462833 ] David Pilato commented on TIKA-3629: I'm not sure I got it. Is {{Office.KEYWORDS}

[jira] [Commented] (TIKA-3630) Weird extra tab in parsing charts in xlsx depending on construction of InputStream

2021-12-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462816#comment-17462816 ] Tim Allison commented on TIKA-3630: --- In {{OOXMLWordAndPowerPointTextHandler}} in the {{e

[jira] [Updated] (TIKA-3630) Weird extra tab in parsing charts in xlsx depending on construction of InputStream

2021-12-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-3630: -- Description: When I was respinning 2.2.1-rc3 this morning, I noticed that we were getting slightly diff

[jira] [Created] (TIKA-3630) Weird extra tab in parsing charts in xlsx depending on construction of InputStream

2021-12-20 Thread Tim Allison (Jira)
Tim Allison created TIKA-3630: - Summary: Weird extra tab in parsing charts in xlsx depending on construction of InputStream Key: TIKA-3630 URL: https://issues.apache.org/jira/browse/TIKA-3630 Project: Tik

[jira] [Resolved] (TIKA-3627) OOXML parsing is not working as intended using multiple threads

2021-12-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3627. --- Fix Version/s: 2.2.1 Resolution: Fixed This was caused by my error when I downgraded POI from 5

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Konstantin Gribov
Built successfully on ArchLinux, OpenJDK 11 & 17 (Temurin-11.0.13+8 & 17.0.1+12) w/ Tesseract 4.1.1, Leptonica 1.82. Same issue as with 2.2.1 rc2 with tesseract multipage test. GPG signatures and SHA512 hashes are fine. [x] +1 Release this package as Apache Tika 2.2.1 [ ] -1 Do not release this p

Re: [VOTE] Release Apache Tika 1.28 Candidate #3

2021-12-20 Thread Konstantin Gribov
Built successfully on ArchLinux, OpenJDK 11 & 17 (Temurin-11.0.13+8 & 17.0.1+12) w/ Tesseract 4.1.1, Leptonica 1.82. Same issue as with 2.2.1 rc2 with tesseract multipage test. GPG signatures and SHA512 hashes are fine. [x] +1 Release this package as Apache Tika 1.28 [ ] -1 Do not release this pa

Re: [VOTE] Release Apache Tika 1.28 Candidate #3

2021-12-20 Thread Tilman Hausherr
+1 Tilman On 20.12.2021 at 17:31, Tim Allison wrote: A candidate for the Tika 1.28 release is available at: https://dist.apache.org/repos/dist/dev/tika/1.28 The release candidate is a zip archive of the sources in: https://github.com/apache/tika/tree/1.28-rc3/ The SHA-512 checksum of t

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Tilman Hausherr
+1 Tilman On 20.12.2021 at 16:59, Tim Allison wrote: A candidate for the Tika 2.2.1 release is available at: https://dist.apache.org/repos/dist/dev/tika/2.2.1 The release candidate is a zip archive of the sources in: https://github.com/apache/tika/tree/2.2.1-rc3/ The SHA-512 checksum of the

[jira] [Commented] (TIKA-3627) OOXML parsing is not working as intended using multiple threads

2021-12-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462731#comment-17462731 ] Hudson commented on TIKA-3627: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #4

[VOTE] Release Apache Tika 1.28 Candidate #3

2021-12-20 Thread Tim Allison
A candidate for the Tika 1.28 release is available at: https://dist.apache.org/repos/dist/dev/tika/1.28 The release candidate is a zip archive of the sources in: https://github.com/apache/tika/tree/1.28-rc3/ The SHA-512 checksum of the archive is f8487f58aeec011c993ac46d8e99f8bed64333ccfa5

[VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Tim Allison
A candidate for the Tika 2.2.1 release is available at: https://dist.apache.org/repos/dist/dev/tika/2.2.1 The release candidate is a zip archive of the sources in: https://github.com/apache/tika/tree/2.2.1-rc3/ The SHA-512 checksum of the archive is 42accd01d5f152a9a6b26883b735242fb6e9eb01f85f3ff
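For anyone checking the candidate locally, the published SHA-512 value can be compared against a digest computed from the downloaded archive. The following is a minimal sketch in Python, not part of the release instructions: the function names `sha512_of` and `matches_published` are hypothetical, and the expected value has to be copied in full from the vote email, since it is truncated in the snippet above.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the local digest with the published one, ignoring case and whitespace."""
    return sha512_of(path) == published_hex.strip().lower()
```

A caller would pass the path of the downloaded source archive and the full checksum string from the vote email to `matches_published`, and treat a `False` result as a reason not to vote +1.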

[CANCELED][VOTE] Release Apache Tika 2.2.1 Candidate #2

2021-12-20 Thread Tim Allison
See below. Respinning 2.2.1-rc3 now. On Mon, Dec 20, 2021 at 8:38 AM Tim Allison wrote: > > TIKA-3627 is bad. > > I'm now -1. I'll respin the 2.x shortly. > > Apologies for the release fatigue. > > On Sun, Dec 19, 2021 at 11:22 PM Tilman Hausherr > wrote: > > > > +1 > > > > Successful build o

[jira] [Commented] (TIKA-3627) OOXML parsing is not working as intended using multiple threads

2021-12-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462627#comment-17462627 ] Hudson commented on TIKA-3627: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #4

[jira] [Commented] (TIKA-3627) OOXML parsing is not working as intended using multiple threads

2021-12-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462617#comment-17462617 ] ASF GitHub Bot commented on TIKA-3627: -- tballison commented on pull request #467: URL

[GitHub] [tika] tballison commented on pull request #467: [TIKA-3627] Fix OOXML parsing for multiple threads

2021-12-20 Thread GitBox
tballison commented on pull request #467: URL: https://github.com/apache/tika/pull/467#issuecomment-997933281 This is critical. I'm going to respin 2.2.1. Thank you for finding this! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #2

2021-12-20 Thread Tim Allison
TIKA-3627 is bad. I'm now -1. I'll respin the 2.x shortly. Apologies for the release fatigue. On Sun, Dec 19, 2021 at 11:22 PM Tilman Hausherr wrote: > > +1 > > Successful build on German W10 > > Tilman > > On 19.12.2021 at 19:53, Tim Allison wrote: > > A candidate for the Tika 2.2.1 release

[jira] [Comment Edited] (TIKA-3629) Keywords are not extracted anymore from PDF documents

2021-12-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462615#comment-17462615 ] Tim Allison edited comment on TIKA-3629 at 12/20/21, 1:37 PM: --

[jira] [Commented] (TIKA-3629) Keywords are not extracted anymore from PDF documents

2021-12-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462615#comment-17462615 ] Tim Allison commented on TIKA-3629: --- May be caused by this? {noformat} * Remove dupl

[jira] [Closed] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed TIKA-3628. - Resolution: Not A Bug > Is tika 2.2 available > - > > Key: TIK

[jira] [Reopened] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reopened TIKA-3628: --- > Is tika 2.2 available > - > > Key: TIKA-3628 >

[jira] [Updated] (TIKA-3627) OOXML parsing is not working as intended using multiple threads

2021-12-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-3627: -- Priority: Blocker (was: Major) > OOXML parsing is not working as intended using multiple threads >

[jira] [Commented] (TIKA-3627) OOXML parsing is not working as intended using multiple threads

2021-12-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462568#comment-17462568 ] ASF GitHub Bot commented on TIKA-3627: -- tballison merged pull request #467: URL: http

[GitHub] [tika] tballison merged pull request #467: [TIKA-3627] Fix OOXML parsing for multiple threads

2021-12-20 Thread GitBox
tballison merged pull request #467: URL: https://github.com/apache/tika/pull/467 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@t

[jira] [Commented] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Konstantin Gribov (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462557#comment-17462557 ] Konstantin Gribov commented on TIKA-3628: - Great! For gradle-related help beside

[jira] [Comment Edited] (TIKA-3629) Keywords are not extracted anymore from PDF documents

2021-12-20 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462555#comment-17462555 ] David Pilato edited comment on TIKA-3629 at 12/20/21, 11:32 AM:

[jira] [Closed] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Vamsi Molli (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vamsi Molli closed TIKA-3628. - Resolution: Fixed > Is tika 2.2 available > - > > Key: TIKA-3628 >

[jira] [Commented] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Vamsi Molli (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462556#comment-17462556 ] Vamsi Molli commented on TIKA-3628: --- Yes. Unchecked offline mode resolved. Thanks > Is

[jira] [Commented] (TIKA-3629) Keywords are not extracted anymore from PDF documents

2021-12-20 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462555#comment-17462555 ] David Pilato commented on TIKA-3629: Update: I can see the keywords available in: *

[jira] [Created] (TIKA-3629) Keywords are not extracted anymore from PDF documents

2021-12-20 Thread David Pilato (Jira)
David Pilato created TIKA-3629: -- Summary: Keywords are not extracted anymore from PDF documents Key: TIKA-3629 URL: https://issues.apache.org/jira/browse/TIKA-3629 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Konstantin Gribov (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462541#comment-17462541 ] Konstantin Gribov commented on TIKA-3628: - {quote}No cached version of org.apache.

[jira] [Commented] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Vamsi Molli (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462537#comment-17462537 ] Vamsi Molli commented on TIKA-3628: --- the definition build.gradle file. Changing to 2.2.

[jira] [Updated] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3628: -- Component/s: (was: build) > Is tika 2.2 available > - > >

[jira] [Updated] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3628: -- Issue Type: Bug (was: New Feature) > Is tika 2.2 available > - > >

[jira] [Updated] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3628: -- Fix Version/s: (was: 2.1.0) > Is tika 2.2 available > - > >

[jira] [Commented] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Konstantin Gribov (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462491#comment-17462491 ] Konstantin Gribov commented on TIKA-3628: - Yeah, I get it that you want to upgrade

[jira] [Comment Edited] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Vamsi Molli (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462487#comment-17462487 ] Vamsi Molli edited comment on TIKA-3628 at 12/20/21, 10:00 AM: -

[jira] [Comment Edited] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Vamsi Molli (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462485#comment-17462485 ] Vamsi Molli edited comment on TIKA-3628 at 12/20/21, 10:00 AM: -

[jira] [Commented] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Konstantin Gribov (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462482#comment-17462482 ] Konstantin Gribov commented on TIKA-3628: - Maven Central has Tika 2.2.0: https://

[jira] [Created] (TIKA-3628) Is tika 2.2 available

2021-12-20 Thread Vamsi Molli (Jira)
Vamsi Molli created TIKA-3628: - Summary: Is tika 2.2 available Key: TIKA-3628 URL: https://issues.apache.org/jira/browse/TIKA-3628 Project: Tika Issue Type: New Feature Components: buil