[jira] [Created] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread Tim Allison (Jira)
Tim Allison created TIKA-4357: - Summary: Ensure namespace prefixes in metadata keys in 4.x Key: TIKA-4357 URL: https://issues.apache.org/jira/browse/TIKA-4357 Project: Tika Issue Type: Task

[jira] [Updated] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4357: -- Description: There are several places in the codebase where we are mindlessly trusting a file's metadat

[jira] [Commented] (TIKA-4358) Extract incremental update info by default in PDFs in 4.x

2024-11-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900085#comment-17900085 ] ASF GitHub Bot commented on TIKA-4358: -- tballison merged PR #2062: URL: https://githu

Re: [PR] TIKA-4358 -- turn on extraction of incremental update metadata as default [tika]

2024-11-21 Thread via GitHub
tballison merged PR #2062: URL: https://github.com/apache/tika/pull/2062

[jira] [Resolved] (TIKA-4358) Extract incremental update info by default in PDFs in 4.x

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4358. --- Fix Version/s: 4.0.0 Resolution: Fixed > Extract incremental update info by default in PDFs in

[jira] [Created] (TIKA-4358) Extract incremental update info by default in PDFs in 4.x

2024-11-21 Thread Tim Allison (Jira)
Tim Allison created TIKA-4358: - Summary: Extract incremental update info by default in PDFs in 4.x Key: TIKA-4358 URL: https://issues.apache.org/jira/browse/TIKA-4358 Project: Tika Issue Type: Ta

[jira] [Updated] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4357: -- Labels: 4x (was: ) > Ensure namespace prefixes in metadata keys in 4.x > --

[PR] TIKA-4358 -- turn on extraction of incremental update metadata as default [tika]

2024-11-21 Thread via GitHub
tballison opened a new pull request, #2062: URL: https://github.com/apache/tika/pull/2062

[jira] [Commented] (TIKA-4358) Extract incremental update info by default in PDFs in 4.x

2024-11-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900055#comment-17900055 ] ASF GitHub Bot commented on TIKA-4358: -- tballison opened a new pull request, #2062: U

[jira] [Commented] (TIKA-4356) tika core content detection possible regression for subtypes

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900058#comment-17900058 ] Tim Allison commented on TIKA-4356: --- Thank you for opening this issue. I'm not able to

[jira] [Commented] (TIKA-4356) tika core content detection possible regression for subtypes

2024-11-21 Thread Subbu (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900057#comment-17900057 ] Subbu commented on TIKA-4356: - I am able to reproduce this issue. Can I pick this to analyse a

[jira] [Commented] (TIKA-4356) tika core content detection possible regression for subtypes

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900062#comment-17900062 ] Tim Allison commented on TIKA-4356: --- [~subbudvk], if you're able to replicate this in th

[jira] [Commented] (TIKA-4356) tika core content detection possible regression for subtypes

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900059#comment-17900059 ] Tim Allison commented on TIKA-4356: --- Are you able to check with a SNAPSHOT and see if yo

[jira] [Commented] (TIKA-4239) Update to 2.9.3

2024-11-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900106#comment-17900106 ] Hudson commented on TIKA-4239: -- UNSTABLE: Integrated in Jenkins build Tika » tika-branch_2x-j

[PR] TIKA-4357 -- improve metadata key prefixing for PDFs and html [tika]

2024-11-21 Thread via GitHub
tballison opened a new pull request, #2061: URL: https://github.com/apache/tika/pull/2061

[jira] [Commented] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900051#comment-17900051 ] ASF GitHub Bot commented on TIKA-4357: -- tballison opened a new pull request, #2061: U

[jira] [Commented] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900052#comment-17900052 ] Tim Allison commented on TIKA-4357: --- I just updated the parsers for pdf and html. I need

[jira] [Updated] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4357: -- Description: There are several places in the codebase where we are mindlessly trusting a file's metadat

[jira] [Commented] (TIKA-4326) General updates for 3.0.1

2024-11-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900122#comment-17900122 ] Hudson commented on TIKA-4326: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4239) Update to 2.9.3

2024-11-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900127#comment-17900127 ] Hudson commented on TIKA-4239: -- UNSTABLE: Integrated in Jenkins build Tika » tika-branch_2x-j

[jira] [Commented] (TIKA-4239) Update to 2.9.3

2024-11-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900245#comment-17900245 ] Hudson commented on TIKA-4239: -- UNSTABLE: Integrated in Jenkins build Tika » tika-branch_2x-j

[jira] [Updated] (TIKA-4359) Assistance transferring ownership of Tika artifacthub.io entry to Apache Org

2024-11-21 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-4359: --- Summary: Assistance transferring ownership of Tika artifacthub.io entry to Apache Org

[jira] [Created] (TIKA-4359) Assistance transferring ownership of artifact.io

2024-11-21 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created TIKA-4359: -- Summary: Assistance transferring ownership of artifact.io Key: TIKA-4359 URL: https://issues.apache.org/jira/browse/TIKA-4359 Project: Tika Issue

[jira] [Commented] (TIKA-4326) General updates for 3.0.1

2024-11-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900254#comment-17900254 ] Hudson commented on TIKA-4326: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4326) General updates for 3.0.1

2024-11-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900146#comment-17900146 ] Hudson commented on TIKA-4326: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-11-21 Thread Peter Wyatt (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900196#comment-17900196 ] Peter Wyatt commented on TIKA-4357: --- For PDF XMP ({*}Metadata{*} streams), can the names