[jira] [Created] (TIKA-4361) Rare RTF bug handling styles within an href in a malformed file

2024-12-04 Thread Tim Allison (Jira)
Tim Allison created TIKA-4361: - Summary: Rare RTF bug handling styles within an href in a malformed file Key: TIKA-4361 URL: https://issues.apache.org/jira/browse/TIKA-4361 Project: Tika Issue T

[PR] Add GoogleFetcher [tika]

2024-12-04 Thread via GitHub
bartek opened a new pull request, #2074: URL: https://github.com/apache/tika/pull/2074 This allows the fetching of items using files.get from Google Drive Thanks for your contribution to [Apache Tika](https://tika.apache.org/)! Your help is appreciated! Before opening t

Re: [PR] Add GoogleFetcher [tika]

2024-12-04 Thread via GitHub
tballison commented on code in PR #2074: URL: https://github.com/apache/tika/pull/2074#discussion_r1870215765 ## tika-pipes/tika-grpc/example-dockerfile/docker-build.sh: ## @@ -38,5 +38,10 @@ docker buildx create --name tikabuilder # see https://askubuntu.com/questions/1339558

Re: [PR] Add GoogleFetcher [tika]

2024-12-04 Thread via GitHub
bartek commented on PR #2074: URL: https://github.com/apache/tika/pull/2074#issuecomment-2518472444 @tballison Thanks for review! Honestly, I wasn't expecting one as I'm mostly pushing this to collaborate with @nddipiazza, however, if it makes sense to work as a group, then I will happily d

Re: [PR] Add GoogleFetcher [tika]

2024-12-04 Thread via GitHub
tballison commented on PR #2074: URL: https://github.com/apache/tika/pull/2074#issuecomment-2518481776 To the degree we can make small/logical changes in `main` to achieve the project's goals, all the better? This is definitely a standalone PR that can go straight into main, I think. --

Re: [PR] Add GoogleFetcher [tika]

2024-12-04 Thread via GitHub
bartek commented on PR #2074: URL: https://github.com/apache/tika/pull/2074#issuecomment-2518485233 > To the degree we can make small/logical changes in main to achieve the project's goals, all the better? This is definitely a standalone PR that can go straight into main, I think. So

[jira] [Commented] (TIKA-4361) Rare RTF bug handling styles within an href in a malformed file

2024-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903109#comment-17903109 ] ASF GitHub Bot commented on TIKA-4361: -- tballison opened a new pull request, #2075: U

Re: [PR] TIKA-4360 -- improve extraction of mapi metadata [tika]

2024-12-04 Thread via GitHub
tballison merged PR #2073: URL: https://github.com/apache/tika/pull/2073 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

[PR] TIKA-4361 [tika]

2024-12-04 Thread via GitHub
tballison opened a new pull request, #2075: URL: https://github.com/apache/tika/pull/2075 Thanks for your contribution to [Apache Tika](https://tika.apache.org/)! Your help is appreciated! Before opening the pull request, please verify that * there is an open issue on the [

[jira] [Commented] (TIKA-4360) Extract more granular information from MAPI/MSG files

2024-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903110#comment-17903110 ] ASF GitHub Bot commented on TIKA-4360: -- tballison merged PR #2073: URL: https://githu

Re: [PR] TIKA-4357 -- improve metadata key prefixing for PDFs and html [tika]

2024-12-04 Thread via GitHub
tballison merged PR #2061: URL: https://github.com/apache/tika/pull/2061

[jira] [Commented] (TIKA-4357) Ensure namespace prefixes in metadata keys in 4.x

2024-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903117#comment-17903117 ] ASF GitHub Bot commented on TIKA-4357: -- tballison merged PR #2061: URL: https://githu

Re: [PR] TIKA-4361 [tika]

2024-12-04 Thread via GitHub
tballison merged PR #2075: URL: https://github.com/apache/tika/pull/2075

[jira] [Commented] (TIKA-4355) Fix LibPstParser problems when run with ForkParser

2024-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903118#comment-17903118 ] ASF GitHub Bot commented on TIKA-4355: -- tballison merged PR #2060: URL: https://githu

[jira] [Commented] (TIKA-4361) Rare RTF bug handling styles within an href in a malformed file

2024-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903119#comment-17903119 ] ASF GitHub Bot commented on TIKA-4361: -- tballison merged PR #2075: URL: https://githu

[jira] [Resolved] (TIKA-4355) Fix LibPstParser problems when run with ForkParser

2024-12-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4355. Fix Version/s: 4.0.0, 3.1.0. Resolution: Fixed. I _think_ we're all set. Please

[jira] [Resolved] (TIKA-4361) Rare RTF bug handling styles within an href in a malformed file

2024-12-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4361. Fix Version/s: 4.0.0, 3.1.0. Resolution: Fixed. > Rare RTF bug handling styles

Re: [PR] TIKA-4355 -- LibPstParserConfig should be serializable [tika]

2024-12-04 Thread via GitHub
tballison merged PR #2060: URL: https://github.com/apache/tika/pull/2060

[jira] [Created] (TIKA-4362) Improve message class coverage for msg

2024-12-04 Thread Tim Allison (Jira)
Tim Allison created TIKA-4362: - Summary: Improve message class coverage for msg Key: TIKA-4362 URL: https://issues.apache.org/jira/browse/TIKA-4362 Project: Tika Issue Type: Improvement

[jira] [Commented] (TIKA-4362) Improve message class coverage for msg

2024-12-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903136#comment-17903136 ] Tim Allison commented on TIKA-4362: --- https://bz.apache.org/bugzilla/show_bug.cgi?id=6948

[jira] [Commented] (TIKA-4355) Fix LibPstParser problems when run with ForkParser

2024-12-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903147#comment-17903147 ] Hudson commented on TIKA-4355: -- ABORTED: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4361) Rare RTF bug handling styles within an href in a malformed file

2024-12-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903146#comment-17903146 ] Hudson commented on TIKA-4361: -- ABORTED: Integrated in Jenkins build Tika » tika-branch_3x-jd

Re: [PR] Add GoogleFetcher [tika]

2024-12-04 Thread via GitHub
bartek commented on code in PR #2074: URL: https://github.com/apache/tika/pull/2074#discussion_r1870507803 ## tika-pipes/tika-fetchers/tika-fetcher-google/pom.xml: ## @@ -0,0 +1,96 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-inst