Re: [PR] TIKA-4278 -- remove colon from default and allow users to customize d… [tika]

2024-10-11 Thread via GitHub
THausherr commented on PR #1976: URL: https://github.com/apache/tika/pull/1976#issuecomment-2407325929 I see you removed the "colon isn't reliable" code part. Did you test what would happen with the file mentioned in that code segment (242970.txt)? IMHO the colon should still be "discrimina

Re: [PR] TIKA-4278 -- remove colon from default and allow users to customize d… [tika]

2024-10-11 Thread via GitHub
THausherr commented on PR #1976: URL: https://github.com/apache/tika/pull/1976#issuecomment-2407819859 I forgot to mention, the test with the file mentioned works nicely! Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888719#comment-17888719 ] ASF GitHub Bot commented on TIKA-4278: -- THausherr commented on PR #1976: URL: https:/

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888618#comment-17888618 ] ASF GitHub Bot commented on TIKA-4278: -- tballison opened a new pull request, #1976: U

[PR] TIKA-4278 -- remove colon from default and allow users to customize d… [tika]

2024-10-11 Thread via GitHub
tballison opened a new pull request, #1976: URL: https://github.com/apache/tika/pull/1976 …elimiters

[jira] [Updated] (TIKA-4280) Tasks for the 3.0.0 release

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4280: -- Attachment: 2PSMEFJEYU7EPAZXQQDD6OL2WOQLBJRY.zip > Tasks for the 3.0.0 release > ---

[jira] [Commented] (TIKA-4280) Tasks for the 3.0.0 release

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888620#comment-17888620 ] Tim Allison commented on TIKA-4280: --- That's an artifact of the report. I need to fix tha

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888663#comment-17888663 ] ASF GitHub Bot commented on TIKA-4278: -- THausherr merged PR #1976: URL: https://githu

[jira] [Updated] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4278: -- Attachment: reports_csv_3.0.0_vs_3.0.0_new_withcolon.tar.xz > TextAndCSVParser doesn't detect se

Re: [PR] TIKA-4278 -- remove colon from default and allow users to customize d… [tika]

2024-10-11 Thread via GitHub
THausherr merged PR #1976: URL: https://github.com/apache/tika/pull/1976

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888662#comment-17888662 ] Tilman Hausherr commented on TIKA-4278: --- new test result with the latest changes and

[jira] [Commented] (TIKA-4170) Tika to extract Apple Key files

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888623#comment-17888623 ] Tim Allison commented on TIKA-4170: --- Should have time this afternoon or early next week.

[jira] [Created] (TIKA-4316) Goals for tika 4.x

2024-10-11 Thread Tim Allison (Jira)
Tim Allison created TIKA-4316: - Summary: Goals for tika 4.x Key: TIKA-4316 URL: https://issues.apache.org/jira/browse/TIKA-4316 Project: Tika Issue Type: Task Reporter: Tim Allison

[jira] [Updated] (TIKA-4316) Goals for Tika 4.x

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4316: -- Summary: Goals for Tika 4.x (was: Goals for tika 4.x) > Goals for Tika 4.x > -- > >

[jira] [Updated] (TIKA-4316) Goals for tika 4.x

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4316: -- Description: I proposed a tentative roadmap here: https://lists.apache.org/thread/9yfzf6qwpc7c6qnlp4tdw

[jira] [Commented] (TIKA-4316) Goals for Tika 4.x

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888628#comment-17888628 ] Tim Allison commented on TIKA-4316: --- All of this is still open for discussion, and I loo

[jira] [Comment Edited] (TIKA-4316) Goals for Tika 4.x

2024-10-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888628#comment-17888628 ] Tim Allison edited comment on TIKA-4316 at 10/11/24 12:29 PM: --

Re: [PR] TIKA-4278 -- remove colon from default and allow users to customize d… [tika]

2024-10-11 Thread via GitHub
tballison commented on PR #1976: URL: https://github.com/apache/tika/pull/1976#issuecomment-2407368869 Y, I figured that if someone turned on the colon detection, that was their choice. I'm happy to put that logic back in. If we do so, I'd like to modify it slightly so that we check a tiny

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888642#comment-17888642 ] ASF GitHub Bot commented on TIKA-4278: -- tballison commented on PR #1976: URL: https:/

Re: [PR] TIKA-4278 -- remove colon from default and allow users to customize d… [tika]

2024-10-11 Thread via GitHub
tballison commented on PR #1976: URL: https://github.com/apache/tika/pull/1976#issuecomment-2407374002 I put the check back in. Let me know what you find. Thank you!

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888646#comment-17888646 ] ASF GitHub Bot commented on TIKA-4278: -- tballison commented on PR #1976: URL: https:/

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-10-11 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888647#comment-17888647 ] Hudson commented on TIKA-4166: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #

[jira] [Updated] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4278: -- Attachment: reports_csv_3.0.0_vs_3.0.0_nocolon.tar.xz > TextAndCSVParser doesn't detect semicolo

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888651#comment-17888651 ] Tilman Hausherr commented on TIKA-4278: --- Here's the test result: [^reports_csv_3.0.0

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888706#comment-17888706 ] Hudson commented on TIKA-4278: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #

Re: Is there a way to publish to docker.io/apache ?

2024-10-11 Thread Tim Allison
Last I looked into this, I think infra granted it to three (?) people per project. Maybe check with them and see if that still applies and then see who has karma that might be willing to relinquish it? On Fri, Oct 11, 2024 at 12:22 PM Nicholas DiPiazza < nicholas.dipia...@gmail.com> wrote: > I ha

Is there a way to publish to docker.io/apache ?

2024-10-11 Thread Nicholas DiPiazza
I have an image of Apache Tika gRPC on Docker Hub here: ndipiazza/tika-grpc:3.0.0-BETA2 I am interested in publishing it as an official docker.io/apache/tika-grpc:3.0.0-BETA2 Is it possible to get a Docker Hub account for my Apache credentials?

[jira] [Comment Edited] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888651#comment-17888651 ] Tilman Hausherr edited comment on TIKA-4278 at 10/11/24 1:16 PM: ---

[jira] [Comment Edited] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888651#comment-17888651 ] Tilman Hausherr edited comment on TIKA-4278 at 10/11/24 1:17 PM: ---

[jira] [Commented] (TIKA-4278) TextAndCSVParser doesn't detect semicolon separated file

2024-10-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888632#comment-17888632 ] ASF GitHub Bot commented on TIKA-4278: -- THausherr commented on PR #1976: URL: https:/