[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-04-29 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842158#comment-17842158 ] Nicholas DiPiazza commented on TIKA-4243: - this seems like a major feature thing s

[jira] [Comment Edited] (TIKA-4243) tika configuration overhaul

2024-04-29 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842158#comment-17842158 ] Nicholas DiPiazza edited comment on TIKA-4243 at 4/29/24 8:56 PM: --

[jira] [Comment Edited] (TIKA-4243) tika configuration overhaul

2024-05-01 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842622#comment-17842622 ] Nicholas DiPiazza edited comment on TIKA-4243 at 5/1/24 12:34 PM: --

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-05-01 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842622#comment-17842622 ] Nicholas DiPiazza commented on TIKA-4243: - Kinda seems like it might belong in tik

[jira] [Created] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4252: --- Summary: PipesClient#process - seems to lose the Fetch input metadata? Key: TIKA-4252 URL: https://issues.apache.org/jira/browse/TIKA-4252 Project: Tika

[jira] [Updated] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4252: Description: when calling: PipesResult pipesResult = pipesClient.process(new FetchEmitTupl

[jira] [Updated] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4252: Description: when calling: PipesResult pipesResult = pipesClient.process(new FetchEmitTupl

[jira] [Commented] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845010#comment-17845010 ] Nicholas DiPiazza commented on TIKA-4252: - done > PipesClient#process - seems to

[jira] [Closed] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza closed TIKA-4252. --- Fix Version/s: 3.0.0 Resolution: Fixed > PipesClient#process - seems to lose the Fetch

[jira] [Commented] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845061#comment-17845061 ] Nicholas DiPiazza commented on TIKA-4252: - What I need is to be able to send "Fetc

[jira] [Comment Edited] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845061#comment-17845061 ] Nicholas DiPiazza edited comment on TIKA-4252 at 5/9/24 4:50 PM: ---

[jira] [Comment Edited] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845061#comment-17845061 ] Nicholas DiPiazza edited comment on TIKA-4252 at 5/9/24 4:50 PM: ---

[jira] [Comment Edited] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845071#comment-17845071 ] Nicholas DiPiazza edited comment on TIKA-4252 at 5/9/24 5:08 PM: ---

[jira] [Commented] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845071#comment-17845071 ] Nicholas DiPiazza commented on TIKA-4252: - sure I can do that. > PipesClient#proc

[jira] [Commented] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845080#comment-17845080 ] Nicholas DiPiazza commented on TIKA-4252: - Maybe   fetchInputMetadata outputMet

[jira] [Commented] (TIKA-4252) PipesClient#process - seems to lose the Fetch input metadata?

2024-05-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845083#comment-17845083 ] Nicholas DiPiazza commented on TIKA-4252: - even better > PipesClient#process - se

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-05-23 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848960#comment-17848960 ] Nicholas DiPiazza commented on TIKA-4243: - Sure that sounds good. When we chat lat

[jira] [Created] (TIKA-4262) In pipes XML config, List serializes incorrect causing the parameters to be empty when read

2024-05-26 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4262: --- Summary: In pipes XML config, List serializes incorrect causing the parameters to be empty when read Key: TIKA-4262 URL: https://issues.apache.org/jira/browse/TIKA-4262

[jira] [Updated] (TIKA-4262) In pipes XML config, List serializes incorrect causing the parameters to be empty when read

2024-05-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4262: Description: tika configuration when saving a fetcher with a list of strings will look like

[jira] [Closed] (TIKA-4262) In pipes XML config, List serializes incorrect causing the parameters to be empty when read

2024-05-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza closed TIKA-4262. --- Assignee: Nicholas DiPiazza Resolution: Invalid never mind - this was an issue in my bra

[jira] [Created] (TIKA-4264) Tika Pipes - Structured output (XHTML) support?

2024-05-28 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4264: --- Summary: Tika Pipes - Structured output (XHTML) support? Key: TIKA-4264 URL: https://issues.apache.org/jira/browse/TIKA-4264 Project: Tika Issue Type:

[jira] [Resolved] (TIKA-4243) tika configuration overhaul

2024-06-06 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza resolved TIKA-4243. - Fix Version/s: 3.0.0 Resolution: Fixed > tika configuration overhaul >

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-06-06 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852895#comment-17852895 ] Nicholas DiPiazza commented on TIKA-4243: -  new ticket. Let's close this out > ti

[jira] [Commented] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859757#comment-17859757 ] Nicholas DiPiazza commented on TIKA-4251: - we could keep everything how it is but:

[jira] [Comment Edited] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859757#comment-17859757 ] Nicholas DiPiazza edited comment on TIKA-4251 at 6/24/24 6:35 PM: --

[jira] [Updated] (TIKA-4181) Grpc + Tika Pipes

2024-06-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4181: Summary: Grpc + Tika Pipes (was: Grpc + Tika Pipes - pipe iterator and emitter) > Grpc + T

[jira] [Updated] (TIKA-4181) Grpc + Tika Pipes

2024-06-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4181: Description: Create a Tika Grpc server. You should be able to create Tika Pipes fetchers, t

[jira] [Updated] (TIKA-4181) Tika Grpc Server using Tika Pipes

2024-06-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4181: Summary: Tika Grpc Server using Tika Pipes (was: Grpc + Tika Pipes) > Tika Grpc Server usin

[jira] [Commented] (TIKA-4229) add microsoft graph fetcher

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859980#comment-17859980 ] Nicholas DiPiazza commented on TIKA-4229: - Will be merging this shortly. if anyone

[jira] [Commented] (TIKA-4237) Add JWT authentication ability to the http fetcher

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859984#comment-17859984 ] Nicholas DiPiazza commented on TIKA-4237: - i will be merging this shortly. any iss

[jira] [Commented] (TIKA-4247) HttpFetcher - add ability to send request headers

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859985#comment-17859985 ] Nicholas DiPiazza commented on TIKA-4247: - I will be merging this today. any follo

[jira] [Commented] (TIKA-4181) Tika Grpc Server using Tika Pipes

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859987#comment-17859987 ] Nicholas DiPiazza commented on TIKA-4181: - I will be merging this today. any issue

[jira] [Commented] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860004#comment-17860004 ] Nicholas DiPiazza commented on TIKA-4251: - I think as long as the plugin isn't tra

[jira] [Comment Edited] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860004#comment-17860004 ] Nicholas DiPiazza edited comment on TIKA-4251 at 6/25/24 6:28 PM: --

[jira] [Commented] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860005#comment-17860005 ] Nicholas DiPiazza commented on TIKA-4251: - i guess we don't even need the maven pl

[jira] [Comment Edited] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860005#comment-17860005 ] Nicholas DiPiazza edited comment on TIKA-4251 at 6/25/24 6:30 PM: --

[jira] [Comment Edited] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860005#comment-17860005 ] Nicholas DiPiazza edited comment on TIKA-4251 at 6/25/24 6:42 PM: --

[jira] [Commented] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860011#comment-17860011 ] Nicholas DiPiazza commented on TIKA-4251: - I volunteer to review the PR thoroughly

[jira] [Commented] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin with google-java-format

2024-06-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860032#comment-17860032 ] Nicholas DiPiazza commented on TIKA-4251: - I agree with Google format being the ne

[jira] [Created] (TIKA-4272) create an image for tika-grpc-server

2024-06-26 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4272: --- Summary: create an image for tika-grpc-server Key: TIKA-4272 URL: https://issues.apache.org/jira/browse/TIKA-4272 Project: Tika Issue Type: New Feature

[jira] [Created] (TIKA-4273) create a helm deployment for tika-grpc

2024-06-26 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4273: --- Summary: create a helm deployment for tika-grpc Key: TIKA-4273 URL: https://issues.apache.org/jira/browse/TIKA-4273 Project: Tika Issue Type: New Featu

[jira] [Updated] (TIKA-4273) create a helm deployment for tika-grpc

2024-06-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4273: Description: after we have created a tika-grpc image, we need to create a deployment in the

[jira] [Updated] (TIKA-4272) create a Docker image for tika-grpc-server

2024-06-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4272: Summary: create a Docker image for tika-grpc-server (was: create an image for tika-grpc-ser

[jira] [Updated] (TIKA-4272) make changes to tika docker image so that tika can run grpc server or rest server

2024-06-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4272: Summary: make changes to tika docker image so that tika can run grpc server or rest server

[jira] [Updated] (TIKA-4272) create tika docker image for tika-grpc

2024-06-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4272: Summary: create tika docker image for tika-grpc (was: make changes to tika docker image so

[jira] [Updated] (TIKA-4272) make changes to tika docker image so that tika can run grpc server or rest server

2024-06-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4272: Description: now that the tika-grpc branch has been merged to main, tika-docker image needs t

[jira] [Updated] (TIKA-4272) create tika docker image for tika-grpc

2024-06-26 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4272: Description: now that the tika-grpc branch has been merged to main, we need a tika-grpc serv

[jira] [Created] (TIKA-4286) fix issues where MS graph fetcher is missing deps

2024-07-22 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4286: --- Summary: fix issues where MS graph fetcher is missing deps Key: TIKA-4286 URL: https://issues.apache.org/jira/browse/TIKA-4286 Project: Tika Issue Type

[jira] [Commented] (TIKA-4280) Tasks for the 3.0.0 release

2024-07-25 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868678#comment-17868678 ] Nicholas DiPiazza commented on TIKA-4280: - So for tika server we normally produced

[jira] [Created] (TIKA-2043) junrar tika outofmemoryerror

2016-07-27 Thread Nicholas DiPiazza (JIRA)
Nicholas DiPiazza created TIKA-2043: --- Summary: junrar tika outofmemoryerror Key: TIKA-2043 URL: https://issues.apache.org/jira/browse/TIKA-2043 Project: Tika Issue Type: Bug Rep

[jira] [Updated] (TIKA-2043) junrar tika outofmemoryerror

2016-07-27 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-2043: Description: I see common junrar-related OOM errors. How can I prevent them? It loaded a 2GB

[jira] [Comment Edited] (TIKA-2232) Add JBIG2 image parsing support

2017-01-13 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15822136#comment-15822136 ] Nicholas DiPiazza edited comment on TIKA-2232 at 1/13/17 6:39 PM: ---

[jira] [Commented] (TIKA-2232) Add JBIG2 image parsing support

2017-01-13 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15822136#comment-15822136 ] Nicholas DiPiazza commented on TIKA-2232: - [~pascal.essiembre] totally obviously w

[jira] [Created] (TIKA-2805) Should the HTML parser by default just ignore the section?

2019-01-05 Thread Nicholas DiPiazza (JIRA)
Nicholas DiPiazza created TIKA-2805: --- Summary: Should the HTML parser by default just ignore the section? Key: TIKA-2805 URL: https://issues.apache.org/jira/browse/TIKA-2805 Project: Tika

[jira] [Updated] (TIKA-2805) Should the HTML parser by default just ignore the section?

2019-01-05 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-2805: Description: Tika's HTML parser will take this: {code:java} You may be trying to access

[jira] [Comment Edited] (TIKA-2224) Mime magic for OneNote formats

2019-01-14 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742717#comment-16742717 ] Nicholas DiPiazza edited comment on TIKA-2224 at 1/15/19 3:34 AM: --

[jira] [Updated] (TIKA-2224) Mime magic for OneNote formats

2019-01-14 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-2224: Attachment: Sample1.json > Mime magic for OneNote formats > -- >

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-14 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742717#comment-16742717 ] Nicholas DiPiazza commented on TIKA-2224: - Where are we at with this? There is a

[jira] [Comment Edited] (TIKA-2224) Mime magic for OneNote formats

2019-01-14 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742717#comment-16742717 ] Nicholas DiPiazza edited comment on TIKA-2224 at 1/15/19 3:46 AM: --

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743338#comment-16743338 ] Nicholas DiPiazza commented on TIKA-2224: - well we can use the c++ executable, or

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743415#comment-16743415 ] Nicholas DiPiazza commented on TIKA-2224: - i'm not concerned about the external ca

[jira] [Comment Edited] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743415#comment-16743415 ] Nicholas DiPiazza edited comment on TIKA-2224 at 1/15/19 9:37 PM: --

[jira] [Commented] (TIKA-2575) Provide a way to abort tika parses when tika input stream buffer grows passed a certain threshold

2019-05-29 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850870#comment-16850870 ] Nicholas DiPiazza commented on TIKA-2575: - Hi [~talli...@apache.org] i created a p

[jira] [Comment Edited] (TIKA-2575) Provide a way to abort tika parses when tika input stream buffer grows passed a certain threshold

2019-05-29 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850870#comment-16850870 ] Nicholas DiPiazza edited comment on TIKA-2575 at 5/29/19 2:19 PM: --

[jira] [Commented] (TIKA-2575) Provide a way to abort tika parses when tika input stream buffer grows passed a certain threshold

2019-05-29 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851293#comment-16851293 ] Nicholas DiPiazza commented on TIKA-2575: - i wasn't aware of it. i'll take a looks

[jira] [Commented] (TIKA-2575) Provide a way to abort tika parses when tika input stream buffer grows passed a certain threshold

2019-07-18 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888440#comment-16888440 ] Nicholas DiPiazza commented on TIKA-2575: - Hey [~talli...@apache.org] I ended up g

[jira] [Comment Edited] (TIKA-2575) Provide a way to abort tika parses when tika input stream buffer grows passed a certain threshold

2019-07-18 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888440#comment-16888440 ] Nicholas DiPiazza edited comment on TIKA-2575 at 7/19/19 1:53 AM: --

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-11-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976517#comment-16976517 ] Nicholas DiPiazza commented on TIKA-2224: - [~tallison] hi. I converted the project

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-11-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976518#comment-16976518 ] Nicholas DiPiazza commented on TIKA-2224: - This jira is about "Mime magic for onen

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-11-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976586#comment-16976586 ] Nicholas DiPiazza commented on TIKA-2224: - [~nick] ok let's use this jira then. i

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-11-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976688#comment-16976688 ] Nicholas DiPiazza commented on TIKA-2224: - thanks > OneNote formats support - Mim

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-11-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16981144#comment-16981144 ] Nicholas DiPiazza commented on TIKA-2224: - Dear watchers of this issue: I am wor

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-12-04 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988049#comment-16988049 ] Nicholas DiPiazza commented on TIKA-2224: - OK i've got it working now. it's parsin

[jira] [Comment Edited] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-12-04 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988049#comment-16988049 ] Nicholas DiPiazza edited comment on TIKA-2224 at 12/4/19 6:09 PM: --

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-12-07 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990558#comment-16990558 ] Nicholas DiPiazza commented on TIKA-2224: - [~tallison] and team here is the pull r

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-12-10 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993145#comment-16993145 ] Nicholas DiPiazza commented on TIKA-2224: - sounds good! i'll start digging for tho

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-12-12 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994694#comment-16994694 ] Nicholas DiPiazza commented on TIKA-2224: - ok i added that and tim merged it. also

[jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser

2019-12-12 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994738#comment-16994738 ] Nicholas DiPiazza commented on TIKA-2224: - sure I'll do that later today. > OneN

[jira] [Created] (TIKA-3077) OneNote parser - very inefficient when parsing OneNote <= 2007 files

2020-03-24 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-3077: --- Summary: OneNote parser - very inefficient when parsing OneNote <= 2007 files Key: TIKA-3077 URL: https://issues.apache.org/jira/browse/TIKA-3077 Project: Tika

[jira] [Issue Comment Deleted] (TIKA-3077) OneNote parser - very inefficient when parsing OneNote <= 2007 files

2020-03-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3077: Comment: was deleted (was: addressing this in https://github.com/apache/tika/pull/314) > On

[jira] [Commented] (TIKA-3077) OneNote parser - very inefficient when parsing OneNote <= 2007 files

2020-03-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066318#comment-17066318 ] Nicholas DiPiazza commented on TIKA-3077: - addressing this in https://github.com/a

[jira] [Created] (TIKA-3125) rmeta/text and unpack - the __DATA__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-3125: --- Summary: rmeta/text and unpack - the __DATA__ file and X-TIKA:content differ by some leading new line characters Key: TIKA-3125 URL: https://issues.apache.org/jira/browse/TI

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __DATA__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Attachment: test-ooxml.docx > rmeta/text and unpack - the __DATA__ file and X-TIKA:content d

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __TEXT__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Summary: rmeta/text and unpack - the __TEXT__ file and X-TIKA:content differ by some leading

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __TEXT__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Description: Using the attached docx file, when I parse it with {{/unpack}} Endpoint I get

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __TEXT__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Description: Using the attached docx file, when I parse it with {{/unpack}} Endpoint I get

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __DATA__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Description: Using the attached docx file, when I parse it with {{/unpack}} Endpoint I get

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __TEXT__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Description: Using the attached docx file, when I parse it with {{/unpack}} Endpoint I get

[jira] [Updated] (TIKA-3125) rmeta/text and unpack - the __TEXT__ file and X-TIKA:content differ by some leading new line characters

2020-06-27 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3125: Description: Using the attached docx file, when I parse it with {{/unpack}} Endpoint I get

[jira] [Commented] (TIKA-3126) Consider new endpoint (metadata + content non recursive)

2020-07-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154597#comment-17154597 ] Nicholas DiPiazza commented on TIKA-3126: - [~tallison] correct. that matches my in

[jira] [Comment Edited] (TIKA-3126) Consider new endpoint (metadata + content non recursive)

2020-07-09 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154597#comment-17154597 ] Nicholas DiPiazza edited comment on TIKA-3126 at 7/9/20, 2:15 PM: --

[jira] [Created] (TIKA-3129) Tika server - track a "last parsed on" timestamp and provide an endpoint to get it

2020-07-09 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-3129: --- Summary: Tika server - track a "last parsed on" timestamp and provide an endpoint to get it Key: TIKA-3129 URL: https://issues.apache.org/jira/browse/TIKA-3129

[jira] [Updated] (TIKA-3133) /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources

2020-07-14 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3133: Issue Type: Improvement (was: Wish) > /rmeta endpoint should not hard code writeLimit and m

[jira] [Created] (TIKA-3133) /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources

2020-07-14 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-3133: --- Summary: /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources Key: TIKA-3133 URL: https://issues.apache.org/jira/browse/TIKA-3133 Project: T

[jira] [Commented] (TIKA-3129) Tika server - track a "last parsed on" timestamp and provide an endpoint to get it

2020-08-11 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175529#comment-17175529 ] Nicholas DiPiazza commented on TIKA-3129: - [~tallison] I will be testing out this

[jira] [Created] (TIKA-3173) Tika server with spawnChild - server does not recovery from OOM until an additional file comes in

2020-08-17 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-3173: --- Summary: Tika server with spawnChild - server does not recovery from OOM until an additional file comes in Key: TIKA-3173 URL: https://issues.apache.org/jira/browse/TIKA-317

[jira] [Updated] (TIKA-3173) Tika server with spawnChild - server does not recover from OOM until an additional file comes in

2020-08-17 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-3173: Summary: Tika server with spawnChild - server does not recover from OOM until an additional

[jira] [Commented] (TIKA-3173) Tika server with spawnChild - server does not recover from OOM until an additional file comes in

2020-08-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180097#comment-17180097 ] Nicholas DiPiazza commented on TIKA-3173: - Between 54:36 and 57:00 the client side

[jira] [Comment Edited] (TIKA-3173) Tika server with spawnChild - server does not recover from OOM until an additional file comes in

2020-08-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180097#comment-17180097 ] Nicholas DiPiazza edited comment on TIKA-3173 at 8/18/20, 8:54 PM: -

[jira] [Comment Edited] (TIKA-3173) Tika server with spawnChild - server does not recover from OOM until an additional file comes in

2020-08-18 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180097#comment-17180097 ] Nicholas DiPiazza edited comment on TIKA-3173 at 8/18/20, 8:56 PM: -
