Luca Bentivoglio created TIKA-4257:
--
Summary: Recognition of p7m files
Key: TIKA-4257
URL: https://issues.apache.org/jira/browse/TIKA-4257
Project: Tika
Issue Type: Bug
Components:
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Bentivoglio updated TIKA-4257:
---
Summary: Tika detect recognizes some p7m files as format x-dbf (was:
Riconoscimento file p
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Bentivoglio updated TIKA-4257:
---
Summary: Tika detect() recognizes some p7m files as format x-dbf (was:
Tika detect riconos
THausherr commented on PR #1771:
URL: https://github.com/apache/tika/pull/1771#issuecomment-2119943391
@dependabot rebase
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To
THausherr commented on PR #1768:
URL: https://github.com/apache/tika/pull/1768#issuecomment-2119943796
@dependabot rebase
THausherr commented on PR #1771:
URL: https://github.com/apache/tika/pull/1771#issuecomment-2119943554
@dependabot rebase
THausherr commented on PR #1770:
URL: https://github.com/apache/tika/pull/1770#issuecomment-2119943684
@dependabot rebase
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Bentivoglio updated TIKA-4257:
---
Summary: Tika detect() recognizes some p7m files as format x-dbf (was:
Tika detect() riconosc
THausherr commented on PR #1765:
URL: https://github.com/apache/tika/pull/1765#issuecomment-2119951689
@dependabot rebase
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Bentivoglio updated TIKA-4257:
---
Description:
Tika detect method sometimes recognizes p7m files as format x-dbf.
In the attach
[
https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847827#comment-17847827
]
Hudson commented on TIKA-4166:
--
ABORTED: Integrated in Jenkins build Tika » tika-main-jdk11 #
THausherr merged PR #1765:
URL: https://github.com/apache/tika/pull/1765
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org
THausherr merged PR #1771:
URL: https://github.com/apache/tika/pull/1771
THausherr merged PR #1768:
URL: https://github.com/apache/tika/pull/1768
THausherr merged PR #1770:
URL: https://github.com/apache/tika/pull/1770
nextgens commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120096349
Any chance we could have multi-arch for 2.9.2?
[
https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847856#comment-17847856
]
Hudson commented on TIKA-4166:
--
UNSTABLE: Integrated in Jenkins build Tika » tika-main-jdk11
[
https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847868#comment-17847868
]
Hudson commented on TIKA-4166:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #
tballison merged PR #1762:
URL: https://github.com/apache/tika/pull/1762
[
https://issues.apache.org/jira/browse/TIKA-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847874#comment-17847874
]
ASF GitHub Bot commented on TIKA-4256:
--
tballison merged PR #1762:
URL: https://githu
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120433486
@nextgens, I don't have enough knowledge to move forward on this PR alone.
If there's a simpler way to achieve multi-arch with fewer mods to our current
process, I'd be more than happy
stumpylog commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120440136
I'd be happy to help with reviewing it, but I don't see a simpler
way. Using actions is the best way to achieve this end goal.
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120444104
Let me ping infra at asf to see what we need to do to get this working as an
action. I _think_ that's the blocker for me.
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120452143
Pinged asf infra on credentials and how to do this for an asf project.
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Bentivoglio updated TIKA-4257:
---
Description:
Tika detect method sometimes recognizes p7m files as format application/x-dbf.
I
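A reproduction of the report above could be scripted roughly as follows. This is a minimal sketch, not taken from the issue: the jar path `TIKA_APP_JAR`, the helper names `detect_type` and `check_detection`, and the sample file name are hypothetical, and `application/pkcs7-mime` is assumed to be the type one would expect for a p7m file. It relies on the `--detect` option of tika-app.

```shell
#!/usr/bin/env bash
# Sketch only: TIKA_APP_JAR and the helper names are assumptions, not from the issue.
TIKA_APP_JAR="${TIKA_APP_JAR:-tika-app.jar}"   # hypothetical path to a tika-app jar

# Ask Tika for the detected media type of one file (tika-app's --detect option).
detect_type() {
  java -jar "$TIKA_APP_JAR" --detect "$1"
}

# Print OK or MISMATCH for a file, given the media type we expect for it.
check_detection() {
  local file="$1" expected="$2" actual
  actual="$(detect_type "$file")"
  if [ "$actual" = "$expected" ]; then
    echo "OK $file $actual"
  else
    echo "MISMATCH $file expected=$expected actual=$actual"
    return 1
  fi
}
```

For example, `check_detection sample.p7m application/pkcs7-mime` would print a MISMATCH line and return non-zero if the file were detected as application/x-dbf.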
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120479707
I opened: https://issues.apache.org/jira/browse/TIKA-4258 to track this on
our JIRA. I also opened an issue on infra.
Tim Allison created TIKA-4258:
-
Summary: Multi-arch support for docker images
Key: TIKA-4258
URL: https://issues.apache.org/jira/browse/TIKA-4258
Project: Tika
Issue Type: Task
Report
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847882#comment-17847882
]
Tim Allison commented on TIKA-4258:
---
If fellow devs with better knowledge of github acti
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847883#comment-17847883
]
Tim Allison commented on TIKA-4258:
---
Helpful links from #infra:
https://infra.apache.or
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120501490
It looks like Airflow at least has moved away from github actions and moved
towards a release manager building locally and pushing to dockerhub --
https://cwiki.apache.org/confluence/di
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847884#comment-17847884
]
ASF GitHub Bot commented on TIKA-4258:
--
tballison commented on PR #19:
URL: https://g
nextgens commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120530390
If securing the credentials required for dockerhub is the only concern, I
think using github container registry instead may be a great solution.
https://docs.github.com/en/packages/wo
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847887#comment-17847887
]
ASF GitHub Bot commented on TIKA-4258:
--
nextgens commented on PR #19:
URL: https://gi
[
https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847890#comment-17847890
]
Hudson commented on TIKA-4166:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120574718
How's this for a proposed way forward?
We basically keep our current workflow on the release manager's
laptop/hardware. We modify our build scripts to build a single-arch image, r
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847895#comment-17847895
]
ASF GitHub Bot commented on TIKA-4258:
--
tballison commented on PR #19:
URL: https://g
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120577030
> If securing the credentials required for dockerhub is the only concern, I
think using github container registry instead may be a great solution.
https://docs.github.com/en/packages/wo
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847896#comment-17847896
]
ASF GitHub Bot commented on TIKA-4258:
--
tballison commented on PR #19:
URL: https://g
fpiesche commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120688287
I think building multiarch with buildx requires QEMU, but as long as that's
available on the host doing the builds just running buildx should be perfectly
fine - that's all the github wo
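The buildx-plus-QEMU approach described in this comment can be sketched as below. This is an illustration under stated assumptions, not the project's build script: the builder name `tika-builder`, the default platform list, the `run` helper, and the `DRY_RUN` switch are all hypothetical. With `DRY_RUN` left at its default of 1 the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch only: builder name, platform list and the DRY_RUN switch are assumptions.

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Build one image for several platforms and push it in a single step.
build_multiarch() {
  local tag="$1" platforms="${2:-linux/amd64,linux/arm64}"
  # Register QEMU emulators so foreign-architecture build steps can run on this host.
  run docker run --privileged --rm tonistiigi/binfmt --install all
  # Create and select a builder that can target multiple platforms.
  run docker buildx create --name tika-builder --use
  # Build for every requested platform and push the resulting manifest list.
  run docker buildx build --platform "$platforms" -t "$tag" --push .
}
```

Calling `build_multiarch example/tika:test` prints the three commands; setting `DRY_RUN=0` would execute them, which requires Docker with buildx and push credentials.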
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847905#comment-17847905
]
ASF GitHub Bot commented on TIKA-4258:
--
fpiesche commented on PR #19:
URL: https://gi
[
https://issues.apache.org/jira/browse/TIKA-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847909#comment-17847909
]
Hudson commented on TIKA-4256:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #
tballison opened a new pull request, #21:
URL: https://github.com/apache/tika-docker/pull/21
(no comment)
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120807457
Let's add other registries on a later ticket?
How's this look? https://github.com/apache/tika-docker/pull/21
I haven't tested it.
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847929#comment-17847929
]
ASF GitHub Bot commented on TIKA-4258:
--
tballison commented on PR #19:
URL: https://g
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847931#comment-17847931
]
Tim Allison commented on TIKA-4243:
---
Separately, but related to this and also to TIKA-42
stumpylog commented on PR #21:
URL: https://github.com/apache/tika-docker/pull/21#issuecomment-2120837966
This works for me, except for the pushing obviously. One minor annoyance,
if the build does fail, the builder will still be around, so it will have to be
manually removed before runnin
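One way to avoid a leftover builder after a failed build is to remove it regardless of the build's exit status. The sketch below is an assumption about how that could look, not the change made in the PR: the `with_builder` and `run` helpers and the `DRY_RUN` switch are hypothetical names. With `DRY_RUN` at its default of 1 the docker commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the DRY_RUN switch are assumptions.

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create a named builder, run the given build command, then always remove the builder.
with_builder() {
  local builder="$1" status=0
  shift
  run docker buildx create --name "$builder" --use || return 1
  # Remember a failing exit status instead of aborting before cleanup.
  "$@" || status=$?
  # Remove the builder even when the build step failed.
  run docker buildx rm "$builder"
  return "$status"
}
```

For instance, `with_builder tmp-builder false` still prints the `docker buildx rm tmp-builder` line and returns the failing status of the wrapped command.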
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120845395
Wow...it looks like it actually worked?!
Can you all give this a shot?
https://hub.docker.com/layers/apache/tika/2.9.2-alpha-multi-arch/images/sha256-b8b6e02e3e9f98ddae33b74881f4e
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847937#comment-17847937
]
ASF GitHub Bot commented on TIKA-4258:
--
tballison commented on PR #19:
URL: https://g
tballison commented on PR #21:
URL: https://github.com/apache/tika-docker/pull/21#issuecomment-2120847590
Got it. Thank you @stumpylog !
How's this:
https://hub.docker.com/layers/apache/tika/2.9.2-alpha-multi-arch/images/sha256-b8b6e02e3e9f98ddae33b74881f4ead7846ee12352d53149098857378
tballison commented on PR #21:
URL: https://github.com/apache/tika-docker/pull/21#issuecomment-2120848940
Building `full` now...
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847943#comment-17847943
]
Tim Allison commented on TIKA-4258:
---
And here's the full version:
https://hub.docker.co
hegerdes commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120869924
> Wow...it looks like it actually worked?!
>
> Can you all give this a shot?
https://hub.docker.com/layers/apache/tika/2.9.2-alpha-multi-arch/images/sha256-b8b6e02e3e9f98ddae33b748
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847945#comment-17847945
]
ASF GitHub Bot commented on TIKA-4258:
--
hegerdes commented on PR #19:
URL: https://gi
stumpylog commented on PR #21:
URL: https://github.com/apache/tika-docker/pull/21#issuecomment-2120874896
I pulled the image on a Pi (arm64) and had no problems pulling or starting.
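A check like the one reported here can be approximated with a small script. This is a sketch under assumptions: the helper names `platform_for_machine` and `show_platforms` are hypothetical, and only the two architectures mentioned in this thread's context are mapped. `show_platforms` needs Docker with buildx and network access, so it is defined but not called.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are assumptions; only amd64 and arm64 are mapped.

# Map the output of `uname -m` to the platform string Docker uses.
platform_for_machine() {
  case "$1" in
    x86_64|amd64) echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# List the platforms published under a tag (requires docker buildx; not run here).
show_platforms() {
  docker buildx imagetools inspect "$1"
}
```

On the host under test, `platform_for_machine "$(uname -m)"` gives the platform to look for in the output of `show_platforms apache/tika:2.9.2-alpha-multi-arch`.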
tballison commented on PR #19:
URL: https://github.com/apache/tika-docker/pull/19#issuecomment-2120876458
> I think building multiarch with buildx requires QEMU, but as long as
that's available on the host doing the builds just running buildx should be
perfectly fine - that's all the github
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847947#comment-17847947
]
ASF GitHub Bot commented on TIKA-4258:
--
tballison commented on PR #19:
URL: https://g
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847949#comment-17847949
]
Tim Allison commented on TIKA-4258:
---
Let's give it a day for fellow devs to weigh in. If
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847950#comment-17847950
]
Tim Allison commented on TIKA-4258:
---
I'm sure I'll need to modify the PR when I actually
stumpylog commented on code in PR #21:
URL: https://github.com/apache/tika-docker/pull/21#discussion_r1607058472
##
docker-tool.sh:
##
@@ -17,15 +17,24 @@
# specific language governing permissions and limitations
# under the License.
+stop_and_die() {
+ docker buildx st
stumpylog commented on PR #21:
URL: https://github.com/apache/tika-docker/pull/21#issuecomment-2120927505
Seems good to me. I built it a couple times, and the recent updates make
the builder clean up better
tballison opened a new pull request, #22:
URL: https://github.com/apache/tika-docker/pull/22
(no comment)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-
tballison commented on PR #22:
URL: https://github.com/apache/tika-docker/pull/22#issuecomment-2121108427
It looks like tesseract 5.3.4 has made it into noble and we don't have to
pull from `ppa:alex-p/tesseract-ocr5` any more.
Let me know if there are any objections to this upgrade.
tballison commented on PR #22:
URL: https://github.com/apache/tika-docker/pull/22#issuecomment-2121117348
Looks like `minimal` gets a little smaller and `full` gets a little bigger,
but nothing eye-opening.
tballison opened a new pull request, #1773:
URL: https://github.com/apache/tika/pull/1773
Thanks for your contribution to [Apache Tika](https://tika.apache.org/)!
Your help is appreciated!
Before opening the pull request, please verify that
* there is an open issue on the [
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847972#comment-17847972
]
ASF GitHub Bot commented on TIKA-4257:
--
tballison opened a new pull request, #1773:
U
[
https://issues.apache.org/jira/browse/TIKA-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4256.
---
Fix Version/s: 3.0.0
Resolution: Fixed
> Allow inlining of ocr'd text in container document
> -
[
https://issues.apache.org/jira/browse/TIKA-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847980#comment-17847980
]
Tim Allison commented on TIKA-4255:
---
Thank you for opening this PR. Are you able to add
tballison merged PR #1773:
URL: https://github.com/apache/tika/pull/1773
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847981#comment-17847981
]
ASF GitHub Bot commented on TIKA-4257:
--
tballison merged PR #1773:
URL: https://githu
[
https://issues.apache.org/jira/browse/TIKA-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847996#comment-17847996
]
Hudson commented on TIKA-4257:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #