kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2221002267
Will do. Thank you for the help.
With the above `commons-io` suggestion, everything looks good now. Will be
doing more testing.
--
This is an automated message from the Apach
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2219959205
This is really getting off-topic, please post to the tika users mailing list
(don't forget to subscribe)
https://lists.apache.org/list.html?u...@tika.apache.org see bottom left
or
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2219468516
It worked for me with small changes because your code isn't runnable:
```
Path input = Paths.get("samplepptx.pptx");
Writer writer = new OutputStreamWriter(Syste
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2218995629
Thank you. That worked, but I bumped into a new issue after working
through a few other hiccups.
I am trying to parse a ppt file.
```
import org.apache.tika.io.TikaInput
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2216338193
Try `TikaCoreProperties` instead.
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2216056172
I had to revert to `3.0.0-BETA` instead of `3.0.0-SNAPSHOT` due to
dependencies in our code.
I am running into this issue when I use the `BETA` version.
```
[2024-07-08T22
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2215277124
Thank you @tballison. This worked. However, we have a lot of dependencies
that require the version to be a release.
Any idea when the new Tika version will be released? Just so that we can put it in
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2214741939
It's not in maven central. Add this to your pom.xml
```
id1
https://repository.apache.org/snapshots/
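For reference, a fuller sketch of what such a `pom.xml` addition might look like, assuming the standard Maven `<repositories>` layout. The `id1` value and the URL come from the snippet above; the `<releases>`/`<snapshots>` toggles and the `tika-core` dependency are assumptions added for illustration, not necessarily what was originally posted.

```xml
<!-- Sketch only: goes inside the <project> element of pom.xml -->
<repositories>
  <repository>
    <!-- id and url as given in the comment above -->
    <id>id1</id>
    <url>https://repository.apache.org/snapshots/</url>
    <!-- Assumption: resolve only snapshot builds from this repository -->
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>

<dependencies>
  <!-- Assumption: the artifact and version discussed elsewhere in this thread -->
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>3.0.0-SNAPSHOT</version>
  </dependency>
</dependencies>
```

Without a snapshot repository entry, Maven searches only Maven Central, which matches the "Could not find artifact" errors reported in this thread.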
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2214690992
But which repo should I point to in the `pom.xml` file?
I tried using `3.0.0-SNAPSHOT` or `3.0.0` in the `pom.xml` file, but Maven can't find it.
```
Could not find artifact org.apache.tika:tika-
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2212942831
There have been plans to do another alpha soon. Snapshots are here:
https://repository.apache.org/content/groups/snapshots/org/apache/tika/tika-app/3.0.0-SNAPSHOT/
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2212556752
Thank you @THausherr.
These errors are occurring for a bad PDF file. Even with these errors
being ignored and repaired, we are able to process it fine now.
Any idea when we plan t
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2211609822
There has been a complaint about the NISC18030.ttf font in the past:
PDFBOX-5743, and I can see in the browser that I searched for it and found it
at https://github.com/justrajdeep/fonts/b
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2211353100
Yeah, I see. There are more of these. Since an `IOException` is thrown, we are
failing. Is there anything I can do to avoid these errors? Why are they
occurring?
```
23:32:5
THausherr commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207940611
Please look at the rest of the log output. IIRC this is a problem with
`lastresortfont.otf` when the initial scanning is done. But that font is
skipped and life continues.
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207526085
I was able to use Tika `3.0.0-BETA`, and `pdfbox` is at `3.0.2`.
I am seeing this issue. Any ideas? Am I missing anything?
```
java.io.IOException: Invalid character cod
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207307712
I tried using `3.0.0-SNAPSHOT` in the `pom.xml` file, but Maven can't find it.
```
Could not find artifact org.apache.tika:tika-core:jar:3.0.0 in central
(https://repo1.maven.org/maven2/)
tballison commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207177307
If you can pull from the Apache snapshots repo, you can grab it from there?
https://repository.apache.org/content/groups/snapshots/org/apache/tika/tika-parsers-standard-package/3.0.0
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207164473
Sorry, I cannot use 3.0.0-BETA, I can only use 2.9.2.
```
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers-standard-package</artifactId>
  <version>2.9.2</version>
</dependency>
```
Is there a way to use l
tballison commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207132252
Yes, it should have been in 3.0.0-BETA. How are you using it?
kbachuHighSpot commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2207113861
This is great. Thanks for working on this.
Is this released with the 3.0.0 that's in beta? Because it's listed here:
https://tika.apache.org/3.0.0-BETA/index.html
We are blocked o
tballison commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2098906252
I just asked on our dev list. I'd like to get 3.x out soon. We need a beta2
release, though, I think.
dsvensson commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2098585131
@tballison Will this be backported to Tika 2.x, or if not, how far off is
Tika 3.x?
danielstravito commented on PR #1473:
URL: https://github.com/apache/tika/pull/1473#issuecomment-2098582675
@tballison Will this be backported to Tika 2.x, or if not, how far off is
Tika 3.x?
tballison merged PR #1473:
URL: https://github.com/apache/tika/pull/1473
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org