uschindler opened a new pull request #318:
URL: https://github.com/apache/tika/pull/318
This hides all warnings caused by commons-io not used in all modules (cf.
new mojo parameter).
See
https://github.com/policeman-tools/forbidden-apis/wiki/Changes#version-30-released-2020-04-27
jusu opened a new pull request #319:
URL: https://github.com/apache/tika/pull/319
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to th
jusu commented on pull request #319:
URL: https://github.com/apache/tika/pull/319#issuecomment-620215553
Ignore this
KranthiGV commented on a change in pull request #317:
URL: https://github.com/apache/tika/pull/317#discussion_r434687071
##
File path:
tika-parsers/src/main/java/org/apache/tika/parser/csv/TextAndCSVParser.java
##
@@ -306,7 +306,6 @@ private CSVParams getOverride(Metadata meta
pszemus opened a new pull request #320:
URL: https://github.com/apache/tika/pull/320
deathy opened a new pull request #321:
URL: https://github.com/apache/tika/pull/321
adds handling of superscript/subscript in Word parsers as described in
TIKA-3008
matthewford opened a new pull request #322:
URL: https://github.com/apache/tika/pull/322
The auto option exists but is not documented
tballison merged pull request #322:
URL: https://github.com/apache/tika/pull/322
tballison merged pull request #278:
URL: https://github.com/apache/tika/pull/278
tballison commented on pull request #278:
URL: https://github.com/apache/tika/pull/278#issuecomment-644886130
@makepanic I'm sorry this took forever. We had to do some unpleasant
shimming to upgrade drewnoakes' metadata extractor. We've done this now, and
this _should_ just work now. TH
tballison merged pull request #320:
URL: https://github.com/apache/tika/pull/320
tballison merged pull request #276:
URL: https://github.com/apache/tika/pull/276
tballison merged pull request #272:
URL: https://github.com/apache/tika/pull/272
nddipiazza opened a new pull request #323:
URL: https://github.com/apache/tika/pull/323
parameters so that this can be customized.
nddipiazza commented on pull request #315:
URL: https://github.com/apache/tika/pull/315#issuecomment-653789371
@lewismc @tballison What do you think about swagger?
I want to take what Lewis did here and introduce swagger-annotations +
swagger-jaxrs. This would remove the need for the o
nddipiazza edited a comment on pull request #315:
URL: https://github.com/apache/tika/pull/315#issuecomment-653789371
@lewismc @tballison What do you think about swagger?
I want to take what Lewis did here and introduce swagger-annotations +
swagger-jaxrs. This would remove the need fo
nddipiazza edited a comment on pull request #315:
URL: https://github.com/apache/tika/pull/315#issuecomment-653789371
@lewismc @tballison What do you think about swagger?
I want to take what Lewis did here and put the documentation within
swagger-annotations + swagger-jaxrs. This would
lewismc commented on pull request #315:
URL: https://github.com/apache/tika/pull/315#issuecomment-653809226
Hi Nicholas, this work is nearly completed. We will update within the week.
We can review then... thank you for your interest.
On Sat, Jul 4, 2020 at 10:04 Nicholas DiPia
nddipiazza commented on pull request #315:
URL: https://github.com/apache/tika/pull/315#issuecomment-653914555
@lewismc cool! do you mean the openapi yaml work you have in this PR? or do
you mean swagger implementation?
Thi
lewismc commented on pull request #315:
URL: https://github.com/apache/tika/pull/315#issuecomment-653934091
Both the OpenAPI and the implementation.
We will be delivering the jaxrs generated project with the existing tika
server implementation ported over.
On Sun, Jul 5, 2020 at
nddipiazza commented on pull request #323:
URL: https://github.com/apache/tika/pull/323#issuecomment-655851164
@tballison just dropping you a ping to see if you get a chance to review
this one.
michaelwda opened a new pull request #324:
URL: https://github.com/apache/tika/pull/324
See https://issues.apache.org/jira/browse/TIKA-1570
Add a stop method that will shutdown the watchdog process and terminate the
JVM. This is useful for Apache Commons Daemon, allowing a user to de
clarkperkins opened a new pull request #325:
URL: https://github.com/apache/tika/pull/325
…olerance to match PDFBox defaults
nddipiazza closed pull request #323:
URL: https://github.com/apache/tika/pull/323
nddipiazza commented on pull request #323:
URL: https://github.com/apache/tika/pull/323#issuecomment-658233475
closing - re-opening in a new jira specifically for adding these two headers
TIKA-3133
nddipiazza opened a new pull request #326:
URL: https://github.com/apache/tika/pull/326
see https://issues.apache.org/jira/browse/TIKA-3133
and https://issues.apache.org/jira/browse/TIKA-3126
this will add new parameters to `rmeta` rest endpoint
`writeLimit` - max number of
nddipiazza commented on pull request #326:
URL: https://github.com/apache/tika/pull/326#issuecomment-658236941
@tballison I think this can be merged now. I disassociated it from TIKA-3126
so that this PR can be 100% focused on not hard coding those values.
tothd91 opened a new pull request #327:
URL: https://github.com/apache/tika/pull/327
tballison merged pull request #326:
URL: https://github.com/apache/tika/pull/326
tballison commented on pull request #327:
URL: https://github.com/apache/tika/pull/327#issuecomment-658905767
@tothd91 thank you for opening this! It looks like there are quite a few
changes that are white-space only. Would it be possible to update so that the
diff includes only logic di
tballison commented on pull request #325:
URL: https://github.com/apache/tika/pull/325#issuecomment-658951531
Thank you @clarkperkins !
tballison merged pull request #325:
URL: https://github.com/apache/tika/pull/325
jendabenda opened a new pull request #328:
URL: https://github.com/apache/tika/pull/328
tballison merged pull request #328:
URL: https://github.com/apache/tika/pull/328
tballison opened a new pull request #329:
URL: https://github.com/apache/tika/pull/329
This should work once TIKA-3137 is merged
tballison merged pull request #329:
URL: https://github.com/apache/tika/pull/329
PeterAlfredLee opened a new pull request #330:
URL: https://github.com/apache/tika/pull/330
The URLs of some bin files were updated.
PeterAlfredLee opened a new pull request #331:
URL: https://github.com/apache/tika/pull/331
Some tests' assertions expect the language to be English (e.g.
org.apache.tika.parser.sas.SAS7BDAParserTest); these tests will fail when the
JVM's default language is not en.
This is a fix to set jvm's default la
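
To illustrate the kind of failure described here, the following is a minimal sketch and not code from the PR or from the Tika test suite. The class `LocaleSensitiveFormat` and the method `shortMonth` are hypothetical names. It only shows that a formatted short month name depends on the locale, which is why an assertion written against English output can fail under another default language.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical illustration; not part of Tika.
class LocaleSensitiveFormat {

    // Formats the month of the given date as a short name in the given locale.
    static String shortMonth(GregorianCalendar date, Locale locale) {
        SimpleDateFormat fmt = new SimpleDateFormat("MMM", locale);
        // Use the calendar's own time zone so the month cannot shift.
        fmt.setTimeZone(date.getTimeZone());
        return fmt.format(date.getTime());
    }

    public static void main(String[] args) {
        GregorianCalendar may = new GregorianCalendar(2020, Calendar.MAY, 15);
        // The same date yields different strings under different locales.
        System.out.println(shortMonth(may, Locale.ENGLISH));
        System.out.println(shortMonth(may, Locale.GERMAN));
        // What a test sees when no locale is passed explicitly.
        System.out.println(shortMonth(may, Locale.getDefault()));
    }
}
```

A test that compares against a fixed English string is only stable if the locale is pinned, either for the whole JVM or by deriving the expected value from the locale in use.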
PeterAlfredLee opened a new pull request #332:
URL: https://github.com/apache/tika/pull/332
The test case `org.apache.tika.image.HeifParserTest.testSimple` failed on Windows
because `TemporaryResources.close()` sometimes fails to delete the tmp file.
We can make it `deleteOnExit` as it's only
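
The fallback idea mentioned here can be sketched as follows. This is an assumption-laden illustration, not the change made in the PR and not the `TemporaryResources` implementation: the class `DeleteWithFallback` and the method `deleteOrDefer` are hypothetical, and the log line stands in for whatever logging the project actually uses.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper; not part of Tika.
class DeleteWithFallback {

    // Tries to delete the file immediately. If that fails (for example because
    // another handle still holds the file open on Windows), the failure is
    // reported and the file is scheduled for deletion when the JVM exits.
    // Returns true only if the file was deleted right away.
    static boolean deleteOrDefer(File file) {
        try {
            Files.delete(file.toPath());
            return true;
        } catch (IOException e) {
            // Report the failure so the underlying cause can still be found.
            System.err.println("Could not delete " + file + ", deferring to JVM exit: " + e);
            file.deleteOnExit();
            return false;
        }
    }
}
```

Deferring the deletion keeps a cleanup failure from stopping the process, while the message keeps the failure visible.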
PeterAlfredLee opened a new pull request #333:
URL: https://github.com/apache/tika/pull/333
Adds github action CI builds on Ubuntu
THausherr commented on pull request #332:
URL: https://github.com/apache/tika/pull/332#issuecomment-660511155
Isn't this moot?
https://issues.apache.org/jira/browse/TIKA-3135
tothd91 commented on pull request #327:
URL: https://github.com/apache/tika/pull/327#issuecomment-660851533
Hello @tballison, I did it. I hope it's ok now.
PeterAlfredLee opened a new pull request #334:
URL: https://github.com/apache/tika/pull/334
Trying to fix TIKA-3141 with an empty string check in `TikaConfig`
PeterAlfredLee commented on pull request #332:
URL: https://github.com/apache/tika/pull/332#issuecomment-666331912
Hi @THausherr , sorry for the late reply.
I think the fix in
[TIKA-3135](https://issues.apache.org/jira/browse/TIKA-3135) is trying to avoid
occupying the file, therefore w
keithrbennett commented on a change in pull request #334:
URL: https://github.com/apache/tika/pull/334#discussion_r463103079
##
File path: tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
##
@@ -249,11 +249,11 @@ public TikaConfig(ClassLoader loader)
public T
THausherr commented on pull request #332:
URL: https://github.com/apache/tika/pull/332#issuecomment-04910
I agree that it shouldn't stop the process. Suggestion: output a log
message, because the cause is usually a programming oversight, so that it can
be reported and fixed.
THausherr edited a comment on pull request #332:
URL: https://github.com/apache/tika/pull/332#issuecomment-04910
I agree that it shouldn't stop the process. Suggestion: also output a log
message, because the cause is usually a programming oversight, so that it can
be reported and fixed
PeterAlfredLee commented on a change in pull request #334:
URL: https://github.com/apache/tika/pull/334#discussion_r463905876
##
File path: tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
##
@@ -249,11 +249,11 @@ public TikaConfig(ClassLoader loader)
public
PeterAlfredLee commented on pull request #332:
URL: https://github.com/apache/tika/pull/332#issuecomment-667450631
> Suggestion: also output a log message, because the cause is usually a
programming oversight, so that it can be reported and fixed.
Just pushed the logging part. :)
PeterAlfredLee commented on a change in pull request #334:
URL: https://github.com/apache/tika/pull/334#discussion_r463906635
##
File path: tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
##
@@ -249,11 +249,11 @@ public TikaConfig(ClassLoader loader)
public
keithrbennett commented on a change in pull request #334:
URL: https://github.com/apache/tika/pull/334#discussion_r463971330
##
File path: tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
##
@@ -249,11 +249,11 @@ public TikaConfig(ClassLoader loader)
public T
jendabenda opened a new pull request #335:
URL: https://github.com/apache/tika/pull/335
PeterAlfredLee commented on a change in pull request #334:
URL: https://github.com/apache/tika/pull/334#discussion_r464218911
##
File path: tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
##
@@ -249,11 +249,11 @@ public TikaConfig(ClassLoader loader)
public
PeterAlfredLee opened a new pull request #336:
URL: https://github.com/apache/tika/pull/336
According to these web pages: [Windows-1252 Character
list](https://www.fileformat.info/info/charset/windows-1252/list.htm),
[ISO-8859-1 Character
list](http://www.fileformat.info/info/charset/I
JoaoGFarias opened a new pull request #337:
URL: https://github.com/apache/tika/pull/337
prooblem => problem
kkrugler merged pull request #337:
URL: https://github.com/apache/tika/pull/337
kkrugler commented on pull request #337:
URL: https://github.com/apache/tika/pull/337#issuecomment-671978803
Thanks João!
PeterAlfredLee opened a new pull request #338:
URL: https://github.com/apache/tika/pull/338
It seems we can use `charsetdetector.StandardHtmlEncodingDetector` for charset
detection of HTML. I'm wondering why we are not using it?
And I stopped treating ISO-8859-1 as Windows-1252.
tballison commented on pull request #338:
URL: https://github.com/apache/tika/pull/338#issuecomment-673508020
Inertia... I never got around to doing a bakeoff between the two, and,
unless there's evidence of improvement, I'm hesitant to make the change as the
default detector.
PeterAlfredLee commented on pull request #338:
URL: https://github.com/apache/tika/pull/338#issuecomment-673823280
Like [TIKA-2421](https://issues.apache.org/jira/browse/TIKA-2421) says ,
according to [w3
description](https://www.w3.org/International/questions/qa-html-encoding-declaration
PeterAlfredLee opened a new pull request #339:
URL: https://github.com/apache/tika/pull/339
[TIKA-2001](https://issues.apache.org/jira/browse/TIKA-2001) requires an XML
parser that can output text and attributes.
This PR aims to provide it.
User can config `TextAndAttributeXMLPa
PeterAlfredLee opened a new pull request #340:
URL: https://github.com/apache/tika/pull/340
1. Fix parsing of the "--client=" argument.
2. Parse the "--compare-file-magic" argument the same way as the others.
tballison opened a new pull request #341:
URL: https://github.com/apache/tika/pull/341
TIKA-3166 Tika 2.0.0 that builds at least locally...
tballison merged pull request #341:
URL: https://github.com/apache/tika/pull/341
PeterAlfredLee opened a new pull request #342:
URL: https://github.com/apache/tika/pull/342
1. Remove unnecessary import.
2. Reset outContent and errContent if they are not empty, to prevent output
left over from a previous TikaCLI.main run.
3. Add two test cases.
4. Modify previous test case, use method
PeterAlfredLee opened a new pull request #343:
URL: https://github.com/apache/tika/pull/343
CTAKESParser should not load via the parser service loader because it will
cause an infinite loop.
If `org.apache.tika.parser.ctakes.CTAKESParser` in file
`org.apache.tika.parser.Parser`:
asfgit merged pull request #343:
URL: https://github.com/apache/tika/pull/343
bobpaulin opened a new pull request #344:
URL: https://github.com/apache/tika/pull/344
PeterAlfredLee opened a new pull request #346:
URL: https://github.com/apache/tika/pull/346
Method `fillSet` in `OPCPackageDetector` can be simpler.
PeterAlfredLee opened a new pull request #345:
URL: https://github.com/apache/tika/pull/345
Use the constants in class `PackageRelationshipTypes`,
just like the `TODO` says.
PeterAlfredLee opened a new pull request #347:
URL: https://github.com/apache/tika/pull/347
We want to know whether the string "reporter" is a key of keyNodes, not whether
null is a key of keyNodes.
So I think this is a typo, and this PR is a fix for it.
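
The difference between the two lookups can be sketched as below. This is a hedged illustration only: it assumes `keyNodes` behaves like a `java.util.Map` keyed by strings, which this message does not confirm, and the class `KeyLookup` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; the real type of keyNodes is an assumption.
class KeyLookup {

    // Suspected typo: asks whether null is a key, which says nothing about "reporter".
    static boolean hasReporterBuggy(Map<String, ?> keyNodes) {
        return keyNodes.containsKey(null);
    }

    // Intended check: asks whether the string "reporter" is a key.
    static boolean hasReporter(Map<String, ?> keyNodes) {
        return keyNodes.containsKey("reporter");
    }

    public static void main(String[] args) {
        Map<String, String> keyNodes = new HashMap<>();
        keyNodes.put("reporter", "someone");
        System.out.println(hasReporterBuggy(keyNodes));
        System.out.println(hasReporter(keyNodes));
    }
}
```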
PeterAlfredLee opened a new pull request #348:
URL: https://github.com/apache/tika/pull/348
kkrugler commented on a change in pull request #348:
URL: https://github.com/apache/tika/pull/348#discussion_r479656172
##
File path: tika-core/src/main/java/org/apache/tika/io/CountingInputStream.java
##
@@ -56,7 +56,7 @@ public CountingInputStream(InputStream in) {
@Over
PeterAlfredLee commented on a change in pull request #348:
URL: https://github.com/apache/tika/pull/348#discussion_r479852298
##
File path: tika-core/src/main/java/org/apache/tika/io/CountingInputStream.java
##
@@ -56,7 +56,7 @@ public CountingInputStream(InputStream in) {
kkrugler merged pull request #348:
URL: https://github.com/apache/tika/pull/348
kkrugler commented on pull request #348:
URL: https://github.com/apache/tika/pull/348#issuecomment-683516522
Thanks Peter!
tballison opened a new pull request #349:
URL: https://github.com/apache/tika/pull/349
Refactor parser modules for three classes of parsers: basic, extended,
advanced
tballison merged pull request #349:
URL: https://github.com/apache/tika/pull/349
schmitch opened a new pull request #1:
URL: https://github.com/apache/tika-docker/pull/1
this also enables people to specify additional params, which is not possible
in the current form:
https://docs.docker.com/engine/reference/builder/#entrypoint
> The shell form prevents any CMD
PeterAlfredLee opened a new pull request #350:
URL: https://github.com/apache/tika/pull/350
Hi all,
I noticed that the main branch now succeeds when executing the command `mvn clean
install`, but fails in `IDEA`.
After some debugging, I found what's wrong:
1. Component tika-parsers and
PeterAlfredLee commented on pull request #342:
URL: https://github.com/apache/tika/pull/342#issuecomment-686311831
Update: 5. Simplify some code in methods testExtract, testExtractTgz, and
testExtractInlineImages.
tballison commented on pull request #350:
URL: https://github.com/apache/tika/pull/350#issuecomment-686565825
This is a great catch. Thank you!
I'm really frustrated that I didn't catch this locally and also that the ci
didn't catch it because of the snapshot repo.
I had in
tballison commented on pull request #350:
URL: https://github.com/apache/tika/pull/350#issuecomment-686602942
I just pushed a commit that should fix this. I was able to get a clean
build after deleting my local tika repo and running the build offline.
I use Intellij, too, and I have
tballison edited a comment on pull request #350:
URL: https://github.com/apache/tika/pull/350#issuecomment-686602942
I just pushed a commit that should fix this. I was able to get a clean
build after deleting my local tika repo and running the build offline.
I use Intellij, too, and
tballison merged pull request #342:
URL: https://github.com/apache/tika/pull/342
tballison commented on pull request #338:
URL: https://github.com/apache/tika/pull/338#issuecomment-686610377
Wait, it turns out I did get around to doing this study...
https://github.com/tballison/share/blob/main/slides/Tika_charset_detector_study_201909.docx
Let me read it a
PeterAlfredLee commented on pull request #350:
URL: https://github.com/apache/tika/pull/350#issuecomment-686860503
Hi @tballison
> I use Intellij, too, and I have had to run find . -name *.iml -exec rm -rf
{} \; && rm -r .idea/ a number of times during development of Tika 2.0 becaus
tballison commented on pull request #350:
URL: https://github.com/apache/tika/pull/350#issuecomment-687141437
Oooo...fun... thank you!
Is the build working for you now?
PeterAlfredLee closed pull request #350:
URL: https://github.com/apache/tika/pull/350
PeterAlfredLee commented on pull request #350:
URL: https://github.com/apache/tika/pull/350#issuecomment-687511400
Yes, it works well! Thank you!
PeterAlfredLee opened a new pull request #351:
URL: https://github.com/apache/tika/pull/351
1. Modify method tearDown: if deleting the output directory root fails, try to
delete it on exit.
2. Simplify some code using collection.addAll.
3. Simplify some code using "for".
4. Remove unnecessary import.
PeterAlfredLee opened a new pull request #352:
URL: https://github.com/apache/tika/pull/352
As the `TODO` says, we should check the % format again after
https://github.com/epam/parso/issues/28 is fixed.
Modify `table` use regular expression matching because `testXLS`, `testXLSX`
and `testXLS
tballison merged pull request #344:
URL: https://github.com/apache/tika/pull/344
PeterAlfredLee opened a new pull request #353:
URL: https://github.com/apache/tika/pull/353
Use short months in default language to test.
PeterAlfredLee commented on pull request #353:
URL: https://github.com/apache/tika/pull/353#issuecomment-689243546
This is another implementation of the fix in PR
[#331](https://github.com/apache/tika/pull/331), so that we do not need to
modify the JVM args like #331 did.
Only either one o
PeterAlfredLee opened a new pull request #354:
URL: https://github.com/apache/tika/pull/354
The `tika-parsers` module has not been generating a `test-jar` since the recent
commit `a504d7e`, but `tika-server` and `tika-examples` still use it as a
dependency. This would lead to a build failure, and
tballison merged pull request #354:
URL: https://github.com/apache/tika/pull/354