[
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387863#comment-14387863
]
Hudson commented on TIKA-1423:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #589 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387864#comment-14387864
]
Hudson commented on TIKA-1330:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #589 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387865#comment-14387865
]
Hudson commented on TIKA-1511:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #589 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387826#comment-14387826
]
Tim Allison commented on TIKA-1585:
---
Done. Let me know if it works before we shutdown 99
Tim Allison created TIKA-1588:
-
Summary: Upgrade to PDFBox 1.8.10 when available
Key: TIKA-1588
URL: https://issues.apache.org/jira/browse/TIKA-1588
Project: Tika
Issue Type: Improvement
I just remembered TIKA-1509 and TIKA-1558 -- testing now for blacklist
functionality through TIKA-1509. If that works, I'll back out TIKA-1558.
Tim, I think you should run govdocs from the RC, in case something changes
between your run and the cut.
Tyler
On Mon, Mar 30, 2015 at 10:17 AM, Allison
[
https://issues.apache.org/jira/browse/TIKA-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387394#comment-14387394
]
Hudson commented on TIKA-1330:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #588 (See
[https://b
At the very least, a parser should not hang while processing a corrupted
document. IMHO, cases where parser code hangs should be considered blocker issues.
Personally, I prefer the variant that returns a partial result plus some metadata
saying that document parsing failed somehow. But that can be hard to do.
--
Best regards,
Ko
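The "partial result plus some metadata" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Tika handles it: `BoundedParse`, `Extractor`, and `Result` are hypothetical names, and the extractor is a stand-in for whatever parser call might hang or fail on a corrupted document.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: none of these names come from Tika.
class BoundedParse {

    /** Text extracted so far, plus a flag saying whether extraction finished. */
    static final class Result {
        public final String text;
        public final boolean complete;

        Result(String text, boolean complete) {
            this.text = text;
            this.complete = complete;
        }
    }

    /** Stand-in for a parser call that writes text as it goes. */
    interface Extractor {
        void extract(StringBuffer out) throws Exception;
    }

    /** Runs the extractor with a time limit and keeps whatever was written. */
    static Result run(final Extractor extractor, long timeoutMillis) {
        final StringBuffer out = new StringBuffer(); // thread-safe buffer
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Void> future = pool.submit(new Callable<Void>() {
            @Override
            public Void call() throws Exception {
                extractor.extract(out);
                return null;
            }
        });
        boolean complete = false;
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            complete = true;
        } catch (TimeoutException | ExecutionException e) {
            // Hung or failed: fall through and report the partial text.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            future.cancel(true);  // interrupt the worker if it is still running
            pool.shutdownNow();
        }
        return new Result(out.toString(), complete);
    }
}
```

A caller could then record `complete == false` as metadata next to the partial text. Note that interrupting a thread only helps if the hanging code responds to interruption, which is one reason this is hard to do in general.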
[
https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386963#comment-14386963
]
Hudson commented on TIKA-1581:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #587 (See
[https://b
Thanks for your help in resolving this issue, Tim.
Committed in r1670125; the Jenkins build succeeded.
--
Best regards,
Konstantin Gribov
On Mon, Mar 30, 2015 at 18:56, Allison, Timothy B. wrote:
+1. Thank you, Konstantin!
>
> -Original Message-
> From: Konstantin Gribov [mailto:gros...@gmail.com]
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Gribov resolved TIKA-1587.
-
Resolution: Fixed
> ForkParser::setJavaCommand should take List
> -
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386922#comment-14386922
]
Hudson commented on TIKA-1587:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #586 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386906#comment-14386906
]
Tyler Palsulich edited comment on TIKA-1584 at 3/30/15 4:05 PM:
-
[
https://issues.apache.org/jira/browse/TIKA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386906#comment-14386906
]
Tyler Palsulich commented on TIKA-1584:
---
Yup! The 1.7 release process should start th
[
https://issues.apache.org/jira/browse/TIKA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386903#comment-14386903
]
Tim Allison commented on TIKA-1584:
---
Community voted to cut release candidate from trunk
[
https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386900#comment-14386900
]
Hong-Thai Nguyen commented on TIKA-1581:
And great thanks to [~kkrugler] with many i
[
https://issues.apache.org/jira/browse/TIKA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386899#comment-14386899
]
Rob Tulloh commented on TIKA-1584:
--
Thanks for the quick turn around to fixing this. Expec
+1. Thank you, Konstantin!
-Original Message-
From: Konstantin Gribov [mailto:gros...@gmail.com]
Sent: Monday, March 30, 2015 11:19 AM
To: dev@tika.apache.org
Subject: Re: Broken build because of clirr plugin
I think the simplest way would be to keep the old methods (and mark them
@Deprecated) t
I think the simplest way would be to keep the old methods (and mark them
@Deprecated) to avoid the build failure, and use the new ones internally.
I'll run `mvn verify` before committing this time. Sorry for the inconvenience.
--
Best regards,
Konstantin Gribov
On Mon, Mar 30, 2015 at 18:09, Allison, Timothy B. wrote:
> H
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Gribov reopened TIKA-1587:
-
> ForkParser::setJavaCommand should take List
> ---
How much of an effort would it be to migrate somewhat slowly:
Leave in but deprecate setCommandLine(String ) and String getCommandLine()
Add something like: setCommandLineArr(String[] ) and String[] getCommandLineArr()?
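The slow migration suggested above might look something like the sketch below. It is a minimal illustration only: `CommandHolder` is a hypothetical stand-in rather than the real `org.apache.tika.fork.ForkParser`, the method names are the ones floated in this message rather than Tika's actual API, and the default command is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in class; not the real ForkParser.
class CommandHolder {

    // Illustrative default only.
    private List<String> command =
            new ArrayList<>(Arrays.asList("java", "-Xmx32m"));

    /** Old getter, kept so existing callers and the compatibility check still pass. */
    @Deprecated
    public String getCommandLine() {
        StringBuilder sb = new StringBuilder();
        for (String part : command) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(part);
        }
        return sb.toString();
    }

    /** Old setter: still splits on whitespace, then delegates to the new setter. */
    @Deprecated
    public void setCommandLine(String commandLine) {
        setCommandLineArr(commandLine.trim().split("\\s+"));
    }

    /** New getter: returns a copy of the individual arguments. */
    public String[] getCommandLineArr() {
        return command.toArray(new String[0]);
    }

    /** New setter: each array element is kept as one argument, spaces included. */
    public void setCommandLineArr(String[] parts) {
        command = new ArrayList<>(Arrays.asList(parts));
    }
}
```

Because the old signatures are unchanged, a binary-compatibility check has nothing to flag, while new callers can pass arguments such as `"/opt/my jdk/bin/java"` intact through `setCommandLineArr`.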
-Original Message-
From: Konstantin Gribov [mailto:gros...@gmail.
Hi, folks.
I've broken build (by commit r1670105 for TIKA-1587).
Should I revert this commit and change it to preserve old API or add
exclude to clirr plugin configuration?
--
Best regards,
Konstantin Gribov
Backwards compatibility issue found by clirr on TIKA-1587
[INFO] --- clirr-maven-plugin:2.3:check (default) @ tika-core ---
[ERROR] org.apache.tika.fork.ForkParser: Return type of method 'public
java.lang.String getJavaCommand()' has been changed to java.util.List
[ERROR] org.apache.tika.fork.Fo
[
https://issues.apache.org/jira/browse/TIKA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386765#comment-14386765
]
Hudson commented on TIKA-1584:
--
FAILURE: Integrated in tika-trunk-jdk1.7 #585 (See
[https://b
The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #585)
Status: Failure
Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/585/ to
view the results.
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Gribov resolved TIKA-1587.
-
Resolution: Fixed
Fix Version/s: 1.8
Assignee: Konstantin Gribov
Sorry, I f
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386746#comment-14386746
]
Konstantin Gribov commented on TIKA-1587:
-
LGTM. Committed with integration test tri
All,
I've made the changes that I had hoped to. Grib pdf exclusion remains for any
takers.
Let me know when I should initiate the run against govdocs1 to see if there are
any surprises on that corpus with Tika 1.8.
Best,
Tim
-Original Message-
From: Allison, Timothy B. [
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Gribov reassigned TIKA-1587:
---
Assignee: (was: Konstantin Gribov)
> ForkParser::setJavaCommand should take List
>
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Oleg Oshmyan updated TIKA-1587:
---
Attachment: TIKA-1587.patch
Here’s a patch that changes the existing getter and setter signatures. Is t
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Gribov reassigned TIKA-1587:
---
Assignee: Konstantin Gribov
> ForkParser::setJavaCommand should take List
> --
[
https://issues.apache.org/jira/browse/TIKA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1584.
---
Resolution: Fixed
Fix Version/s: 1.8
r1670095.
Thank you, [~rtulloh], for raising this issue!
[
https://issues.apache.org/jira/browse/TIKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386697#comment-14386697
]
Tim Allison edited comment on TIKA-1512 at 3/30/15 1:54 PM:
Tem
I think this is an open question within Tika. Some parsers prefer one thing
over another. And there are different levels of corruption.
In the two cases where govdocs1 docs might be useful in tests, the hyperlinks
in .doc files do not appear to be "standard", but MSWord opens them without a
[
https://issues.apache.org/jira/browse/TIKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386720#comment-14386720
]
Hudson commented on TIKA-1512:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #584 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386716#comment-14386716
]
Yauheni Salopiy commented on TIKA-1512:
---
Hi [~talli...@mitre.org],
Thank You very mu
Ah. I see.
In general, what is the goal with handling corrupted files? Extract as much
as possible and fail gracefully?
Tyler
On Mar 30, 2015 9:32 AM, "Allison, Timothy B." wrote:
>
> Unfortunately, no. MSOffice fixes the document when I do that.
>
> -Original Message-
> From: Tyler Pa
Unfortunately, no. MSOffice fixes the document when I do that.
-Original Message-
From: Tyler Palsulich [mailto:tpalsul...@gmail.com]
Sent: Monday, March 30, 2015 9:24 AM
To: dev@tika.apache.org
Subject: Re: including refactored docs from govdocs1 in test suite
Can you copy the hyperlin
[
https://issues.apache.org/jira/browse/TIKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386697#comment-14386697
]
Tim Allison commented on TIKA-1512:
---
Temporary fix ignoring tests and excluding test docs
Can you copy the hyperlink into a new doc and change the URL? I have no
idea about including the modified version.
Tyler
On Mar 30, 2015 9:18 AM, "Allison, Timothy B." wrote:
> All,
>
> As part of TIKA-1512, I found that I can delete all of the contents,
> including the metadata, except for on
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386685#comment-14386685
]
Tyler Palsulich commented on TIKA-1587:
---
Thank you for reporting this! It seems like
All,
As part of TIKA-1512, I found that I can delete all of the contents,
including the metadata, except for one hyperlink in two documents from govdocs1
and still get the proper behavior -- fail before fix, work after fix.
These documents are in the public domain.
Is it ok to include th
Oleg Oshmyan created TIKA-1587:
--
Summary: ForkParser::setJavaCommand should take List
Key: TIKA-1587
URL: https://issues.apache.org/jira/browse/TIKA-1587
Project: Tika
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/TIKA-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Oleg Oshmyan updated TIKA-1587:
---
Description: ForkParser::setJavaCommand currently takes a string and splits
it on whitespace. This make
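The description is cut off here, so the following is only a guess at the kind of problem whitespace splitting causes. The sketch assumes the splitting behaves like a plain `split("\\s+")`; `WhitespaceSplitDemo` and the example path are hypothetical and are not taken from Tika's source.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; it only mimics "split the string on whitespace".
class WhitespaceSplitDemo {

    /** Splits a command string into arguments on runs of whitespace. */
    static List<String> splitOnWhitespace(String command) {
        return Arrays.asList(command.trim().split("\\s+"));
    }
}
```

With this approach, `splitOnWhitespace("/opt/my jdk/bin/java -Xmx32m")` yields three tokens (`/opt/my`, `jdk/bin/java`, `-Xmx32m`) instead of the intended two, which is the sort of case a `List`-based setter would avoid.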
[
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386648#comment-14386648
]
Hudson commented on TIKA-1511:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #583 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1511.
---
Resolution: Fixed
r1670069. Removed "provided" in parsers' pom. Happy to revisit this if there
are s
[
https://issues.apache.org/jira/browse/TIKA-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343665#comment-14343665
]
Tim Allison edited comment on TIKA-944 at 3/30/15 11:41 AM:
Some
[
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386562#comment-14386562
]
Tim Allison commented on TIKA-1511:
---
Thank you, [~thetaphi]. I was aware of about half o
Unless there are objections, I'd like these to be resolved before 1.8:
TIKA-1584 -- I'll fix
TIKA-1575 -- Resolved by Konstantin Gribov (thank you!)
TIKA-1512 -- I'll put in a temporary fix so that we don't get IOOBEs, but I'll
leave this open and do some more digging to see if we need to open a