[
https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798538#comment-17798538
]
David Pilato commented on TIKA-3629:
I was looking at this one today. I guess we did n
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629794#comment-17629794
]
David Pilato commented on TIKA-2536:
For future readers, the workaround to depend on T
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627657#comment-17627657
]
David Pilato commented on TIKA-2536:
But wait, it's shaded now??? So I should not have
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627656#comment-17627656
]
David Pilato commented on TIKA-2536:
That's weird... I'm not seeing the same thing...
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627646#comment-17627646
]
David Pilato commented on TIKA-2536:
Ha right! Thanks for pointing this out [~tallison
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627609#comment-17627609
]
David Pilato commented on TIKA-2536:
Hey team
netcdf 4.5.5 depends on cdm 4.5.5 which
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627609#comment-17627609
]
David Pilato edited comment on TIKA-2536 at 11/2/22 10:42 AM:
--
[
https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612949#comment-17612949
]
David Pilato commented on TIKA-3812:
Amazing! That helps!
I definitely want to read t
[
https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612937#comment-17612937
]
David Pilato commented on TIKA-3812:
When excluding {{GDALParser}} from the {{{}Defaul
[
https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612667#comment-17612667
]
David Pilato commented on TIKA-3812:
I'm totally fine modifying the code on my side to
[
https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612645#comment-17612645
]
David Pilato commented on TIKA-3812:
[~tallison] I always had {{tika-parsers-scientifi
[
https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612617#comment-17612617
]
David Pilato commented on TIKA-3812:
I'm still having issues with 2.5.0.
Basically my
[
https://issues.apache.org/jira/browse/TIKA-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610516#comment-17610516
]
David Pilato commented on TIKA-3863:
Amazing! That's a good addition.
> Add a pipes
[
https://issues.apache.org/jira/browse/TIKA-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480437#comment-17480437
]
David Pilato commented on TIKA-3659:
AFAIK it's not part of the FTP Client. I believe
[
https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462833#comment-17462833
]
David Pilato commented on TIKA-3629:
I'm not sure I got it.
Is {{Office.KEYWORDS}
[
https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462555#comment-17462555
]
David Pilato edited comment on TIKA-3629 at 12/20/21, 11:32 AM:
[
https://issues.apache.org/jira/browse/TIKA-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462555#comment-17462555
]
David Pilato commented on TIKA-3629:
Update:
I can see the keywords available in:
*
David Pilato created TIKA-3629:
--
Summary: Keywords are not extracted anymore from PDF documents
Key: TIKA-3629
URL: https://issues.apache.org/jira/browse/TIKA-3629
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1746#comment-1746
]
David Pilato commented on TIKA-3610:
That's very good. So I believe we are all set and
David Pilato created TIKA-3610:
--
Summary: Emit errors to a specific emitter
Key: TIKA-3610
URL: https://issues.apache.org/jira/browse/TIKA-3610
Project: Tika
Issue Type: New Feature
Co
[
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385505#comment-17385505
]
David Pilato commented on TIKA-3493:
{quote}It doesn't look like the RTF specifies a t
[
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385445#comment-17385445
]
David Pilato edited comment on TIKA-3493 at 7/22/21, 11:59 AM:
-
[
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385445#comment-17385445
]
David Pilato commented on TIKA-3493:
I attached a patch which adds a unit test.
It i
[
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3493:
---
Attachment: Test_case_to_demo_the_change_with_Tika_1_x1.patch
> dcterms:created date depends on the cu
[
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3493:
---
Attachment: (was: Test_case_to_demo_the_change_with_Tika_1_x.patch)
> dcterms:created date depends
[
https://issues.apache.org/jira/browse/TIKA-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3493:
---
Attachment: Test_case_to_demo_the_change_with_Tika_1_x.patch
> dcterms:created date depends on the cur
David Pilato created TIKA-3493:
--
Summary: dcterms:created date depends on the current TimeZone in
RTF documents
Key: TIKA-3493
URL: https://issues.apache.org/jira/browse/TIKA-3493
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato resolved TIKA-3224.
Fix Version/s: 1.27
Resolution: Fixed
> Stackoverflow with Embedded PDF in DOCX document
> --
[
https://issues.apache.org/jira/browse/TIKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384332#comment-17384332
]
David Pilato commented on TIKA-3224:
Oh I was confused. PDFBox 2.0.24 is in Tika 1.27.
[
https://issues.apache.org/jira/browse/TIKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384284#comment-17384284
]
David Pilato commented on TIKA-3224:
I just tested Tika 1.27 with PDFBox 2.0.24 and it
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330882#comment-17330882
]
David Pilato edited comment on TIKA-3364 at 4/23/21, 4:05 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330882#comment-17330882
]
David Pilato edited comment on TIKA-3364 at 4/23/21, 4:04 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330882#comment-17330882
]
David Pilato edited comment on TIKA-3364 at 4/23/21, 4:03 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330882#comment-17330882
]
David Pilato edited comment on TIKA-3364 at 4/23/21, 4:03 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330882#comment-17330882
]
David Pilato commented on TIKA-3364:
Oh my god! I'm feeling stupid.
Anyway, I was not
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330824#comment-17330824
]
David Pilato edited comment on TIKA-3364 at 4/23/21, 2:39 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330824#comment-17330824
]
David Pilato commented on TIKA-3364:
So I tried this:
{code:java}
PDFPars
[
https://issues.apache.org/jira/browse/TIKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330824#comment-17330824
]
David Pilato edited comment on TIKA-3364 at 4/23/21, 2:38 PM:
--
David Pilato created TIKA-3364:
--
Summary: PDF Content is extracted twice
Key: TIKA-3364
URL: https://issues.apache.org/jira/browse/TIKA-3364
Project: Tika
Issue Type: Bug
Components: p
[
https://issues.apache.org/jira/browse/TIKA-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259813#comment-17259813
]
David Pilato commented on TIKA-3258:
I really like having {{auto}} as the default mode
[
https://issues.apache.org/jira/browse/TIKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3224:
---
Description:
This issue has been reported by a user on
[discuss.elastic.co|https://discuss.elastic.co
[
https://issues.apache.org/jira/browse/TIKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3224:
---
Attachment: issue-stackoverflow.docx
> Stackoverflow with Embedded PDF in DOCX document
>
David Pilato created TIKA-3224:
--
Summary: Stackoverflow with Embedded PDF in DOCX document
Key: TIKA-3224
URL: https://issues.apache.org/jira/browse/TIKA-3224
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044358#comment-17044358
]
David Pilato edited comment on TIKA-3006 at 2/25/20 11:42 AM:
--
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044358#comment-17044358
]
David Pilato edited comment on TIKA-3006 at 2/25/20 11:41 AM:
--
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044358#comment-17044358
]
David Pilato commented on TIKA-3006:
All good for me.
I noticed that new metadata ar
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044349#comment-17044349
]
David Pilato commented on TIKA-3006:
Well. It does not work as I'd love it seeing work
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043830#comment-17043830
]
David Pilato commented on TIKA-3006:
Is it possible to get a SNAPSHOT version of this?
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035475#comment-17035475
]
David Pilato commented on TIKA-3006:
Could you confirm that it is a regression? As I can
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992068#comment-16992068
]
David Pilato commented on TIKA-3006:
For whatever reason, the external link is replace
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3006:
---
Description:
Hey team.
I have a unit test which is not passing anymore with Tika 1.23. Code is
[h
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3006:
---
Description:
Hey team.
I have a unit test which is not passing anymore with Tika 1.23. Code is
[h
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3006:
---
Attachment: test.pdf
> Regression in PDF keywords extraction since 1.23
>
[
https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-3006:
---
Description:
Hey team.
I have a unit test which is not passing anymore with Tika 1.23. Code is
[h
David Pilato created TIKA-3006:
--
Summary: Regression in PDF keywords extraction since 1.23
Key: TIKA-3006
URL: https://issues.apache.org/jira/browse/TIKA-3006
Project: Tika
Issue Type: Bug
David Pilato created TIKA-2579:
--
Summary: Update to PDFBox 2.0.9
Key: TIKA-2579
URL: https://issues.apache.org/jira/browse/TIKA-2579
Project: Tika
Issue Type: Improvement
Components: p
[
https://issues.apache.org/jira/browse/TIKA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato closed TIKA-2227.
--
Resolution: Not A Problem
> Replacement of MSOffice#KEYWORDS for RTF and ODT docs
> -
[
https://issues.apache.org/jira/browse/TIKA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771026#comment-15771026
]
David Pilato commented on TIKA-2227:
Sorry. Answer is {{TikaCoreProperties.KEYWORDS}}.
David Pilato created TIKA-2227:
--
Summary: Replacement of MSOffice#KEYWORDS for RTF and ODT docs
Key: TIKA-2227
URL: https://issues.apache.org/jira/browse/TIKA-2227
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15758897#comment-15758897
]
David Pilato commented on TIKA-2208:
Adding missing libs
{code}
compile "com.github.
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15758867#comment-15758867
]
David Pilato commented on TIKA-2208:
So we now have a regression in Elasticsearch tests
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15758867#comment-15758867
]
David Pilato edited comment on TIKA-2208 at 12/18/16 1:50 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754624#comment-15754624
]
David Pilato commented on TIKA-2208:
I agree with you on 2). That would give even more
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754595#comment-15754595
]
David Pilato commented on TIKA-2208:
I can confirm that your workaround works perfectly
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754574#comment-15754574
]
David Pilato commented on TIKA-2208:
The original reporter told me that we can reuse it
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754456#comment-15754456
]
David Pilato commented on TIKA-2208:
I did not try with a pure Visio file though.
> Ca
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754454#comment-15754454
]
David Pilato commented on TIKA-2208:
I got this document from the user who reported the
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754338#comment-15754338
]
David Pilato commented on TIKA-2208:
That is correct. Thanks!
> Catch missing libraire
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753790#comment-15753790
]
David Pilato edited comment on TIKA-2208 at 12/16/16 8:17 AM:
--
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753790#comment-15753790
]
David Pilato edited comment on TIKA-2208 at 12/16/16 8:16 AM:
--
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753790#comment-15753790
]
David Pilato commented on TIKA-2208:
So I tried this way.
Basically I declared ``
But
[
https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15748161#comment-15748161
]
David Pilato commented on TIKA-2208:
Looks like a good idea. Let me try it and come bac
David Pilato created TIKA-2208:
--
Summary: Catch missing libraires
Key: TIKA-2208
URL: https://issues.apache.org/jira/browse/TIKA-2208
Project: Tika
Issue Type: Improvement
Components:
[
https://issues.apache.org/jira/browse/TIKA-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Pilato updated TIKA-2030:
---
Attachment: test.docx
test.odt
ODT and DOCX files
> A space is suppressed when parsing
David Pilato created TIKA-2030:
--
Summary: A space is suppressed when parsing Odt file
Key: TIKA-2030
URL: https://issues.apache.org/jira/browse/TIKA-2030
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333470#comment-14333470
]
David Pilato commented on TIKA-1526:
I just ran a test on my machine:
With Tika 1.7 an
[
https://issues.apache.org/jira/browse/TIKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329509#comment-14329509
]
David Pilato commented on TIKA-1557:
Thanks! I'd not qualify it as a bug though. :)
>
[
https://issues.apache.org/jira/browse/TIKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329452#comment-14329452
]
David Pilato commented on TIKA-1555:
Well I could try but for now I did not manage to r
[
https://issues.apache.org/jira/browse/TIKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329446#comment-14329446
]
David Pilato commented on TIKA-1555:
I read the code and it sounds to me like that is t
[
https://issues.apache.org/jira/browse/TIKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329300#comment-14329300
]
David Pilato commented on TIKA-1555:
Thank you Uwe. I don't understand why I was not ab
David Pilato created TIKA-1555:
--
Summary: posix_spawn is not a supported process launch mechanism
on this platform
Key: TIKA-1555
URL: https://issues.apache.org/jira/browse/TIKA-1555
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322524#comment-14322524
]
David Pilato commented on TIKA-1548:
Thanks for fixing this. To answer your question
David Pilato created TIKA-1548:
--
Summary: System property added while catching exception on parsing
PDF encrypted doc
Key: TIKA-1548
URL: https://issues.apache.org/jira/browse/TIKA-1548
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946836#comment-13946836
]
David Pilato commented on TIKA-1165:
Sounds like I never answered your comment! Sham
David Pilato created TIKA-1165:
--
Summary: Autodetect and parse Asciidoc
Key: TIKA-1165
URL: https://issues.apache.org/jira/browse/TIKA-1165
Project: Tika
Issue Type: Wish
Components: l
85 matches