[jira] [Commented] (TIKA-3864) Non-ascii UTF-8 characters in fetchKey not working with FileSystemFetcher

2022-10-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612827#comment-17612827 ] Hudson commented on TIKA-3864: UNSTABLE: Integrated in Jenkins build Tika » tika-main-jdk8 #

[jira] [Commented] (TIKA-3864) Non-ascii UTF-8 characters in fetchKey not working with FileSystemFetcher

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612813#comment-17612813 ] Tim Allison commented on TIKA-3864: I added a {{fetchKeyLiteral}} check in case this br

[jira] [Resolved] (TIKA-3864) Non-ascii UTF-8 characters in fetchKey not working with FileSystemFetcher

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3864. Fix Version/s: 2.5.1 Resolution: Fixed > Non-ascii UTF-8 characters in fetchKey not working wit

[jira] [Commented] (TIKA-1735) Unsupported AutoCAD drawing version: AC1027

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612808#comment-17612808 ] Tim Allison commented on TIKA-1735: Standing by! Whenever... > Unsupported AutoCAD dr

[jira] [Commented] (TIKA-1735) Unsupported AutoCAD drawing version: AC1027

2022-10-04 Thread Dan Coldrick (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612801#comment-17612801 ] Dan Coldrick commented on TIKA-1735: [~tallison]  Happy with that, can you give me unt

[jira] [Commented] (TIKA-1735) Unsupported AutoCAD drawing version: AC1027

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612794#comment-17612794 ] Tim Allison commented on TIKA-1735: At this point in the release cycle, I'd be willing

[jira] [Updated] (TIKA-1735) Unsupported AutoCAD drawing version: AC1027

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1735: Issue Type: New Feature (was: Bug) > Unsupported AutoCAD drawing version: AC1027

[jira] [Commented] (TIKA-1735) Unsupported AutoCAD drawing version: AC1027

2022-10-04 Thread Dan Coldrick (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612781#comment-17612781 ] Dan Coldrick commented on TIKA-1735: [~tallison]  Apologies I've been missing for a fe

[jira] [Commented] (TIKA-3864) Non-ascii UTF-8 characters in fetchKey not working with FileSystemFetcher

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612768#comment-17612768 ] Tim Allison commented on TIKA-3864: If there aren't objections from fellow devs or anyo

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612732#comment-17612732 ] Hudson commented on TIKA-3812: SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #8

[jira] [Resolved] (TIKA-3141) LINUX - Tika shouldn't throw an exception for an empty TIKA_CONFIG environment variable value

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3141. Resolution: Fixed > LINUX - Tika shouldn't throw an exception for an empty TIKA_CONFIG environment

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612695#comment-17612695 ] Hudson commented on TIKA-3812: FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #8

[jira] [Commented] (TIKA-3871) Improve checkstyle

2022-10-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612694#comment-17612694 ] Hudson commented on TIKA-3871: FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #8

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612671#comment-17612671 ] Tim Allison commented on TIKA-3812: https://github.com/apache/tika/blob/main/tika-parse

[jira] [Comment Edited] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612671#comment-17612671 ] Tim Allison edited comment on TIKA-3812 at 10/4/22 4:28 PM: Th

[jira] [Comment Edited] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612665#comment-17612665 ] Tim Allison edited comment on TIKA-3812 at 10/4/22 4:09 PM: In

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612667#comment-17612667 ] David Pilato commented on TIKA-3812: I'm totally fine modifying the code on my side to

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612666#comment-17612666 ] Tim Allison commented on TIKA-3812: So, the proposal would be to remove png, jpeg and g

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612665#comment-17612665 ] Tim Allison commented on TIKA-3812: In 1.x, the regular image parsers took precedence o

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612664#comment-17612664 ] Tim Allison commented on TIKA-3812: Y, sorry. That test actually shows that gdal is be

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612656#comment-17612656 ] Tim Allison commented on TIKA-3812: I added two unit tests (thank you for the png!): h

[jira] [Commented] (TIKA-3870) Migrate testcontainers to junit5 where possible

2022-10-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612646#comment-17612646 ] Hudson commented on TIKA-3870: SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #8

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612645#comment-17612645 ] David Pilato commented on TIKA-3812: [~tallison] I always had {{tika-parsers-scientifi

[jira] [Resolved] (TIKA-3871) Improve checkstyle

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3871. Fix Version/s: 2.5.1 Resolution: Fixed > Improve checkstyle

[jira] [Resolved] (TIKA-3870) Migrate testcontainers to junit5 where possible

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3870. Fix Version/s: 2.5.1 Resolution: Fixed As a side effect of this update, I confirmed that tests

[jira] [Comment Edited] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612638#comment-17612638 ] Tim Allison edited comment on TIKA-3812 at 10/4/22 2:51 PM: Ug

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612638#comment-17612638 ] Tim Allison commented on TIKA-3812: Ugh [~dadoonet]...sorry. Are you using tika-scient

[jira] [Created] (TIKA-3871) Improve checkstyle

2022-10-04 Thread Tim Allison (Jira)
Tim Allison created TIKA-3871: Summary: Improve checkstyle Key: TIKA-3871 URL: https://issues.apache.org/jira/browse/TIKA-3871 Project: Tika Issue Type: Task Reporter: Tim Allison

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-10-04 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612617#comment-17612617 ] David Pilato commented on TIKA-3812: I'm still having issues with 2.5.0. Basically my