[jira] [Created] (TIKA-3047) Upgrade to POI 4.1.2

2020-02-14 Thread Tim Allison (Jira)
Tim Allison created TIKA-3047:
Summary: Upgrade to POI 4.1.2
Key: TIKA-3047
URL: https://issues.apache.org/jira/browse/TIKA-3047
Project: Tika
Issue Type: Task
Reporter: Tim Allison

[COMPRESS and Tika/PDFBox/POI] files from bug trackers

2020-02-14 Thread Tim Allison
All, I recently downloaded attachments from the following bug trackers: COMPRESS, TIKA, PDFBox, POI, OpenOffice, LibreOffice, and Ghostscript: http://162.242.228.174/docs/bugtrackers/ I then unpacked/uncompressed all of the packaged/compressed files, so: COMPRESS-115-1.zip is the second fil

[jira] [Updated] (TIKA-3046) Add detection of some open office related formats

2020-02-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison updated TIKA-3046:
Description: Add format detection for .cdr, .bau, .sob, .oxt, .odp, .odb. In unpacking attachments to Li

[jira] [Created] (TIKA-3046) Add detection of some open office related formats

2020-02-14 Thread Tim Allison (Jira)
Tim Allison created TIKA-3046:
Summary: Add detection of some open office related formats
Key: TIKA-3046
URL: https://issues.apache.org/jira/browse/TIKA-3046
Project: Tika
Issue Type: Task

Re: Tika Python not recognizing content.

2020-02-14 Thread Max Franklin
Hi,
Just following up on this. Do you know why my code isn’t working?
Thank you,
Max

> On Feb 10, 2020, at 2:04 PM, Max Franklin wrote:
>
> Hello,
>
> I'm sorry for the inconvenience, but I've been using Tika as part of a
> Python code to extract text from PDFs and convert it into a TXT f

[jira] [Created] (TIKA-3045) Allow users to run custom parsing of xfa and xmp

2020-02-14 Thread Tim Allison (Jira)
Tim Allison created TIKA-3045:
Summary: Allow users to run custom parsing of xfa and xmp
Key: TIKA-3045
URL: https://issues.apache.org/jira/browse/TIKA-3045
Project: Tika
Issue Type: Task

[jira] [Commented] (TIKA-3043) vorbis-java-tika overwrites tika's Parser and Detector in MANIFEST

2020-02-14 Thread CHARUSHEELA BOPARDIKAR (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036912#comment-17036912 ]
CHARUSHEELA BOPARDIKAR commented on TIKA-3043:
Tried adding this exclusion i