[ https://issues.apache.org/jira/browse/TIKA-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279281#comment-16279281 ]
Ryan Brueske commented on TIKA-2518:
------------------------------------

I can send the warnings to /dev/null to clean up the output. That syntax should be documented at the very least. I would expect the default behavior to be to output no warnings, with a verbose flag or similar switch to enable them. If I am only trying to read an Excel file, why would I want warnings about image formats? Furthermore, the link supplied in the warning output does not describe how to keep the warnings from occurring.

The syntax I used to suppress the warnings is:
{code}
java -jar tika-app-1.16.jar --list-parsers 2> /dev/null
{code}

When sending the warnings to a file:
{code}
java -jar tika-app-1.16.jar --list-parsers 2> error.txt
{code}
the contents are:
{code}
Dec 05, 2017 4:05:37 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
TIFFImageWriter not loaded. tiff files will not be processed
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
J2KImageReader not loaded. JPEG2000 files will not be processed.
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
Dec 05, 2017 4:05:37 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded.
Please provide the jar on your classpath to parse sqlite files.
See tika-parsers/pom.xml for the correct version.
{code}

If optional dependencies are needed to avoid the warnings, it would be helpful to document that. The jai-image-io link listed in the warning documents which GAV to add to a Maven pom.xml. I am not running Maven (or any other build tool); I am merely trying to parse an Excel file.
Also, the last warning, for sqlite-jdbc, says to look in tika-parsers/pom.xml. As I am not building this from source, I don't have any Maven files.

I had found this page: https://wiki.apache.org/tika/Troubleshooting%20Tika#Wrong_Parser_Used which specifies a JVM argument, -Dorg.apache.tika.service.error.warn, which should *enable* warnings. When I pass it in with a value of false, I still get the warnings:
{code}
java -Dorg.apache.tika.service.error.warn=false -jar tika-app-1.16.jar --list-parsers
Dec 05, 2017 4:15:42 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
TIFFImageWriter not loaded. tiff files will not be processed
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
J2KImageReader not loaded. JPEG2000 files will not be processed.
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
Dec 05, 2017 4:15:42 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded.
Please provide the jar on your classpath to parse sqlite files.
See tika-parsers/pom.xml for the correct version.
org.apache.tika.parser.AutoDetectParser (Composite Parser):
...
{code}

My expectation is that every program option outputs only what is relevant. I understand that warning output is sometimes desirable. I don't think it should be the default behavior across multiple commands, nor should users have to work out for themselves that the warnings are written to stderr. There is no documentation on how to turn them off, while the existing documentation only states how to turn them on.
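To illustrate why the `2>` redirection cleans up the output, here is a minimal sketch that does not need Tika at all. The function `fake_cli` is a hypothetical stand-in, not part of tika-app: it assumes only that the tool writes its requested result to stdout and its warnings to stderr, which is what the redirection behavior described above suggests.

```shell
#!/bin/sh
# Hypothetical stand-in for a CLI such as tika-app (an assumption for
# illustration): the requested result goes to stdout, a warning to stderr.
fake_cli() {
    echo "WARNING: optional dependency not loaded" >&2
    echo "org.apache.tika.parser.AutoDetectParser (Composite Parser):"
}

# Discard stderr only; stdout is still captured.
clean_output=$(fake_cli 2> /dev/null)

# Alternatively, keep the warnings in a file instead of discarding them.
log_file=$(mktemp)
kept_output=$(fake_cli 2> "$log_file")
warnings=$(cat "$log_file")
rm -f "$log_file"

printf 'stdout: %s\n' "$clean_output"
printf 'stderr: %s\n' "$warnings"
```

Because the two streams are separate, a script that consumes the stdout of such a command does not need to parse the warnings out of it, provided stderr is redirected or left attached to the terminal.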

> tika app outputs warnings by default
> ------------------------------------
>
> Key: TIKA-2518
> URL: https://issues.apache.org/jira/browse/TIKA-2518
> Project: Tika
> Issue Type: Bug
> Components: app
> Affects Versions: 1.16
> Reporter: Ryan Brueske
>
> upon downloading the latest tika and trying basic commands it spews unwanted
> warnings, which makes parsing output necessary.
> Example 1:
> {code}
> java -jar tika-app-1.16.jar --list-detectors
> Dec 05, 2017 3:16:13 PM org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> TIFFImageWriter not loaded. tiff files will not be processed
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> Dec 05, 2017 3:16:13 PM org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> org.apache.tika.detect.DefaultDetector (Composite Detector):
> org.apache.tika.parser.microsoft.POIFSContainerDetector
> org.apache.tika.parser.pkg.ZipContainerDetector
> org.gagravarr.tika.OggDetector
> org.apache.tika.mime.MimeTypes
> {code}
> Example 2:
> {code}
> java -jar tika-app-1.16.jar --text my.xlsx
> Dec 05, 2017 3:00:22 PM org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> TIFFImageWriter not loaded. tiff files will not be processed
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> Dec 05, 2017 3:00:22 PM org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> INFO As a convenience, TikaCLI has turned on extraction of
> inline images for the PDFParser (TIKA-2374).
> This is not the default option in Tika generally or in tika-server.
> As a convenience, TikaCLI has turned on extraction of
> inline images for the PDFParser (TIKA-2374).
> This is not the default option in Tika generally or in tika-server.
> {code}
> The expected behavior is to return only the requested information. I do not
> see a switch to turn off or control unrequested warnings.
> I can't imagine this is the correct behavior. It is not documented, nor could
> I find why such output exists.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)