[ https://issues.apache.org/jira/browse/TIKA-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883883#comment-17883883 ]
ASF GitHub Bot commented on TIKA-3637:
--------------------------------------

tballison commented on code in PR #492:
URL: https://github.com/apache/tika/pull/492#discussion_r1771454305


##########
tika-core/src/main/resources/org/apache/tika/parser/external/tika-external-parsers.xml:
##########
@@ -61,4 +61,57 @@
         <match>\s*([A-Za-z0-9/ \(\)]+\S{1})\s+:\s+([A-Za-z0-9\(\)\[\] \:\-\.]+)\s*</match>
      </metadata>
   </parser>
+  <parser>
+    <check>
+      <command>sox --version</command>
+      <error-codes>126,127</error-codes>
+    </check>
+    <command>env FOO=${OUTPUT} sox --info ${INPUT}</command>

Review Comment:
   Why is the `env FOO=${OUTPUT}` command required? Shouldn't `sox --info` write to stdout?


> Adding sox audio tool to external parsers
> -----------------------------------------
>
>          Key: TIKA-3637
>          URL: https://issues.apache.org/jira/browse/TIKA-3637
>      Project: Tika
>   Issue Type: Improvement
>   Components: parser
>     Reporter: Leszek Sliwko
>     Priority: Minor
>
> Sox tool correctly pulls duration from wav files. I haven't seen any tests
> for external parsers anywhere.
> Sample output from sox --info duration-test-3.wav:
> Input File : 'duration-test-3.wav'
> Channels : 1
> Sample Rate : 44100
> Precision : 16-bit
> Duration : 00:00:02.50 = 110298 samples = 187.582 CDDA sectors
> File Size : 221k
> Bit Rate : 706k
> Sample Encoding: 16-bit Signed Integer PCM
>
> Pull request: https://github.com/apache/tika/pull/282

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
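
For reference, here is a rough sketch of how the `Key : Value` lines in the sample `sox --info` report quoted above could be split into fields, assuming the report is read as plain text from stdout as the review comment suggests. The pattern `LINE` and the helpers `parse_sox_info` and `duration_seconds` are hypothetical names made up for this illustration; they are not the `<match>` expressions from the pull request and not part of Tika.

```python
import re

# Sample report copied from the issue description (output of `sox --info`).
SAMPLE = """\
Input File : 'duration-test-3.wav'
Channels : 1
Sample Rate : 44100
Precision : 16-bit
Duration : 00:00:02.50 = 110298 samples = 187.582 CDDA sectors
File Size : 221k
Bit Rate : 706k
Sample Encoding: 16-bit Signed Integer PCM
"""

# Hypothetical pattern (an assumption, not the PR's regex):
# everything before the first colon is the key, the rest is the value.
LINE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_sox_info(text):
    """Return a dict mapping each report key to its raw string value."""
    fields = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields


def duration_seconds(value):
    """Convert the leading HH:MM:SS.ss part of a Duration value to seconds."""
    clock = value.split("=")[0].strip()
    hours, minutes, seconds = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


if __name__ == "__main__":
    info = parse_sox_info(SAMPLE)
    print(info["Sample Rate"])
    print(duration_seconds(info["Duration"]))
```

Splitting on the first colon only keeps the `Duration` value intact even though the timestamp itself contains colons.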