You did great, thanks André!

Cheers,
Chris

On 8/6/10 8:13 AM, "André Ricardo" <[email protected]> wrote:

Hello Chris,

Just opened the issue, I hope I did everything ok since it is the first time
I'm opening an issue in JIRA.

Thank you for your answer,
André Ricardo

On Thu, Aug 5, 2010 at 10:19 PM, Mattmann, Chris A (388J) <[email protected]> wrote:

> Hi André,
>
> Yes, please, file an issue in JIRA and point at the mp3 file and the test
> case that failed. Thanks so much!
>
> Cheers,
> Chris
>
> On 8/5/10 8:52 AM, "André Ricardo" <[email protected]> wrote:
>
> Hello,
>
> I was trying some mp3s in Tika coming from Nutch 0.9/1.0 samples and with
> "A corrupt MP3 file that has been truncated half way through the ID3v2 frames"
> returned this:
>
> $ java -jar tika-app-0.7.jar -v -m ~/nutch-0.9/src/plugin/parse-mp3/sample/test.mp3
> Exception in thread "main" org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.mp3.mp3par...@1bf3d87
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:138)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>     at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:169)
>     at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:62)
> Caused by: java.io.IOException: Tried to read 259186 bytes, but only 65526 bytes present
>     at org.apache.tika.parser.mp3.ID3v2Frame.readFully(ID3v2Frame.java:160)
>     at org.apache.tika.parser.mp3.ID3v2Frame.<init>(ID3v2Frame.java:110)
>     at org.apache.tika.parser.mp3.ID3v2Frame.createFrameIfPresent(ID3v2Frame.java:81)
>     at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:128)
>     at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:64)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
>     ... 3 more
>
> Also tried with the latest trunk from github reproducing the problem:
>
> $ java -jar tika-app-0.8-SNAPSHOT.jar -v -m ~/nutch-0.9/src/plugin/parse-mp3/sample/test.mp3
> Exception in thread "main" org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.mp3.mp3par...@e79839
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:169)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:110)
>     at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:193)
>     at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:72)
> Caused by: java.io.IOException: Tried to read 259186 bytes, but only 65526 bytes present
>     at org.apache.tika.parser.mp3.ID3v2Frame.readFully(ID3v2Frame.java:160)
>     at org.apache.tika.parser.mp3.ID3v2Frame.<init>(ID3v2Frame.java:110)
>     at org.apache.tika.parser.mp3.ID3v2Frame.createFrameIfPresent(ID3v2Frame.java:81)
>     at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:133)
>     at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:64)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:163)
>     ... 3 more
>
> The mp3 is here:
>
> http://github.com/apache/nutch/raw/tags/release-1.0/src/plugin/parse-mp3/sample/test.mp3
>
> All the other mp3 samples were parsed well by Tika.
>
> Should I open an issue in Jira? And if so, would you consider this a bug or
> an improvement?
>
> André Ricardo
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
