[ https://issues.apache.org/jira/browse/TIKA-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gregor Lang updated TIKA-4464:
------------------------------
    Description: 
When parsing *.pages or *.numbers files the resulting mime-type is always "application/vnd.apple.unknown.13"

There seems to be a todo in *IWork13PackageParser* at line 319, which is probably related.
{code:java}
// Is it the main document?
if (name.equals(IWORK13_MAIN_ENTRY)) {
    // TODO Decode the snappy stream, and check for the Message Type
    // = 2 (TN::SheetArchive), it is a numbers file;
    // = 10000 (TP::DocumentArchive), that's a pages file
    return null;
}
{code}

  was:
When parsing *.pages or *.numbers files the resulting mime-type is always " application/vnd.apple.unknown.13"

There seems to be a todo in *IWork13PackageParser* at line 319, which is probably related.
{code:java}
// Is it the main document?
if (name.equals(IWORK13_MAIN_ENTRY)) {
    // TODO Decode the snappy stream, and check for the Message Type
    // = 2 (TN::SheetArchive), it is a numbers file;
    // = 10000 (TP::DocumentArchive), that's a pages file
    return null;
}
{code}


> Parsing IWork files results in unknown mimetype
> -----------------------------------------------
>
>                 Key: TIKA-4464
>                 URL: https://issues.apache.org/jira/browse/TIKA-4464
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>    Affects Versions: 3.2.1
>            Reporter: Gregor Lang
>            Priority: Minor
>         Attachments: sample-2.pages, sample.key, sample.numbers, sample.pages
>
>
> When parsing *.pages or *.numbers files the resulting mime-type is always
> "application/vnd.apple.unknown.13"
>
> There seems to be a todo in *IWork13PackageParser* at line 319, which is
> probably related.
> {code:java}
> // Is it the main document?
> if (name.equals(IWORK13_MAIN_ENTRY)) {
>     // TODO Decode the snappy stream, and check for the Message Type
>     // = 2 (TN::SheetArchive), it is a numbers file;
>     // = 10000 (TP::DocumentArchive), that's a pages file
>     return null;
> }
> {code}



--
This message was sent by Atlassian Jira (v8.20.10#820010)