[ https://issues.apache.org/jira/browse/TIKA-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983813#comment-13983813 ]
Tim Allison commented on TIKA-1283:
-----------------------------------

[~thaichat04], thank you, as always.

By "thumbnail," I'd also want to include images/icons of documents that are included only for display purposes. For example, the icon image (image1.emf) in test-documents/EmbeddedPDF.docx doesn't have a "relationship"=thumbnail, but I'd want to include that as a thumbnail because it appears as a <v:shape> within a <w:object>.

The point you make about the differences in handling of these by application is right on. Each application links thumbnail images to the underlying data in different ways, and we'll have to go application by application to do this correctly (whether we go with this or TIKA-90).

I'm not held to the original proposal in this issue, and I like the clarity of TIKA-90 quite a bit.

Some other thoughts... the signature I proposed above won't work because a given image can have more than one thumbnail (at least for RTFs) and it misses metadata around the thumbnail image (such as mediaType of the thumbnail).

> Add "thumbnail" as possible metadata item to TikaCoreProperties
> ---------------------------------------------------------------
>
>                 Key: TIKA-1283
>                 URL: https://issues.apache.org/jira/browse/TIKA-1283
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Tim Allison
>            Priority: Minor
>
> TIKA-90 originally requested to add thumbnails to a document's metadata.
> I'd like to have a unified way of determining whether an embedded
> document/resource is a thumbnail or a regular attachment.
> With the changes in TIKA-1223 (ooxml) and TIKA-1010 (rtf), we are now pulling
> out more thumbnails than before.
> I propose adding "tika:thumbnail" to the metadata of each thumbnail image.
> The consumer can then determine what to do with the embedded resource based
> on the metadata.
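
For illustration only, here is a minimal sketch of how a consumer might act on the proposed "tika:thumbnail" key. It assumes the flag is stored as the string "true" (the issue does not fix a value convention) and uses a plain Map in place of Tika's Metadata class. ThumbnailFilter, isThumbnail, attachmentNames and the file name embedded.pdf are hypothetical names, not part of Tika.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical consumer-side helper; not part of Tika.
final class ThumbnailFilter {
    // Key proposed in the issue. The "true" value convention is an assumption.
    static final String THUMBNAIL_KEY = "tika:thumbnail";

    // A plain map stands in for the metadata of one embedded resource.
    static boolean isThumbnail(Map<String, String> metadata) {
        return "true".equalsIgnoreCase(metadata.get(THUMBNAIL_KEY));
    }

    // Returns the names of embedded resources that are NOT flagged as thumbnails,
    // in the iteration order of the input map.
    static List<String> attachmentNames(Map<String, Map<String, String>> embedded) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : embedded.entrySet()) {
            if (!isThumbnail(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> embedded = new LinkedHashMap<>();

        // Display-only icon, flagged with the proposed key.
        Map<String, String> icon = new LinkedHashMap<>();
        icon.put(THUMBNAIL_KEY, "true");
        embedded.put("image1.emf", icon);

        // Regular attachment with no thumbnail flag (hypothetical file name).
        embedded.put("embedded.pdf", new LinkedHashMap<>());

        System.out.println(attachmentNames(embedded));
    }
}
```

A single boolean-style flag like this does not address the two gaps noted in the comment: an image with more than one thumbnail, and per-thumbnail metadata such as mediaType.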