[ https://issues.apache.org/jira/browse/TIKA-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-1722.
---------------------------------
    Resolution: Fixed
      Assignee: Jukka Zitting

Thanks! Committed in revision 1698100.

My original thinking with these methods was to ensure that there is no difference in how a File or a file:// URL gets processed. I think that's already well covered, so there's not much need for the extra File->URL->File roundtrip.

> Tika methods that accept a File needlessly convert it to a URL
> --------------------------------------------------------------
>
>                 Key: TIKA-1722
>                 URL: https://issues.apache.org/jira/browse/TIKA-1722
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Yaniv Kunda
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.11
>
>         Attachments: TIKA-1722.patch
>
>
> The following methods:
> - Tika.detect(File)
> - Tika.parse(File)
> - Tika.parseToString(File)
> Convert the given File to a URL and use the corresponding overloaded method
> that accepts a URL.
> This seems like a shortcut, but essentially does the following:
> # Converts the file to a URI
> # Converts the URI to a URL
> # Calls TikaInputStream.get(URL, Metadata), which then performs the following
> special handling:
> # Checks if the protocol is "file"
> # Tries to convert the URL (back) to a URI
> # Creates a File around the URI
> # Checks if file.isFile()
> # Calls TikaInputStream.get(File, Metadata)
> The special handling in TikaInputStream.get(URL/URI) is a good optimization
> for in-the-wild file resources, but for internal uses it can be skipped -
> making Tika call TikaInputStream.get(File, Metadata) directly.
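
For illustration, the File->URL->File roundtrip listed in the issue description can be reproduced with JDK classes alone. This is only a sketch of the sequence of conversions, not Tika's code: the class and method names (RoundTripSketch, viaUrl) are hypothetical, and the Tika calls named in the issue appear only in comments.

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class RoundTripSketch {

    // Hypothetical helper: walks through the conversions listed in the issue
    // using plain JDK calls. Returns the recovered File, or null when the URL
    // does not point at a regular local file.
    static File viaUrl(File file) throws IOException, URISyntaxException {
        URI uri = file.toURI();   // File -> URI
        URL url = uri.toURL();    // URI -> URL

        // Per the issue, TikaInputStream.get(URL, Metadata) applies special
        // handling of this kind to file URLs:
        if ("file".equals(url.getProtocol())) {
            File back = new File(url.toURI());   // URL -> URI -> File
            if (back.isFile()) {
                // Point at which TikaInputStream.get(File, Metadata)
                // would be called, according to the issue.
                return back;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("tika-1722", ".txt");
        tmp.deleteOnExit();

        File back = viaUrl(tmp);
        // Shows whether the roundtrip arrived back at the file it started from.
        System.out.println(tmp.getAbsoluteFile().equals(back));
    }
}
```

When the caller already holds a File, every step before the final return only reconstructs that same File, which is the redundancy the patch removes by passing the File along directly.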