[ 
https://issues.apache.org/jira/browse/TIKA-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting resolved TIKA-1722.
---------------------------------
    Resolution: Fixed
      Assignee: Jukka Zitting

Thanks! Committed in revision 1698100.

My original thinking with these methods was to ensure that there is no 
difference in how a File or a file:// URL gets processed. I think that's 
already well covered, so there's not much need for the extra File->URL->File 
roundtrip.

> Tika methods that accept a File needlessly convert it to a URL
> --------------------------------------------------------------
>
>                 Key: TIKA-1722
>                 URL: https://issues.apache.org/jira/browse/TIKA-1722
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Yaniv Kunda
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.11
>
>         Attachments: TIKA-1722.patch
>
>
> The following methods:
> - Tika.detect(File)
> - Tika.parse(File)
> - Tika.parseToString(File)
> convert the given File to a URL and use the corresponding overloaded method 
> that accepts a URL.
> This seems like a shortcut, but essentially does the following:
> # Converts the file to a URI
> # Converts the URI to a URL
> # Calls TikaInputStream.get(URL, Metadata), which then performs the following 
> special handling:
> # Checks if the protocol is "file"
> # Tries to convert the URL (back) to a URI
> # Creates a File around the URI
> # Checks if file.isFile() 
> # Calls TikaInputStream.get(File, Metadata)
> The special handling in TikaInputStream.get(URL/URI) is a good optimization 
> for in-the-wild file resources, but for internal uses it can be skipped - 
> making Tika call TikaInputStream.get(File, Metadata) directly.
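
The round trip described in the steps above can be reproduced with JDK classes
alone. The following is only a minimal sketch, not the Tika code touched by the
patch: the class name FileUrlRoundTrip and the method roundTrip are
hypothetical, and no Tika API is called. It shows that the File -> URI -> URL ->
URI -> File chain ends up back at the file it started from, which is why
passing the File straight to TikaInputStream.get(File, Metadata) is the
simpler path.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;

// Hypothetical illustration class; not part of Tika.
public class FileUrlRoundTrip {

    // Walks the conversion chain from the issue description using only the JDK:
    // File -> URI -> URL -> protocol check -> URI -> File.
    public static File roundTrip(File file) throws Exception {
        URI uri = file.toURI();   // File -> URI (always absolute)
        URL url = uri.toURL();    // URI -> URL
        if (!"file".equalsIgnoreCase(url.getProtocol())) {
            throw new IllegalStateException("unexpected protocol: " + url.getProtocol());
        }
        URI back = url.toURI();   // URL -> URI again
        return new File(back);    // URI -> File
    }

    public static void main(String[] args) throws Exception {
        File original = File.createTempFile("tika-1722", ".txt");
        original.deleteOnExit();
        File recovered = roundTrip(original);
        // For an existing regular file, both sides should name the same absolute path.
        System.out.println(original.getAbsoluteFile().equals(recovered));
    }
}
```

Every conversion in roundTrip allocates an intermediate object and can fail
with a checked exception, so skipping the chain for callers that already hold
a File removes work without changing which file is read.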



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
