On Fri, 3 Nov 2023 05:42:56 GMT, Alan Bateman <al...@openjdk.org> wrote:

>> Checking for _file_ in the URL scheme is not conclusive evidence that there 
>> is a local file path behind. I'll give a couple of examples. In Unix/Linux 
>> platforms, an URL of the form `file://example.com/path/to/some/file.txt` is 
>> processed with a remote FTP request (see Unix 
>> `sun.net.www.protocol.file.Handler`). In Windows, file URLs may be 
>> interpreted as UNCs but, if not possible, there is an FTP fallback (see 
>> Windows `sun.net.www.protocol.file.Handler`). While checking the host name 
>> in the URL is possible, there are three types of drawbacks: 1) a DNS query 
>> during the Security class initialization process should be avoided, 2) 
>> looking for hardcoded host names such as _localhost_ might lead to false 
>> negatives (i.e. a host is considered remote when it is not), and 3) there 
>> will be platform-specific and duplicated logic to deal with UNC file URLs. 
>> In addition, OpenJDK supports ill-formed relative path file URLs such as 
>> `file:some/relative/path`. In these cases, there is not a host name
  but there is a local file path underneath (relative to the current working 
directory). We did not find normative elements in [RFC 
8089](https://www.rfc-editor.org/rfc/rfc8089) for all previously described 
behaviors, that would have been helpful for a URL-based check. Misinterpreting 
a file URL as remote will unnecessarily block the possibility of relative path 
includes.
>> 
>> We think that `FileURLConnection` is the most accurate indicator of a local 
>> file path because it includes the decision logic that is specific to OpenJDK 
>> and varies depending on the platform.
>
>> Checking for file in the URL scheme is not conclusive evidence that there is 
>> a local file path behind. I'll give a couple of examples.
> 
> With NFS and other other remote file systems then you can never tell either. 
> Some of us have been wanting the ftp fallback go away, it comes up every few 
> years.
> 
> My concern is creating dependency on a protocol handler implementation, we 
> should make sure that all other options are explored before going there.

Retrieving a remote file while initializing the Security class is not ideal in 
terms of latency, but it's a user decision in the end. In fact, UNC paths may 
be remote and we handle them. The same is true for NFS, as you said. Our 
concern is not files being remote per-se but the fact that resolving a relative 
path to a base has to be well defined, which is possible when the base is also 
a Path. Incidentally, URLs with schemes such as _https_ are also confusing 
because the security properties used for the TLS connection would depend on 
where you include the file.

With that said, we also see the benefits of removing the dependency on 
FileURLConnection. We will make the change, just wanted to note that 
Path.of(URI.create(...)) will be sub-optimal in terms of what could be seen as 
a Path and, thus, in relative includes support. In concrete, we identified the 
following cases:

1) File URLs such as `file://localhost/absolute/path`
 1.1) In Linux, Path.of(URI.create(...)) throws an IllegalArgumentException. 
 1.2) In Windows, Path.of(URI.create(...)) is handled as an UNC path when, per 
RFC 8089, should have been a local one.
2) Non-conformant file URLs such as `file:///C|/Users/User` (Windows), 
`file:////?/C:/Users/User` (Windows), or `file://~/absolute/path` (Windows, 
Linux).
3) Non-conformant file URLs such as `file:relative/path`.
4) Non-conformant file URLs that have a query part, such as 
`file:///absolute/path?param=value`.

Notice that URL.openConnection() handles all of the previous cases, and returns 
a FileURLConnection with an underlying local File. For <span>#</span>1 we think 
that these could be issues in RFC 8089 conformance —particularly in Linux— that 
are worth exploring at some point (see sun.nio.fs.UnixUriUtils::fromUri and 
sun.nio.fs.WindowsUriSupport::fromUri). For <span>#</span>2, <span>#</span>3 
and <span>#</span>4 we can add an extra check and do some massaging but, as 
these are non-conformant file URLs, it is not a big deal.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16483#discussion_r1382111155

Reply via email to