Re: HTTP client API

Chris Hegarty Mon, 21 Nov 2016 06:26:29 -0800

Tobias,

If you look at the code in the sandbox [*], the notion of a default
client has been removed. Having global static default instances is
problematic. Http Clients are lightweight, easy to configure and
pass around, if that is what you want.


-Chris.

[*] hg clone http://hg.openjdk.java.net/jdk9/sandbox; cd sandbox
    bash common/bin/hgforest.sh update http-client-branch


On 21/11/16 11:28, Tobias Thierer wrote:

Replying to my own first email in this thread to add another question /
concern about flexibility of the HTTP Client API:

Have you considered offering applications a way to globally replace the
HTTP Client implementation with their own (e.g. via a method
HttpClient.setDefault() to go with the existing method
HttpClient.getDefault())?
This feature seems to currently be missing from the API while even the
old HttpURLConnection API supported this (via
URL.setURLStreamHandlerFactory()
<https://docs.oracle.com/javase/8/docs/api/java/net/URL.html#setURLStreamHandlerFactory-java.net.URLStreamHandlerFactory->).

I'm aware that such global (process-wide) configuration options have
disadvantages as well, but it would be consistent with other swappable
Java APIs/SPIs such as SSLSocketFactory, XML parsers, CookieHandler,
etc.  The advantage would be that applications / application frameworks
could swap out the HttpClient implementation used by lower level
libraries / applications that they're bundling in binary form - e.g. to
globally provide a fake implementation to use during testing, instrument
HTTP Client usage (e.g. log all URLs accessed or count the number of
bytes transferred), to adhere to requirements in some corporate
networks, or to otherwise decouple the version/implementation of the
HTTP Client from the platform / OpenJDK version. What's your view?

Tobias


On Mon, Oct 24, 2016 at 8:33 PM, Tobias Thierer <[email protected]
<mailto:[email protected]>> wrote:

    Hi Michael and others -


    Thanks for publishing the latest HTTP client API docs
    <http://cr.openjdk.java.net/~michaelm/httpclient/api/>(already
    slightly outdated again), as well as for publishing the current
    draft code in the sandbox repository!


    Below is some concrete feedback, questions and brainstorming on how to

    (a) increase the usefulness or

    (b) decrease the semantic weight

    of the API. Note that most of this is driven only by inspection of
    the API and some brief exploration of the implementation code, not
    (yet) by a substantial effort to write proof of concept client
    applications. I’d love if I could help make this API as useful to
    applications as possible, so I’d appreciate your feedback on how I
    can best do that and what the principles were that guided your
    design choices.


    1.) The HttpRequest.BodyProcessor
    
<http://cr.openjdk.java.net/~michaelm/httpclient/api/java/net/http/HttpRequest.BodyProcessor.html>and
    HttpResponse.BodyProcessor
    
<http://cr.openjdk.java.net/~michaelm/httpclient/api/java/net/http/HttpResponse.BodyProcessor.html>abstractions
    seem particularly hard to grasp / have a high semantics weight.

      *

        What purpose does the abstraction of a BodyProcessoraim to
        fulfill beyond what the (simpler) abstraction of a Bodycould be?

          o

            Instead of describing the abstraction as a “processor” of
            ByteBuffers / Java objects, wouldn’t it be simpler to say to
            say that request / response bodiesare ByteBuffer / Java
            object sources/ sinks? What is the advantage of the
            Publisher<ByteBuffer> / Subscriber<ByteBuffer> API over
            plain old InputStream / OutputStream based APIs?

          o

            The term “processor” and the description of “converting
            incoming buffers of data to some user-defined object type T”
            is especially confusing (increases the semantic weight of
            the abstraction) given that there is an implementation that
            discards all data
            
<http://cr.openjdk.java.net/~michaelm/httpclient/api/java/net/http/HttpResponse.BodyProcessor.html#discard-U->(and
            its generic type is called Urather than T). A BodyProcessor
            that has no input but generates the digits of Pi is also
            conceivable. Perhaps call these BodySource / BodySink,
            ByteBufferPublisher / ByteBufferSubscriber, or just Body?

          o

            The fact that you felt the need to introduce an abstraction
            HttpResponse.BodyHandler whose name is similar to but whose
            semantics are different from HttpResponse.BodyProcessor is
            another indication that these concepts could be clarified
            and named better.

          o

            To explore how well the abstractions “fit”, I played with
            some draft code implementing the API on top of another one;
            one thing I found particularly challenging was the control
            flow progression:
            HttpClient.send(request, bodyHandler)
            -> bodyProcessor = bodyHandler.apply(); // called by the library
            -> bodyProcessor.onSubscribe() / onNext()
            because it is push based and forces an application to
            relinquish control to the library rather than pulling data
            out of the library.
            Perhaps the Response BodyHandler abstraction could be
            eliminated altogether? For example, wouldn’t it be
            sufficient to abort downloading the body once an application
            thread has a chance to look at the Response object? Perhaps
            once a buffer is full, the download of the further response
            body could be delayed until a client asks for it?

          o

            What’s the purpose of HttpRequest.bodyProcessor()’s return
            type being an Optional<BodyProcessor> (rather than
            BodyProcessor)? Why can’t this default to an empty body?

          o

            Naming inconsistency: HttpRequest.BodyProcessor.fromFile()
            vs. HttpResponse.BodyProcessor.asFile(). How about calling
            all of these of(), or alternatively renaming asFile() ->
            toFile() or toPath()?

          o

            asByteArrayConsumer(Consumer<Optional<byte[]>> consumer):
            Why is this an Optional? What logic decides whether an empty
            response body will be represented as a present byte[0] or an
            absent value?


    2.) HttpHeaders: I love that there is a type for this abstraction! But:

      *

        Why is the type an interface rather than a concrete, final
        class? Since this is a pure value type, there doesn’t seem to be
        much point in allowing alternative implementations?

      *

        The documentation should probably specify what the methods do
        when nameis not valid (according to RFC 7230 section 3.2?), or
        is null.

      *

        Do the methods other than map() really pull their weight
        (provide enough value relative to the semantic API weight that
        they introduce)?

          o

            firstValueAsLong() looks particularly suspect: why would
            anyone care particularly about long values? Especiallysince
            the current implementation seems to throw
            NumberFormatException rather than returning an empty Optional?


    3.) Redirect

      *

        Stupid question: Should Redirect.SAME_PROTOCOL be called
        SAME_SCHEME (“scheme” is what the “http” part is called in a
        URL)? I’m not sure which one is better.

      *

        I haven’t made up my mind about whether the existing choices are
        the right ones / sufficient. Perhaps if this class used the
        typesafe enum pattern from Effective Java 1st edition rather
        than being an actual enum, the API would maintain the option in
        a future version to allow client-configured Redirect policies,
        allowing Redirect for URLs as long as they are within the same
        host/domain?


    4.) HttpClient.Version:

      *

        Why does a HttpClient need to commit to using one HTTP version
        or the other? What if an application wants to use HTTP/2 for
        those servers that support it, but fall back to HTTP/1.1 for
        those that don’t?


    5.) CookieManager

      *

        Is there a common interface we could add without making the API
        much more complex to allow us both RFC 2965 (outdated,
        implemented by CookieManager) and RFC 6265 (new, real world,
        actually used) cookies? Needs prototyping. I think it’s likely
        we’ll be able to do something similar to OkHttp’s CookieJar
        <https://square.github.io/okhttp/3.x/okhttp/okhttp3/CookieJar.html>which
        can be adapted to RFC 2965 - not 100%, but close enough that
        most implementations of CookieManager could be reused by the new
        HTTP API, while still taking advantage of RFC 6265 cookies.


    6.) HttpClient.Executor

      * The documentation isn’t very clear about what tasks run on this
        executor and how a client can control HTTP traffic through a
        custom Executor instance. What power does the current executor()
        API provide to clients? Perhaps it would be simpler to omit this
        API altogether until the correct API becomes clearer?


    Thanks!

    Tobias

Re: HTTP client API

Reply via email to