Thank you Amos for this enlightenment.
I really do appreciate your help.
I will stay with the reverse proxy configuration for our POC.
At the moment we need to cache reads of the library data more than the writes.
The next version of the OneDrive client should help with asynchronous writes.
It will still download from the cloud, so Squid is necessary in any case.

Thank you.
Regards,
Olivier MARCHETTA


-----Original Message-----
From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On Behalf 
Of Amos Jeffries
Sent: Sunday, September 10, 2017 6:25 PM
To: squid-users@lists.squid-cache.org
Subject: Re: [squid-users] Http write cache

On 10/09/17 21:14, Olivier MARCHETTA wrote:
> Hello,
> 
>> Origin servers can sometimes respond to requests with payload ("uploads") 
>> before the request has fully arrived, but any subsequent network issues are 
>> guaranteed to result in data loss - so the practice is discouraged.
> 
> If I understand, when it's a download (GET), Squid will replace the payload 
> with the object in cache, if fresh.

Nod. This is possible because two identical requests are expected to produce 
the same response.

> But the HTTP control messages are still coming from the Origin server.

Not necessarily. There are no "control messages" as such in HTTP. The cache 
controls are delivered along with the cached payload to indicate what can be 
done with it. Synchronous server contact (aka revalidation) to deliver 
responses is only required if those controls say so.
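
A minimal sketch of that decision, assuming only the standard Cache-Control 
directives no-cache, s-maxage and max-age are considered. The function names 
are hypothetical and this is not how Squid implements it; real caches also 
apply heuristic freshness and many other rules.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value-or-None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def needs_revalidation(cache_control: str, age_seconds: int) -> bool:
    """Return True if the cache must contact the server before reusing the object."""
    d = parse_cache_control(cache_control)
    if "no-cache" in d:
        # The controls say: always check with the server first.
        return True
    # A shared cache prefers s-maxage over max-age.
    max_age = d.get("s-maxage") or d.get("max-age")
    if max_age is None:
        # Simplifying assumption: no explicit lifetime means "treat as stale".
        return True
    return age_seconds >= int(max_age)
```

With "max-age=60" and an object that is 10 seconds old, no server contact is 
needed; once the object is 60 seconds old, it has to be revalidated.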


> In the case of an upload (PUT), using the Squid cache won't speed things up, 
> because the client has to wait for the Origin server's response to the 
> payload transfer (or request).

Yes. Squid has never seen the request before, so has no idea what response will 
appear as a result.

> 
> The only option to make uploads faster is if the Origin server is aware that 
> the client is using a reverse proxy cache and responds to the upload request 
> before the full payload transfer.
> 

Close, but not quite. The server does not need to know about the proxy; it just 
has to know that the upload payload is a "pointless waste of bandwidth" 
(where data loss doesn't matter) and deliver its response early.

For example, this is usually seen with NTLM authentication, where uploads 
without credentials are denied early: the upload has to be repeated in full 
with the right credentials, so all the bytes from the first attempt can be 
dropped in transit by the proxy.


> Tell me if I'm wrong, but I think that I understand now.
> Meaning that if I want to "bufferize" the writes it has to happen with 
> another protocol before the WebDAV connection to Sharepoint Online.
> 

The "other protocol" is WebDAV as far as I know. HTTP is just about delivery of 
some request and its corresponding response. How WebDAV transfers use HTTP 
messaging, and which parts of HTTP and WebDAV the client and server implement 
may or may not support the behaviour you want.


You are then colliding with the definition differences between "cache" 
and "buffer". Caches store *past* data for the purpose of reducing 
current/future server work, buffers store *current* data awaiting delivery.
  An upload is normally not something seen previously, so not cacheable.
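
A toy illustration of why that is, assuming a cache keyed on URL alone and an 
"origin" callable standing in for the server (both hypothetical, nothing like 
Squid's real store). Repeated GETs are answered from stored past data; every 
PUT is new and has to go to the origin.

```python
class TinyCache:
    """Stores past GET responses; everything else is passed to the origin."""

    def __init__(self, origin):
        self.origin = origin      # callable(method, url, body) -> response
        self.store = {}           # url -> previously seen GET response
        self.origin_calls = 0

    def handle(self, method, url, body=b""):
        if method == "GET" and url in self.store:
            # Seen before: reuse the past response, no server work.
            return self.store[url]
        # Not seen before (or not a GET): the origin has to answer.
        self.origin_calls += 1
        response = self.origin(method, url, body)
        if method == "GET":
            self.store[url] = response
        else:
            # An upload changes the resource, so the stored copy is dropped.
            self.store.pop(url, None)
        return response
```

Two identical GETs cost one origin call; two identical PUTs cost two, and the 
client waits for the origin's answer each time.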

Proxies and the network itself *do* buffer data along the way. But that in no 
way adds any asynchronous properties to HTTP. The client still has to wait for 
the HTTP response to be delivered back to it before it can consider the HTTP 
part of that transaction over - the "transaction" in this context may or may 
not be the full WebDAV upload+processing on the server.

HTTP has some mechanisms that can help improve upload behaviour and avoid 
pointless bandwidth delivery, notably the Expect: 100-continue and Range 
features and the 201/202 status codes. WebDAV extensions to HTTP add various 
other things I'm not very familiar with.
  Between them they can a) signal to the client that a server is contactable 
before data gets delivered, b) let the data be delivered in small chunks to 
minimize loss, and c) signal that any given part has arrived and is awaiting 
some state (ie full object arrival) and/or some async processing.
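
A rough sketch of the Expect: 100-continue part from the client side. The 
helper names, host and path are made up for illustration, and the logic is 
simplified: a real client may also send the body after a timeout if no interim 
response arrives.

```python
def build_put_head(host: str, path: str, length: int) -> bytes:
    """Request line and headers only; the body is held back for now."""
    return (
        f"PUT {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {length}\r\n"
        "Expect: 100-continue\r\n"
        "\r\n"
    ).encode("ascii")


def should_send_body(status_line: str) -> bool:
    """Send the payload only once the server answers '100 Continue'.

    Any final status received first (for example 401 or 417) means the
    upload bytes would be wasted, so they are never put on the wire.
    """
    status = int(status_line.split(" ", 2)[1])
    return status == 100
```

For example, should_send_body("HTTP/1.1 401 Unauthorized") is False, which is 
the early-denial case: the client learns it must retry with credentials before 
sending any payload.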


BUT, as should be obvious, these are all application-logic level things (ie 
WebDAV) and require explicit support by both the endpoint applications on 
server and client for that logic to take place. The async properties arise from 
how things are done *between* HTTP transactions. The interactions are separate 
synchronous request+response message pairs as far as Squid and any HTTP 
infrastructure are concerned.
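
To make that concrete, here is a small simulation, assuming a server that 
answers an upload with 202 Accepted and a status URL to poll. FakeServer, 
/status/1 and the "state" field are invented for the example; they are not 
WebDAV or SharePoint behaviour.

```python
class FakeServer:
    """Stand-in for an application that processes uploads in the background."""

    def __init__(self, work_steps=2):
        self.remaining = work_steps

    def put(self, url, body):
        # Payload received; processing is deferred. One complete transaction.
        return 202, {"Location": "/status/1"}

    def get(self, url):
        # Each status check is another complete request+response pair.
        if self.remaining > 0:
            self.remaining -= 1
            return 200, {"state": "processing"}
        return 200, {"state": "done"}


def upload_and_wait(server, url, body, max_polls=10):
    """Return (final_state, number_of_http_transactions)."""
    status, headers = server.put(url, body)
    transactions = 1
    if status != 202:
        return "done", transactions
    for _ in range(max_polls):
        _, info = server.get(headers["Location"])
        transactions += 1
        if info["state"] == "done":
            return "done", transactions
    raise TimeoutError("server did not finish processing")
```

The "async" part lives in the client and server logic that strings these 
exchanges together. A proxy in the middle only ever sees independent, 
synchronous request+response pairs.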

Amos
_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users