Re: [blink-dev] Intent to Experiment: Load common payloads from privacy-preserving single-keyed cache

Mike Taylor Tue, 26 Apr 2022 13:55:37 -0700

Hi Daisuke,

Can you clarify the timeline of the experiment? Would it begin in M103? I have concerns about interactions with the current double-key experiment <https://groups.google.com/a/chromium.org/g/blink-dev/c/WQtp7Ixd1RU> we're running for Network State Partitioning in M101 and M102.


On 4/26/22 7:59 AM, Daisuke Enomoto wrote:

        Contact emails

[email protected], [email protected], [email protected]


        Explainer
https://docs.google.com/document/d/1pvaMg7J5beBXD7trzHJH_MDULc_wRHLx40MFYAmjknE/edit
        Specification

N/A (because there are no web-exposed changes)


        Summary
This limited experiment measures how much "pervasive payloads" contribute to the performance impact of the split HTTP cache in each Chrome channel over a three-week period. Pervasive payloads are those third-party payloads included on at least 500 sites and fetched at least 10M times in a month, based on Chrome's analysis (payload list included below). This experiment further measures the impact on Core Web Vitals metrics of restoring pervasive payloads (and only pervasive payloads) to a single-keyed cache regime. The privacy benefits of the split HTTP cache are preserved.
        Blink component
Blink>Network <https://bugs.chromium.org/p/chromium/issues/list?q=component:Blink%3ENetwork>
        Motivation
Browsers split HTTP caches based on the top-frame visited origin (“double-keyed” or "triple-keyed" caching) to prevent sites from tracking users via a timing attack on a cross-site client cache.
Chrome’s analysis estimates that split caching results in a 3% increase in cache misses, i.e. fetches for which a payload exists in the cache of the user's device, but is unavailable to the page because it was fetched by the user while loading a page from a different origin. This results in approximately 4% more total bytes being fetched over the network.
Our analysis further revealed that many of the redundant fetches caused by split caching were for common payloads associated with displaying user content (libraries, fonts, widgets, ads) or common payloads that assist in operating online businesses (analytics). The delayed arrival of these common payloads resulted in visible "jank" for users, impacting performance metrics like LCP <https://web.dev/lcp>, FCP <https://web.dev/fcp> and CLS <https://web.dev/cls/>. This jank has been associated with negative effects on online businesses' engagement and conversion rates. Furthermore, delayed loads of analytics and ads payloads can result in missed ad impressions and dropped analytics hits.
        Initial public proposal
This experiment sends a list to Chrome of 100 <URL, hash> pairs whose payloads are considered pervasive (the "pervasive payloads list"). During the three-week experiment period, if Chrome fetches a payload that matches both the URL and its hash on the pervasive payloads list, it is inserted into a local single-keyed cache. This payload is then available for use by Chrome when loading pages on other sites that include the matching URL. All other fetches for URLs not on the pervasive payloads list are cached according to the existing split HTTP cache.
The hash covers the payload body and most response headers, except for those which change on every response.
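As a rough illustration of the matching rule described above, here is a minimal Python sketch (not Chrome's implementation). The hash algorithm, the set of excluded headers, and the names `VOLATILE_HEADERS`, `payload_hash` and `cache_key` are all assumptions made for the example; the proposal only says that the hash covers the body and most response headers.

```python
import hashlib

# Assumption: headers that change on every response and are left out of the hash.
VOLATILE_HEADERS = {"date", "age", "expires", "set-cookie"}


def payload_hash(body: bytes, headers: dict) -> str:
    """Hash the body plus the stable response headers (hypothetical scheme)."""
    digest = hashlib.sha256()
    for name in sorted(headers, key=str.lower):
        if name.lower() in VOLATILE_HEADERS:
            continue
        digest.update(f"{name.lower()}:{headers[name]}\n".encode())
    digest.update(body)
    return digest.hexdigest()


def cache_key(url, top_frame_site, body, headers, pervasive_list):
    """Return the key under which a fetched response would be stored.

    pervasive_list maps URL -> expected hash (the <URL, hash> pairs).
    """
    if pervasive_list.get(url) == payload_hash(body, headers):
        # URL and hash both match: store once, shared across sites.
        return ("single-keyed", url)
    # Everything else keeps the existing split (site-keyed) behavior.
    return ("split", top_frame_site, url)
```

A response whose URL is on the list but whose body differs from the listed hash falls through to the split cache, which is what keeps site-specific variants of a popular URL out of the shared cache.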
To ensure we do not degrade the privacy profile of any users during this experiment, only users with third-party cookies currently enabled will be eligible for the experiment. We will compare the experience of users in experiment and control arms according to total bytes loaded and page performance metrics like the Core Web Vitals <https://web.dev/vitals>.
The pervasive payloads list was produced by crawling the web and aggregating the most commonly referenced third-party resource URLs included in HTML content. We then used pseudonymous URL-keyed metrics from Chrome to estimate the traffic to sites and the number of impressions of third-party resources. Individually identifiable browsing or search histories were not used in the creation of the pervasive payload list (for more information about Chrome's data collection policies and privacy policies, see google.com/chrome/privacy <https://google.com/chrome/privacy>). The resulting list was further filtered for any URLs that might contain PII (e.g. URLs with extensive or opaque query parameters). The list was also manually reviewed to ensure it included only payloads reasonably expected to be pervasive; the manual review did not result in any payloads being removed.
The privacy properties of the split HTTP cache are considered essential to users and this proposal intends to maintain those properties, specifically by maintaining split HTTP caching for all payloads not on the pervasive payloads list.
API semantics are unchanged. User-facing functionality is unchanged (though we expect performance to be slightly improved).
The list of the top 100 Pervasive URLs for use in this experiment is pending internal reviews and will be shared on this thread upon approval.
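The selection step can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the 500-site and 10M-fetch thresholds come from the Summary, while the query-length cutoff `MAX_QUERY_LENGTH`, the function names, and the shape of the input tuples are hypothetical stand-ins for the actual crawl and metrics pipeline.

```python
from urllib.parse import urlsplit

MIN_SITES = 500                    # threshold stated in the Summary
MIN_MONTHLY_FETCHES = 10_000_000   # threshold stated in the Summary
MAX_QUERY_LENGTH = 32              # assumption: proxy for "extensive or opaque" queries


def is_pervasive(url, site_count, monthly_fetches):
    """Apply the popularity thresholds and a crude PII filter on the query string."""
    if site_count < MIN_SITES or monthly_fetches < MIN_MONTHLY_FETCHES:
        return False
    return len(urlsplit(url).query) <= MAX_QUERY_LENGTH


def build_list(candidates, limit=100):
    """candidates: iterable of (url, site_count, monthly_fetches) tuples."""
    kept = [c for c in candidates if is_pervasive(*c)]
    kept.sort(key=lambda c: c[2], reverse=True)  # most-fetched first
    return [url for url, _, _ in kept[:limit]]
```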
        Future directions
This experiment is the first step in a path exploring improved handling of pervasive payloads in the browser cache. We outline the intended future functionality here to clarify the intentions behind the current experiment. The overview below is not complete or final and subsequent parts of the design and implementation will be presented and discussed in further Intents to Experiment and Prototype.
At a high level, a future improvement to the handling of pervasive payloads may involve:
 * Assembling a list of pervasive payloads that meets the following
   criteria:
     o Maintains the privacy of user browsing histories in its creation
     o Fairly represents pervasive payloads as they have been chosen
       by developers on the web, not payloads selected or favored by
       any particular library or browser vendor.
         + This experiment will initially use a static list of
           predefined URLs assembled as described in the 'Initial
           public proposal' section above
         + A future implementation will likely dynamically update the
           payloads list on, for example, a weekly cadence.
 * Implementing shared caching for pervasive payloads that meets the
   following criteria:
     o Materially improves load times and responsiveness for web
       users (under study in this experiment)
     o Does not create a new tracking vector based on cache timing
       attacks
     o Does not require users to fetch payloads before the browser
       knows they will need them (i.e. we don't plan to bundle payloads
       with browser installs or updates)
     o Does not increase local storage required by browser installs
       or caches
To privately and fairly assemble the list of pervasive payloads, we are exploring the use of Private Heavy Hitters <https://www.tensorflow.org/federated/tutorials/private_heavy_hitters>. To implement a privacy-preserving shared cache after the deprecation of third-party cookies, we are exploring adding a measure of randomness to the observed presence or absence of a pervasive payload in the shared cache.
However, this work is only worthwhile if it results in materially improved load times for real users. This Intent to Experiment covers only whether or not we should attempt to measure the performance gains that might be realized if pervasive payloads were placed in a shared cache, as one data point among others that will contribute to discussions about future steps for the proposal.
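To make the "measure of randomness" idea above concrete, here is one possible reading as a minimal Python sketch. It is an assumption for illustration only, not a design from the proposal: the function name `observed_hit`, the parameter `forced_miss_probability`, and the choice to randomize only in the hit-to-miss direction are all hypothetical.

```python
import random


def observed_hit(actually_cached: bool,
                 forced_miss_probability: float,
                 rng: random.Random) -> bool:
    """Decide whether a lookup in the shared cache behaves as a hit.

    With the given probability a cached entry is ignored and the request
    goes to the network, so a slow load no longer proves the entry was absent.
    """
    if actually_cached and rng.random() < forced_miss_probability:
        return False  # treat as a miss even though the entry exists
    return actually_cached
```

The trade-off in such a scheme is direct: a higher forced-miss probability weakens the timing signal available to an attacker but gives back part of the performance gain the shared cache was meant to provide.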
        TAG review

None yet.


        TAG review status

N/A


        Risks


        Interoperability and Compatibility
Chrome's compliance with the relevant standards is unchanged. Caching behavior differs between browsers, so interoperability will not be affected.
The list of popular payloads is specifically chosen to minimize compatibility risks.
Gecko: No signal


WebKit: No signal


Web developers: No signals


Other signals:


        WebView application risks
Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications? No
        Debuggability
There is no developer-exposed API for this feature, so most DevTools support is not relevant. It would be useful to indicate in the network tab whether a resource was served from the single-keyed cache; however, this will not be implemented in the initial experiment.
        Security and privacy
Single-keyed caching introduces global state shared between different browsing contexts. A shared cache can introduce information leaks based on cache probing (https://xsleaks.dev/docs/attacks/cache-probing/), including XS-Search (https://xsleaks.dev/docs/attacks/xs-search/) in applications which conditionally load a single-keyed-cache eligible resource based on authenticated user state. The state of the cache, queried across different contexts, could also be used as a fingerprint, permitting user tracking; however, in this case, we believe this does not provide tracking capabilities beyond those of third-party cookies.
To protect users during this experiment, we limit the experiment population to those users with third-party cookies enabled. Recognizing that third-party cookies will eventually be switched off for most users <https://privacysandbox.com/>, we are developing protections such as slightly randomizing cache hit/miss checks, disallowing eviction, or guaranteeing attempts to read from the cache reliably populate that cache entry. These protections will be proposed and incorporated before any future optimizations are launched.
For the purposes of the current experiment, we will be using the same implementation of single-keyed caching that Chrome used before the HTTP cache was partitioned in M77 (https://chromestatus.com/feature/5730772021411840).
To summarize, the security and privacy restrictions on this experiment are as follows:
 1. We will exclude users that have third-party cookies disabled.
 2. Only a small percentage of users will be included in the
    experiment, reducing the likelihood and impact of any attacks
    abusing the single-keyed cache.
 3. We will strictly limit the duration of the experiment on each
    channel to 3 weeks.
 4. We will only serve pervasive resources from the single-keyed cache.
 5. We can turn off the experiment immediately (independent of browser
    updates) if any other threats appear.


        Is this feature fully tested by web-platform-tests <https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md>?
This behavior is specific to Chrome and not part of any standard, so it will not be tested in web platform tests.
        Flag name

CacheTransparency


        Requires code in //chrome?
No, but the list of popular payloads and the mechanism for distributing it to the browser will be Chrome-specific.
        Tracking bug
https://bugs.chromium.org/p/chromium/issues/detail?id=1309002
        Launch bug
https://bugs.chromium.org/p/chromium/issues/detail?id=1309353
        Estimated milestones

M103 for off-by-default experiment


        Link to entry on the Chrome Platform Status
https://chromestatus.com/feature/5768521127559168
--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAA5e6990s-e4aYUnYK5%2BqzQpAyFzJa42y%2B%3D_MAnL19z%3DqemnWg%40mail.gmail.com.


