Hi Rick, Daniel,

I'm looking at the case of a non-persistent, document-local dictionary that stores the word list in memory. Could you elaborate a bit on why Blink>DOM might not be the right component for this? And what issues do you foresee in ensuring the data is reliably per-document?
Thank you!
Ziran

On Tuesday, 22 July 2025 at 22:02:41 UTC+1 Rick Byers wrote:

> On Tue, Jul 22, 2025 at 3:18 PM Stephen Chenney <sche...@chromium.org>
> wrote:
>
>> Regarding motivation, our client has financial data, such as
>> stock symbols and company names. There are similar use cases for medical
>> data, fan fiction, or anything else with words that might not appear in
>> hunspell's dictionaries. It's conceivable that the Google internal spelling
>> APIs have these words but clients may be very reluctant to send their
>> strings to Google.
>>
>> The proposal in this intent is relatively straightforward to implement
>> and privacy and security are relatively simple to assess. But for developers
>> there will probably be significant load time costs around it, to fetch the
>> site's dictionary and process it to add the words.
>>
>
> I'd love to see some figures on this. Maybe a bulk add API would be
> enough? As a quick example I picked a random website (bloomberg.com) and
> found it downloaded 3.4MB compressed including a number of individual
> scripts, images and JSON blobs which were around 100kB compressed each. In
> contrast the entire american-english dictionary on my Linux machine
> compresses down to 270kB. So as long as we're talking about something
> that's less than 10% of the size of the whole American English dictionary, my
> hunch is that the transfer cost will be insignificant and lost in the
> noise. But still an HTTP approach to at least enable caching would be a
> good idea with little downside. I could imagine, for example, a <link
> rel=dictionary> tag or something that would be even simpler than this JS
> API approach?
>
> Anyway this is just random thoughts to try to nudge away from premature
> optimization, not API owner input or anything :-).
>
>> We have some ideas around that in future work but nothing concrete. I
>> think we'll have to address it before we ship.
>>
>> An HTTP header approach would make the ergonomics easier (assuming the
>> infrastructure for setting up a spelling server is reasonably standard) and
>> fits better into the existing code, but it would not work offline. Maybe
>> the approaches are complementary and we do both.
>>
>> I'll try to get some idea on the size of typical dictionaries in this
>> space. It is important to know.
>>
>> Cheers,
>> Stephen.
>>
>> On Tue, Jul 22, 2025 at 12:03 PM Rick Byers <rby...@chromium.org> wrote:
>>
>>> Spelling server seems a lot harder to get right to me, obviously more to
>>> worry about regarding privacy etc. Can you share anything more about the
>>> motivating use cases here? Like how large do these custom dictionaries tend
>>> to be? I'd guess that for even dictionaries up to 1MB compressed it's
>>> probably faster and simpler to just have the client download the whole
>>> thing. RTT latency is generally a bigger performance problem these days
>>> than raw throughput. But if it's important to solve scenarios with really
>>> large dictionaries then maybe it's worth exploring?
>>>
>>> On Tue, Jul 22, 2025 at 11:11 AM Stephen Chenney <sche...@chromium.org>
>>> wrote:
>>>
>>>> Thanks for the early feedback, and sorry for the lack of clarity on the
>>>> explainer. We're working on improving the explainer to address the issues
>>>> raised here and issues raised on github.
>>>>
>>>> We're also considering an entirely different approach whereby a site
>>>> provides a "spelling server" URL in the HTML header. That would operate
>>>> more like the existing "send it to Google" spell checking options. We're
>>>> super early in designing such a thing, but if anyone has early feedback on
>>>> that approach we would be interested.
>>>>
>>>> Cheers,
>>>> Stephen.
>>>>
>>>> On Tue, Jul 22, 2025 at 10:54 AM Rick Byers <rby...@chromium.org>
>>>> wrote:
>>>>
>>>>> FWIW I was also a little confused reading the explainer, but I think I
>>>>> understand the overall design and I think it's a good one: these
>>>>> dictionaries are transient and document-local, simply a mechanism to let
>>>>> pages selectively suppress spell check violations on their own page.
>>>>>
>>>>> Presumably discussion of network fetches in the explainer is just
>>>>> about the app fetching from its server (not fetches in the browser), and
>>>>> all the discussions of "persistent" storage are under the "future work"
>>>>> section so it's fine to me that there's no detail here (it's out of scope
>>>>> because it's hard). I'm not sure whether it would make sense to extend this
>>>>> design into persistent storage or not, but I'm also not sure it matters (as
>>>>> the explainer says it's simply an optimization - a problem that may or may
>>>>> not exist in practice so not worth worrying about today).
>>>>>
>>>>> Ensuring the data is reliably per-document is definitely a key
>>>>> implementation concern, so I agree with you there Daniel. And yes we'll
>>>>> eventually want signals from other browser vendors, but our process
>>>>> <https://www.chromium.org/blink/launching-features/> has that step
>>>>> only after prototyping is complete (often we learn a lot about the design
>>>>> from prototyping), so it's premature to ask for it now at I2P phase.
>>>>>
>>>>> Cheers,
>>>>> Rick
>>>>>
>>>>> On Tue, Jul 22, 2025 at 7:37 AM 'Daniel Vogelheim' via blink-dev <
>>>>> blin...@chromium.org> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> This intent came up in security review, and I'm mostly confused:
>>>>>>
>>>>>> - The explainer mostly seems to assume that these are stored
>>>>>> in-memory, per-document.
>>>>>> But it also talks about absence of
>>>>>> cross-origin-requests; only to add info about CORS, which only makes sense
>>>>>> for cross-origin requests.
>>>>>> - There are multiple references to loading data, but there is no
>>>>>> explanation about what kind of network requests are being made when or
>>>>>> where.
>>>>>> - The explainer suggests "Persistently store data" as an optimization
>>>>>> for having to re-load large dictionaries. Again, no information about which
>>>>>> requests are being optimized away.
>>>>>> - In "Data Storage" it is pointed out that CustomDictionaryEngine
>>>>>> exists per renderer process. While renderer processes mostly don't have
>>>>>> cross-origin data, they sometimes do. And they may hold multiple documents.
>>>>>> This seems inconsistent with information being stored per-document.
>>>>>>
>>>>>> Non-security feedback:
>>>>>> - Since this is a web-exposed API, I'd have expected some attempt at
>>>>>> checking with other browser engines on support.
>>>>>> - I do not understand the "High-level Architecture". It seems to
>>>>>> feature a stack of methods that feeds into yes/no decisions which feeds
>>>>>> into a storage thing. I have no idea what this is meant to convey.
>>>>>> - Blink>DOM might not be the right component for this.
>>>>>>
>>>>>>
>>>>>> Could you please update the documentation to be more clear about
>>>>>> where data is stored, and about which network requests are being made?
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 18, 2025 at 12:08 PM Chromestatus <
>>>>>> ad...@cr-status.appspotmail.com> wrote:
>>>>>>
>>>>>>> Contact emails ji...@igalia.com
>>>>>>>
>>>>>>> Explainer
>>>>>>> https://github.com/Igalia/explainers/tree/main/dictionary-api
>>>>>>>
>>>>>>> Specification None
>>>>>>>
>>>>>>> Design docs
>>>>>>>
>>>>>>> https://github.com/Igalia/explainers/tree/main/dictionary-api#-proposal
>>>>>>>
>>>>>>> Summary
>>>>>>>
>>>>>>> The proposed APIs enable users to modify the document-local
>>>>>>> dictionary in the browser. Users can add, remove, and check words in the
>>>>>>> document-local dictionary. This feature ensures the browser does not mark
>>>>>>> words in the document-local dictionary as spelling errors.
>>>>>>>
>>>>>>>
>>>>>>> Blink component Blink>DOM
>>>>>>> <https://issues.chromium.org/issues?q=customfield1222907:%22Blink%3EDOM%22>
>>>>>>>
>>>>>>>
>>>>>>> Motivation
>>>>>>>
>>>>>>> Some words need to be added to the document custom dictionary so
>>>>>>> that the browser does not mark them as spelling errors. The added words
>>>>>>> need to be removed at some point if they aren't necessary. Current specs
>>>>>>> such as element.spellcheck attribute and ::spelling-error CSS
>>>>>>> pseudo-element manage the words already in the dictionary. Therefore, the
>>>>>>> new API would be needed to manipulate the document-local dictionary.
>>>>>>>
>>>>>>>
>>>>>>> Initial public proposal None
>>>>>>>
>>>>>>> TAG review None
>>>>>>>
>>>>>>> TAG review status Pending
>>>>>>>
>>>>>>> Risks
>>>>>>>
>>>>>>>
>>>>>>> Interoperability and Compatibility
>>>>>>>
>>>>>>> None
>>>>>>>
>>>>>>>
>>>>>>> *Gecko*: No signal
>>>>>>>
>>>>>>> *WebKit*: No signal
>>>>>>>
>>>>>>> *Web developers*: No signals
>>>>>>>
>>>>>>> *Other signals*:
>>>>>>>
>>>>>>> WebView application risks
>>>>>>>
>>>>>>> Does this intent deprecate or change behavior of existing APIs, such
>>>>>>> that it has potentially high risk for Android WebView-based applications?
>>>>>>>
>>>>>>> None
>>>>>>>
>>>>>>>
>>>>>>> Debuggability
>>>>>>>
>>>>>>> None
>>>>>>>
>>>>>>>
>>>>>>> Is this feature fully tested by web-platform-tests
>>>>>>> <https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>
>>>>>>> ? Yes
>>>>>>>
>>>>>>> third_party/blink/web_tests/wpt_internal/dom/local-dictionary/*
>>>>>>> There is a WIP patch which includes the tests
>>>>>>>
>>>>>>>
>>>>>>> Flag name on about://flags None
>>>>>>>
>>>>>>> Finch feature name None
>>>>>>>
>>>>>>> Non-finch justification None
>>>>>>>
>>>>>>> Requires code in //chrome? False
>>>>>>>
>>>>>>> Tracking bug https://issues.chromium.org/issues/428005649
>>>>>>>
>>>>>>> Estimated milestones
>>>>>>>
>>>>>>> No milestones specified
>>>>>>>
>>>>>>>
>>>>>>> Link to entry on the Chrome Platform Status
>>>>>>> https://chromestatus.com/feature/6185007701557248?gate=4503614776934400
>>>>>>>
>>>>>>> This intent message was generated by Chrome Platform Status
>>>>>>> <https://chromestatus.com>.
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "blink-dev" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to blink-dev+...@chromium.org.
>>>>>>> To view this discussion visit
>>>>>>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/687a1d04.170a0220.2dad83.0168.GAE%40google.com