This is excellent news. Do you have any measurements showing perf effects?

Semi-relatedly, Swift 5 changed the preferred encoding of strings from
UTF-16 to UTF-8. Some readers might find the accompanying blog post
interesting: https://swift.org/blog/utf8-string/.

Nick

On Sat, 20 Jul 2019 at 10:05, Jeff Walden <jwal...@mit.edu> wrote:

> # Intent to ship: UTF-8 parsing of external <script>s and worker scripts
>
> ## Introduction
>
> JS acts on 16-bit code units (UTF-16 with lone surrogates permitted),
> because 1990s.  💯  As a consequence, SpiderMonkey has long handled only
> 16-bit source text.  APIs taking `const char*` or `const JS::Latin1Char*`
> or similar just inflated to UTF-16 and processed that.
>
> Now, UTF-8 is ubiquitous.  And scripts have lots of ASCII keywords and
> operators compactly represented in UTF-8.
>
> I've been making SpiderMonkey natively handle UTF-8 source text, using
> lots of templates and template specializations.  (*Only* valid UTF-8: any
> invalidity, including for WTF-8, is an immediate error, no
> replacement-character semantics applied.)  For mostly-ASCII scripts,
> uncompressed UTF-8 typically requires half the bytes and processing of
> UTF-16.  And compressed UTF-8 is
> also generally smaller than compressed UTF-16, because compressors needn't
> devote bandwidth to lots of null bytes.
>
> Since late May, DOM workers' accumulated UTF-8 data is directly parsed as
> UTF-8 in nightly builds.  Since mid-June, external <script> data is also
> accumulated and then directly parsed as UTF-8 in nightly builds
> (pref-controlled, in both cases).  (I haven't changed inline <script>s:
> they're often small and aren't stored as UTF-8, so benefits there are less
> clear.)  Bugs found and created during the entire effort were all readily
> fixed.  No bugs have been reported on the pref-flips -- or at least you
> haven't reported any.  ;-)
>
> So it seems like a good time to set the pref to flatly true in nightly and
> beta.  The full month and a half remaining until the next merge ought to
> be plenty of time for the beta audience to suss out any remaining issues
> for fixing.
>
> ## Tracking bugs
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=1543517 (to enable in beta)
> https://bugzilla.mozilla.org/show_bug.cgi?id=1543514 (to enable in
> release -- but if the prior bug is fixed, this will just happen naturally
> next uplift)
>
> ## Platform coverage
>
> All
>
> ## Estimated or target release
>
> 69, if all goes to plan
>
> ## Where to send your bugs
>
> Bugs in the JS side of this go here:
>
> https://bugzilla.mozilla.org/enter_bug.cgi?product=Core&component=JavaScript%20Engine
>
> Bugs in the DOM side, prior to invoking UTF-8 parsing, go here:
>
> https://bugzilla.mozilla.org/enter_bug.cgi?product=Core&component=DOM:%20Core%20%26%20HTML
>
> If you don't know which you have, file a JS bug and put a needinfo on me,
> and I'll move it to the right place.
>
> Jeff
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
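
The size and strictness points in the quoted announcement can be illustrated with a small Python sketch. This is not SpiderMonkey code: the sample script text, the helper names `compressed_size` and `is_strictly_valid_utf8`, and the use of zlib as the compressor are all assumptions made for illustration. It only shows that ASCII-only source takes twice as many bytes in UTF-16 as in UTF-8, and that a strict UTF-8 decoder rejects a WTF-8-style encoded lone surrogate instead of substituting a replacement character.

```python
import zlib

# Hypothetical ASCII-only script text, standing in for a typical external script.
SOURCE = "function add(a, b) { return a + b; }\n" * 200

utf8 = SOURCE.encode("utf-8")
utf16 = SOURCE.encode("utf-16-le")  # little-endian, no BOM


def compressed_size(data: bytes) -> int:
    """Size of `data` after zlib compression (zlib is an assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def is_strictly_valid_utf8(data: bytes) -> bool:
    """True only if `data` decodes as UTF-8 with no errors and no replacement."""
    try:
        data.decode("utf-8")  # the default error handler is "strict"
    except UnicodeDecodeError:
        return False
    return True


# The lone surrogate U+D800 encoded the way WTF-8 would encode it.
# Strict UTF-8 forbids encoded surrogates, so this must be rejected.
LONE_SURROGATE_WTF8 = b"\xed\xa0\x80"

if __name__ == "__main__":
    print("raw bytes:       ", len(utf8), "UTF-8 vs", len(utf16), "UTF-16")
    print("compressed bytes:", compressed_size(utf8), "UTF-8 vs",
          compressed_size(utf16), "UTF-16")
    print("lone surrogate accepted?", is_strictly_valid_utf8(LONE_SURROGATE_WTF8))
```

The exact compressed sizes depend on the input and the compressor, so the sketch prints them rather than asserting a particular ratio.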