With the landing of bug 853301, we are now shipping ICU in desktop Firefox builds. This costs us about 10% in both download size and on-disk footprint: see https://bugzilla.mozilla.org/show_bug.cgi?id=853301#c2. After a discussion with Waldo, I'm posting some details here about where that disk footprint goes, so we can discuss whether there are things we can remove from it and whether what we get from ICU is actually worth the cost. This is particularly important because our user research team has identified Firefox download weight as an important factor affecting Firefox adoption and update rates in some markets.

On disk, the ICU data breaks into the following categories, which together come to about 9 MB:

* collation tables - 3.3 MB

These are rules for sorting strings in multiple languages and situations. See http://userguide.icu-project.org/collation for basic background. These tables are necessary for implementing Intl.Collator.

The Intl.Collator API can report which of a requested set of languages it supports (Intl.Collator.supportedLocalesOf), so an implementation is able to expose only a subset of languages. It is not clear from my reading of the specification whether browsers are expected to ship the full set of languages or only the subset matching the browser locale.

* currency tables - 1.9 MB

This is primarily the localized name of each currency in each language. The Intl.NumberFormat API uses it to format international currencies.

* timezone tables - 1.7 MB

Primarily the name of every time zone in each language. This data is necessary for implementing Intl.DateTimeFormat.

* language data - 2.1 MB

This is a bunch of other data associated with displaying information for a particular language: number formatting in various long and short formats, calendar formats and names for the various world calendar systems.
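To make the mapping from data tables to API surface concrete, here is a rough sketch of the kind of call that depends on each of the first three categories. The helper names and the locale, currency and time zone values used with them are my own illustrations, not anything taken from the Firefox tree, and the exact output strings depend on which ICU data the runtime actually ships.

```typescript
// Illustrative helpers only; the function names are hypothetical.

// Collation tables: locale-aware string comparison via Intl.Collator.
function sortForLocale(words: string[], locale: string): string[] {
  const collator = new Intl.Collator(locale);
  // Copy first so the caller's array is not reordered in place.
  return words.slice().sort(collator.compare);
}

// Currency tables: localized currency names via Intl.NumberFormat.
function formatCurrencyName(amount: number, currency: string, locale: string): string {
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency,
    currencyDisplay: "name", // ask for the spelled-out currency name
  }).format(amount);
}

// Time zone tables: localized time zone names via Intl.DateTimeFormat.
function formatWithZoneName(date: Date, timeZone: string, locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    timeZone,
    timeZoneName: "long", // ask for the full zone name, not an abbreviation
    hour: "numeric",
    minute: "numeric",
  }).format(date);
}
```

If the runtime has no data for the requested locale, these constructors do not throw; they resolve to a locale the runtime does have, so a page would see default-locale output rather than an error.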

==

Do we need this data for any language other than the language Firefox ships in? Can we just include the relevant language data in each localized build of Firefox, and allow users to get other language data via downloadable language packs, similarly to how dictionaries are handled?

Is it possible that some of this data (the collation tables?) should be in all Firefox locales, but other data (currency and timezone names) is not as important and we can ship it only in one language?

As far as I can tell, the spec allows user agents to ship whatever languages they need; the real question is what users and site authors actually need and expect from the API. (I'm reading the spec from http://wiki.ecmascript.org/doku.php?id=globalization:specification_drafts)
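For what it's worth, a site author who cannot assume a given language is present could probe for it and degrade gracefully. The following is only a sketch of that idea: both helper names are hypothetical, and the code-unit fallback is my own choice of degraded behavior, not something the spec prescribes.

```typescript
// Hypothetical helper: the subset of `requested` locales for which this
// runtime reports collation support, in the order they were requested.
function supportedCollationLocales(requested: string[]): string[] {
  return Intl.Collator.supportedLocalesOf(requested);
}

// Hypothetical helper: use a collator for the first supported preference,
// or fall back to a plain code-unit comparison when none is supported.
function makeComparator(preferences: string[]): (a: string, b: string) => number {
  const supported = supportedCollationLocales(preferences);
  if (supported.length > 0) {
    return new Intl.Collator(supported[0]).compare;
  }
  return (a, b) => (a < b ? -1 : a > b ? 1 : 0);
}
```

A page would call something like makeComparator(["sv", "en"]) once and pass the result to Array.prototype.sort; whether that is an acceptable burden on site authors is part of the question above.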

I am still working to get better numbers to quantify the cost, in terms of lost adoption, of additional download weight.

Also, we are currently duplicating the data tables in Mac universal builds, because they are compiled-in symbols. We should clearly use a separate file for these tables to avoid unnecessary download/install weight. This is now filed as bug 926980.

--BDS

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
