On Thu, Apr 24, 2014 at 4:20 PM, Benoit Jacob <jacob.benoi...@gmail.com> wrote:
> 2014-04-24 8:31 GMT-04:00 Henri Sivonen <hsivo...@hsivonen.fi>:
>
>> I have prepared a queue of patches that removes Netscape-era (circa
>> 1999) internationalization code that efforts to implement the Encoding
>> Standard have shown unnecessary to have in Firefox. This makes libxul
>> on ARMv7 smaller by 181 KB, so that's a win.
>
> Have we measured the impact of this change on actual memory usage (as
> opposed to virtual address space size) ?

No, we haven't. I don't have a B2G phone, but I could give my whole
patch queue in one diff to someone who wants to try.

> Have we explored how much this problem could be automatically helped by the
> linker being smart about locality?

Not to my knowledge, but I'm very skeptical about getting these
benefits by having the linker be smart so that the dead code ends up
on memory pages that aren't actually mapped to real RAM.

The code that is no longer in use is sufficiently intermingled with
code that's still in use. Useful and useless plain old C data is
included side-by-side. Useful and useless classes are included next to
each other in unified compilation units. Since the classes are
instantiated via XPCOM, a linker that's unaware of XPCOM couldn't tell
via static analysis that some classes are in use and some aren't. All
of them would look equally dead or alive, depending on what view you
take on the root of the caller chain being function pointers in a
contract ID table.

Using PGO to determine what's dead code and what's not wouldn't work,
either, if the profiling run was "load mozilla.org", because the run
would exercise too little code, or if the profiling run was "all the
unit tests", because the profiling run would exercise too much code.

On Fri, Apr 25, 2014 at 2:03 AM, Ehsan Akhgari <ehsan.akhg...@gmail.com> wrote:
>> * Are we building and shipping dead code in ICU on B2G?
>
> No. That is at least partly covered by bug 864843.

Using system ICU seems wrong in terms of correctness. That's the
reason why we don't use system ICU on Mac and desktop Linux, right?

For a given phone, the Android base system practically never updates,
so for a given Firefox version, the Web-exposed APIs would have as
many behaviors as there are differing ICU snapshots on different
Android versions out there.

As for B2G, considering that Gonk is supposed to update less often
than Gecko, it seems like a bad idea to have ICU be part of Gonk
rather than part of Gecko on B2G.

> In my experience, ICU is unfortunately a hot potato. :( The real blocker
> there is finding someone who can tell us what bits of ICU _are_ used in the
> JS engine.

Apart from ICU initialization/shutdown, the callers seem to be
http://mxr.mozilla.org/mozilla-central/source/js/src/builtin/Intl.cpp
and
http://mxr.mozilla.org/mozilla-central/source/js/src/jsstr.cpp#852 .

So the JS engine uses:
 * Collation
 * Number formatting
 * Date and time formatting
 * Normalization

It looks like the JS engine has its own copy of the Unicode database
for other purposes. It seems like that should be unified with ICU so
that there'd be only one copy of the Unicode database.

Additionally, we should probably rewrite nsCollation users to use ICU
collation and delete nsCollation.

Therefore, it looks like we should turn off (if we haven't already):
 * The ICU LayoutEngine.
 * Ustdio
 * ICU encoding converters and their mapping tables.
 * ICU break iterators and their data.
 * ICU transliterators and their data.

http://apps.icu-project.org/datacustom/ gives a good idea of what
there is to turn off.

> The parts used in Gecko for <input type=number> are pretty
> small. And of course someone needs to figure out the black magic of
> conveying the information to the ICU build system.

So it looks like we already build with UCONFIG_NO_LEGACY_CONVERSION:
http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#264

However, that flag is misdesigned in the sense that it considers
US-ASCII, ISO-8859-1, UTF-7, UTF-32, CESU-8, SCSU and BOCU-1 as
non-legacy, even though, frankly, those are legacy, too. (UTF-16 is
legacy also, but it's legacy we need, since both ICU and Gecko are
UTF-16 legacy code bases!)
http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#267

So I guess the situation isn't quite as bad as I thought.

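The survey of ICU callers earlier in this message could be partly
mechanized. Below is a minimal sketch, assuming the callers use ICU's
C API directly: it counts calls by function-name prefix (ucol_, unum_,
udat_, unorm_/unorm2_, ucnv_, ubrk_, utrans_) in a piece of source
text. The prefixes are ICU's own naming convention; the table, the
function name icu_services_used and the sample text are made up for
illustration, and the scan won't see C++ class usage or calls hidden
behind macros.

```python
import re
from collections import Counter

# ICU C API function-name prefixes mapped to the service they belong to.
# The grouping into services is an assumption made for this sketch.
ICU_PREFIXES = {
    "ucol": "collation",
    "unum": "number formatting",
    "udat": "date and time formatting",
    "unorm": "normalization",
    "unorm2": "normalization",
    "ucnv": "encoding converters",
    "ubrk": "break iterators",
    "utrans": "transliterators",
}

# Match identifiers such as ucol_open or unorm2_getNFCInstance that are
# immediately followed by an opening parenthesis (i.e. look like calls).
CALL_RE = re.compile(
    r"\b("
    + "|".join(sorted(ICU_PREFIXES, key=len, reverse=True))
    + r")_\w+\s*\("
)


def icu_services_used(source_text):
    """Count apparent ICU C API calls in source_text, grouped by service."""
    counts = Counter()
    for match in CALL_RE.finditer(source_text):
        counts[ICU_PREFIXES[match.group(1)]] += 1
    return counts


# Hypothetical caller code, not taken from mozilla-central.
SAMPLE = """
UCollator *coll = ucol_open(locale, &status);
UNumberFormat *nf = unum_open(UNUM_DECIMAL, NULL, 0, locale, NULL, &status);
int32_t n = unorm_normalize(src, len, UNORM_NFC, 0, dest, cap, &status);
"""

for service, count in sorted(icu_services_used(SAMPLE).items()):
    print(service, count)
```

Running something like this over the files that include ICU headers
would give a first approximation of which services have callers and
which have none, which a human would still need to confirm.
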
We should probably set UCONFIG_NO_CONVERSION to 1 and
U_CHARSET_IS_UTF8 to 1 per:
http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#248
After all, we should easily be able to make sure that we don't use
non-UTF-8 encodings when passing char* to ICU.

Also, if the ICU build system isn't configurable enough, I think we
should consider identifying what parts of ICU we can delete even
though the build system doesn't let us turn them off, and then
automating the deletion as a script so that it can be repeated with
future imports of ICU.

>> * Do we have any mechanisms in place for preventing stuff like the
>> ICU encoding converters becoming part of the build in the future?
>
> No, that is not possible to automate.

I was thinking of policy / review solutions.

>> * How should we identify code that we build but that isn't used
>> anywhere?
>
> I'm afraid we need humans for that.

Yeah, but how do we get humans to do that?

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform