> On Nov 6, 2017, at 05:19, Gabriele Svelto <gsve...@mozilla.com> wrote:
>
>> On 04/11/2017 01:10, Jeff Gilbert wrote:
>> Clock speed and core count matter much more than ECC. I wouldn't chase
>> ECC support for general dev machines.
>
> The Xeon-W SKUs I posted in the previous thread all had identical or
> higher clock speeds than equivalent Core i9 SKUs and ECC support with
> the sole exception of the i9-7980XE which has slightly higher (100MHz)
> peak turbo clock than the Xeon W-2195.
>
> There is IMHO no performance-related reason to skimp on ECC support
> especially for machines that will sport a significant amount of memory.
>
> Importance of ECC memory is IMHO underestimated mostly because it's not
> common and thus users do not realize they may be hitting memory errors
> more frequently than they realize. My main workstation is now 5 years
> old and has accumulated 24 memory errors; that may not seem much but if
> it happens at a bad time, or in a bad place, they can ruin your day or
> permanently corrupt your data.
>
> As another example of ECC importance my laptop (obviously) doesn't have
> ECC support and two years ago had a single bit that went bad in the
> second DIMM. The issue manifested itself as internal compiler errors
> while building Fennec. The first time I just pulled again from central
> thinking it was a fluke, the second I updated the build dependencies
> which I hadn't done in a while thinking that an old GCC might have been
> the cause. It was not until the third day with a failure that I realized
> what was happening. A 2-hours long memory test showed me the second DIMM
> was bad so I removed it, ordered a new one and went on to check my
> machine. I had to purge my compilation cache because garbage had
> accumulated in there, run an hg verify on my repo as well as verifying
> all the installed packages for errors. Since I didn't have access to my
> main workstation at the time I had wasted 3 days chasing the issue and
> my workflow was slowed down by a cold compilation cache and a gimped
> machine (until I could replace the DIMM).
>
> This is not common, but it's not rare either and we now have hundreds of
> developers within Mozilla so people are going to run into issues that
> can be easily prevented by having ECC memory.
>
> That being said ECC memory also makes machines less susceptible to
> Rowhammer-like attacks and makes them detectable while they are happening.
>
> For a more in-depth reading on the matter I suggest reading "Memory
> Errors in Modern Systems - The Good, The Bad, and The Ugly" [1] in which
> the authors analyze memory errors on live systems over two years and
> argue that SEC-DED ECC (the type of protection you usually get on
> workstations) is often insufficient and even chipkill ECC (now common on
> most servers) is not enough to catch all errors happening during real
> world use.
>
> Gabriele
>
> [1] https://www.cs.virginia.edu/~gurumurthi/papers/asplos15.pdf

The Xeon-Ws are basically the i9s (both are Skylake-X) with support for ECC, more vPro features, and AMT. The Xeon-Ws lack Turbo 3.0 (preferred core). However, Turbo 2.0 apparently reaches the same clock speeds, so I don’t think it matters much. There are some other differences with regard to PCIe lanes, chipset, etc.

Another big difference is price. The Xeons cost a lot more.

For building Firefox, the i9s and Xeon-Ws probably perform very similarly (something we should test). It likely comes down to whether you want to pay a premium for ECC and the other Xeon-W features. I’m not in a position to answer that.

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform