> On Nov 6, 2017, at 05:19, Gabriele Svelto <gsve...@mozilla.com> wrote:
> 
>> On 04/11/2017 01:10, Jeff Gilbert wrote:
>> Clock speed and core count matter much more than ECC. I wouldn't chase
>> ECC support for general dev machines.
> 
> The Xeon-W SKUs I posted in the previous thread all had identical or
> higher clock speeds than equivalent Core i9 SKUs and ECC support with
> the sole exception of the i9-7980XE which has slightly higher (100MHz)
> peak turbo clock than the Xeon W-2195.
> 
> There is IMHO no performance-related reason to skimp on ECC support
> especially for machines that will sport a significant amount of memory.
> 
> The importance of ECC memory is IMHO underestimated mostly because it's
> not common, and thus users do not realize how frequently they may be
> hitting memory errors. My main workstation is now 5 years old and has
> accumulated 24 memory errors; that may not seem like much, but if one
> happens at a bad time, or in a bad place, it can ruin your day or
> permanently corrupt your data.
> 
> As another example of ECC importance my laptop (obviously) doesn't have
> ECC support and two years ago had a single bit that went bad in the
> second DIMM. The issue manifested itself as internal compiler errors
> while building Fennec. The first time I just pulled again from central
> thinking it was a fluke, the second I updated the build dependencies
> which I hadn't done in a while thinking that an old GCC might have been
> the cause. It was not until the third day with a failure that I realized
> what was happening. A 2-hour-long memory test showed me the second DIMM
> was bad so I removed it, ordered a new one and went on to check my
> machine. I had to purge my compilation cache because garbage had
> accumulated in there, run an hg verify on my repo as well as verifying
> all the installed packages for errors. Since I didn't have access to my
> main workstation at the time I had wasted 3 days chasing the issue and
> my workflow was slowed down by a cold compilation cache and a gimped
> machine (until I could replace the DIMM).
> 
> This is not common, but it's not rare either and we now have hundreds of
> developers within Mozilla so people are going to run into issues that
> can be easily prevented by having ECC memory.
> 
> That being said, ECC memory also makes machines less susceptible to
> Rowhammer-like attacks and makes such attacks detectable while they are
> happening.
> 
> For more in-depth reading on the matter I suggest "Memory
> Errors in Modern Systems - The Good, The Bad, and The Ugly" [1] in which
> the authors analyze memory errors on live systems over two years and
> argue that SEC-DED ECC (the type of protection you usually get on
> workstations) is often insufficient and even chipkill ECC (now common on
> most servers) is not enough to catch all errors happening during real
> world use.
> 
> Gabriele
> 
> [1] https://www.cs.virginia.edu/~gurumurthi/papers/asplos15.pdf
> 

The Xeon-Ws are basically the i9s (both Skylake-X) with support for ECC, more
vPro, and AMT. The Xeon-Ws lack Turbo 3.0 (preferred core). However, Turbo 2.0
apparently reaches the same MHz, so I don’t think it matters much. There are
some other differences with regard to PCIe lanes, chipset, etc.

Another big difference is price. The Xeons cost a lot more.

For building Firefox, the i9s and Xeon-Ws probably perform very similarly
(something we should test). It likely comes down to whether you want to pay a
premium for ECC and other Xeon-W features. I’m not in a position to answer that.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
