On Mon, Feb 09, 2015 at 08:51:02PM +1100, Bruce Evans wrote:
> On Mon, 9 Feb 2015, Konstantin Belousov wrote:
>
> > On Mon, Feb 09, 2015 at 05:00:49PM +1100, Bruce Evans wrote:
> >> On Mon, 9 Feb 2015, Oleksandr Tymoshenko wrote:
> >> ...
> >> I think the full bugs only occur when arch has strict alignment
> >> requirements and the alignment of the __packed objects is not known.
> >> This means that only lesser bugs occur on x86 (unless you enable
> >> alignment checking, but this arguably breaks the ABI).  The compiler
> >> just generates possibly-misaligned full-width accesses if the arch
> >> doesn't have strict alignment requirements.  Often the accesses turn
> >> out to be aligned at runtime.  Otherwise, the hardware does them
> >> atomically, with a smaller efficiency penalty than split accesses.
> >
> > On x86 unaligned access is non-atomic.  This was very visible on
> > Core2 CPUs where DPCPU code mishandled the alignment, resulting in
> > the mutexes from the per-cpu areas breaking badly.
> >
> > Modern CPUs should not lock several cache lines simultaneously either.
>
> Interesting.  I thought that this was relatively easy to handle in
> hardware and required for compatibility, so hardware did it.

Trying to lock two cache lines easily results in deadlock.  FWIW,
multi-socket Intel platforms are already deadlock-prone due to the
cache, and have some facilities to debug this.

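To illustrate when a misaligned access would have to touch two cache
lines, here is a minimal sketch.  The 64-byte line size and the helper
name crosses_cache_line() are assumptions made for the example, not
something taken from the tree or from this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is typical for current x86 CPUs. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: returns true if the byte range
 * [addr, addr + size) spans more than one cache line.
 */
static bool
crosses_cache_line(uintptr_t addr, size_t size)
{
	if (size == 0)
		return (false);
	/* Compare the line index of the first and of the last byte. */
	return (addr / CACHE_LINE_SIZE !=
	    (addr + size - 1) / CACHE_LINE_SIZE);
}
```

A naturally aligned 8-byte object never crosses a line under this
assumption, while an 8-byte member of a __packed structure that happens
to sit at offset 60 of a line does, which is the case where a single
access needs both lines.
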
>
> This gives a reason other than efficiency to enable alignment checking
> so as to find all places that do misaligned accesses.  I last tried this
> more than 20 years ago.  Compilers mostly generated aligned accesses.
> One exception was for copying small (sub)structs.  Inlining of the copy
> assumed maximal alignment or no alignment traps.  Library functions are
> more of a problem.  FreeBSD amd64 and i386 memcpy also assume this.
> Similarly for the MD mem* in the kernel.  Mostly things are suitably
> aligned, so it is the correct optimization to not do extra work to align.

I also did experiments with a preloadable DSO which sets the EFLAGS.AC
bit.  Last time I tried, it broke in the very early libc initialization
code, due to an unaligned access generated by the compiler, as you
described.  This was with the in-tree gcc.  When I tried with a
clang-compiled world, I got SIGBUS due to an unaligned access in
ld-elf.so.1.  AC does not work in ring 0, and Intel recently re-purposed
the bit in the kernel for 'security' theater.
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"