Thanks, it sounds like I fixed a bug, but there’s more.

What were the specific ports, so I can test them here?

And to be clear, this is a buildworld on the RPi 2 using the cross-built world 
with CPUTYPE=armv7a or some such, right?

Warner

> On Dec 25, 2015, at 9:32 PM, Mark Millard <[email protected]> wrote:
> 
> [I am again breaking off another section of older material.]
> 
> Mixed news I'm afraid.
> 
> The specific couple of ports that I attempted did build, the same ones that 
> originally got the Bus Error in ar using (indirectly) _fseeko and memset that 
> I reported. So I expect that you fixed one error.
> 
> But when I tried to buildworld, clang++ 3.7 processing 
> usr/src/lib/clang/libllvmtablegen/ materials quickly got a Bus Error at 
> nearly the same type of instruction (it has a "!" below that the earlier one 
> did not), but with r4 holding the misaligned address this time:
> 
>> --- _bootstrap-tools-lib/clang/libllvmsupport ---
>> --- APFloat.o ---
>> clang++: error: unable to execute command: Bus error (core dumped)
>> . . .
>> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
>> . . .
>> Core was generated by `clang++'.
>> Program terminated with signal 10, Bus error.
>> #0  0x00c3bb9c in 
>> clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType
>>  ()
>> [New Thread 22a18000 (LWP 100128/<unknown>)]
>> (gdb) x/40i 0x00c3bb60
>> . . .
>> 0xc3bb9c 
>> <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:
>>    vst1.64   {d16-d17}, [r4]!
>> . . .
>> (gdb) info all-registers
>> r0             0xbfbf81a8    -1077968472
>> r1             0x22f07e14    586186260
>> r2             0xc416bc      12850876
>> r3             0x2   2
>> r4             0x22f07dfc    586186236
>> . . .
> 
> 
> Thus it appears that there is more code around that likely generates pointers
> that are not aligned strictly enough for the code generation in use for what
> is pointed to.
> 
> At this point I have no clue if the issue is just inside clang itself vs. if 
> it is in something that clang is layered on top of. Nor if there is just one 
> bad thing or many.
> 
> Note: I had not yet tried buildworld/buildkernel for the context of the "-f" 
> option that I was experimenting with earlier. So I do not have a direct 
> compare and contrast at this point.
> 
> 
> 
> Older material:
> 
> On 2015-Dec-25, at 5:21 PM, Mark Millard <[email protected]> wrote:
> 
>> On 2015-Dec-25, at 3:42 PM, Warner Losh <[email protected]> wrote:
>> 
>> 
>>> On Dec 25, 2015, at 3:14 PM, Mark Millard <[email protected]> wrote:
>>> 
>>> [I'm going to break much of the earlier "original material" text to tail of 
>>> the message.]
>>> 
>>>> On 2015-Dec-25, at 11:53 AM, Warner Losh <[email protected]> wrote:
>>>> 
>>>> So what happens if we actually fix the underlying bug?
>>>> 
>>>> I see two ways of doing this. In findfp.c, we allocate an array of FILE *
>>>> today like:
>>>>    g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>>> but that assumes that FILE just has normal pointer alignment requirements.
>>>> However, due to the mbstate having int64_t alignment requirements, this is
>>>> wrong. Maybe we need to do something like
>>>>    g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>>>> which wouldn’t change anything on LP64 systems, but would result in proper
>>>> alignment for ILP32 systems. We’d have to fix the loop that uses ALIGN
>>>> afterwards to use roundup instead: we’d need to round up to the nearest
>>>> 8-byte aligned offset (or technically, the max of ALIGNBYTES and 8, but
>>>> that’s always 8 on today’s systems). If we do this, we can make sure that
>>>> each file is 8-byte aligned or better. We may need to round up sizeof(FILE)
>>>> to a multiple of 8 as well. I believe that since it has the 8-byte
>>>> alignment for a member, its size must be a multiple of 8, but I’ve not
>>>> chased that belief to ground. If not, we may need another decorator
>>>> (__aligned(8), I think, spelled with the ugly max expression above). That
>>>> way, the contract we’re making with the compiler will always be true.
>>>> ALIGNBYTES is 4 on Arm anyway, so that bit is clearly wrong.
>>>> 
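A minimal sketch of the rounding idea described above, assuming hypothetical stand-in types (demo_file, demo_glue) and a hypothetical DEMO_ALIGN of 8. It is not the code from findfp.c or from any review; it only illustrates over-allocating by the alignment slack and rounding the first element up to an 8-byte boundary.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real libc definitions. */
struct demo_file {
	int flags;
	union {
		char    bytes[128];
		int64_t align;      /* raises the alignment to that of int64_t */
	} mbstate;
};

struct demo_glue {
	struct demo_glue *next;
	int               niobs;
	struct demo_file *iobs;
};

#define DEMO_ALIGN 8u   /* assumed figure: the max of ALIGNBYTES and 8 */

/* The element size must be a multiple of DEMO_ALIGN for every element
 * after the first to stay aligned; this holds where int64_t is 8-byte
 * aligned and fails to compile elsewhere. */
_Static_assert(sizeof(struct demo_file) % DEMO_ALIGN == 0,
    "element size must be a multiple of DEMO_ALIGN");

/* Round p up to the next multiple of DEMO_ALIGN. */
static void *
demo_roundup(void *p)
{
	uintptr_t u = (uintptr_t)p;

	return ((void *)((u + (DEMO_ALIGN - 1)) & ~(uintptr_t)(DEMO_ALIGN - 1)));
}

/* Allocate one header plus n elements in a single block.
 * The caller releases everything with free() on the returned pointer. */
static struct demo_glue *
demo_moreglue(int n)
{
	struct demo_glue *g;

	if (n <= 0)
		return (NULL);
	/* DEMO_ALIGN - 1 spare bytes cover the worst-case rounding. */
	g = malloc(sizeof(*g) + (DEMO_ALIGN - 1) +
	    (size_t)n * sizeof(struct demo_file));
	if (g == NULL)
		return (NULL);
	g->next = NULL;
	g->niobs = n;
	g->iobs = demo_roundup(g + 1);
	return (g);
}
```

With this layout, demo_moreglue(3) should hand back an iobs pointer that is a multiple of 8 even if the allocator itself only promised 4-byte alignment. Whether the real fix looks anything like this is a separate question.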
>>>> This wouldn’t be an ABI change, since you can only get a valid FILE * from
>>>> fopen (and friends), plus stdin, stdout, and stderr. Those addresses aren’t
>>>> hard coded into binaries, so even if we have to tweak the last three and
>>>> deal with some ‘fake’ FILE abuse in libc (which I don’t think suffers from
>>>> this issue, btw, given the alignment requirements that would naturally
>>>> follow from something on the stack), we’d still be ahead. At least for all
>>>> CONFORMING implementations[*]...
>>>> 
>>>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are
>>>> a band-aid.
>>>> 
>>>> Warner
>>>> 
>>>> [*] There’s at least one popular package that has a copy of the FILE
>>>> structure in one of its .h files and uses that to do unnatural
>>>> optimization things, but even that’s cool, I think, since it never
>>>> allocates a new one.
>>>> 
>>> 
>>> The ARM documentation mentions cases of 16 byte alignment requirements. 
>>> I've no clue if the clang code generation ever creates such code. There 
>>> might be wider requirements possible in arm code as well. (I'm not an arm 
>>> expert.) As an example of an implication: "The malloc() function returns a 
>>> pointer to a block of at least size bytes suitably aligned for any use." In 
>>> other words: aligned to some figure that is a multiple of *every* alignment 
>>> requirement that the code generator can produce, possibly being the least 
>>> common multiple.
>>> 
>>> "-fmax-type-align=. . ." is a means of limiting the range of potential
>>> alignments to no more than a fixed, predefined value. For anything above
>>> that value, the code generation has to work in smaller accesses and build
>>> up or split up the bigger values. Using "-fmax-type-align=. . ." allows
>>> defining a figure as part of an ABI that is then not subject to code
>>> generator updates that could increase the maximum alignment figure and
>>> break things: it turns off such new capabilities. Other options need not
>>> work that way to preserve the ABI.
>> 
>> That’s true, as far as it goes… But I’m not sure it goes far enough. The 
>> premise here is that the problem is wide-spread, when in fact I think it is 
>> quite narrow.
>> 
>>> But in the most fundamental terms process wise as far as I can tell. . .
>>> 
>>> While the FILE case that occurred is a specific example, every 
>>> memory-allocation-like operation is at a potential issue for all such 
>>> "allocated" objects where the related code generation requires alignment to 
>>> avoid Bus Error (given the SCTLR bit[1] in use).
>> 
>> The problem isn’t general. The problem isn’t malloc. Malloc will generally 
>> return the right thing on arm (and if it doesn’t,
>> then we need to make sure it does).
>> 
>> The problem is we get a boatload of FILEs from the system all at once, and 
>> those are misaligned because of a bug in the code. One that’s fixed, I 
>> believe, in https://reviews.freebsd.org/D4708.
>> 
>> 
>>> How many other places in FreeBSD might sometimes return mis-aligned 
>>> pointers for the existing code generation and ABI combination?
>> 
>> It isn’t an ABI thing, just a code bug thing. The only reason it was an 
>> issue was due to the optimizing nature of clang.
>> 
>> We’ve had to deal with the arm alignment issues for years. I wager there are
>> very few indeed. The only reason this was brought to light was better
>> code-gen from clang.
>> 
>>> How many other places are subject to breakage when "internal" 
>>> structs/unions/fields involved are changed to be of a different size 
>>> because the code is not fully auto-adjusting to match the code generation 
>>> yet --even if right now "it works"? How fragile will things be for future 
>>> work?
>> 
>> If there are others, I’ll bet they could be counted on one hand since very 
>> few things do the ‘slab’ allocator that FILE does.
>> 
>>> What would it take to find out and deal with them all? (I do not have the 
>>> background knowledge to span much.)
>>> 
>>> My experiment avoided potentially changing parts of the ABI and also 
>>> avoided dealing with such a "lots of code to investigate" issue. It may not 
>>> be the long term 11.0-RELEASE solution. Even if not, it may be appropriate 
>>> for various temporary purposes that need to avoid Bus Errors in the 
>>> process. For example if Ian has a good reason to use clang 3.7 instead of 
>>> gcc 4.2.1.
>> 
>> The review above doesn’t change the ABI either.
>> 
>>> Other notes:
>>> 
>>>> I believe that since it has the 8-byte alignment
>>>> for a member, its size must be a multiple of 8
>>> 
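On the size question quoted just above, here is a small sketch with a hypothetical struct (demo_rec is not taken from any FreeBSD header). It only illustrates the general C rule that array elements are laid out back to back, so sizeof of a type has to be a multiple of its alignment. It says nothing about what figure a given ABI picks for that alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: a narrow member, a wide member, a trailing short. */
struct demo_rec {
	int     a;
	int64_t wide;
	short   tail;
};

/* Arrays have no gaps between elements, so the size of a type is always
 * a multiple of its alignment (trailing padding is added if needed). */
_Static_assert(sizeof(struct demo_rec) % _Alignof(struct demo_rec) == 0,
    "sizeof is a multiple of the alignment");

/* Byte distance between two consecutive array elements. */
static size_t
demo_stride(void)
{
	struct demo_rec v[2];

	return ((size_t)((char *)&v[1] - (char *)&v[0]));
}
```

demo_stride() comes out equal to sizeof(struct demo_rec), and the struct's alignment is at least that of its int64_t member, whatever the implementation defines that to be.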
>>> There are some C/C++ language rules about the address of a structure 
>>> equalling the address of the first field, uniformity of the offsets, and 
>>> the like. But. . .
>>> 
>>> The C and C++ languages specify no specific numerical alignment figures, 
>>> not even relative to specific sizeof(...) expressions. To use an old 
>>> example: a 68010 only needs alignment for >= 2 byte things and even 
>>> alignment is all that is then required. Some other contexts take a lot more 
>>> to meet the specifications. There are some implications of the modern 
>>> memory model(s) created to cover concurrency explicitly, such as avoiding 
>>> interactions that can happen via, for example, separate objects (in part) 
>>> sharing a cache line. (I've only looked at C++ for this, and only to a 
>>> degree.)
>>> 
>>> The detailed alignment rules are more "implementation defined" than 
>>> "predefined by the standard". But the definition is trying to meet language 
>>> criteria. It is not a fully independent choice.
>> 
>> Many of them are actually defined by a combination of the standard language 
>> definition, as well as the ABI standard. This is why we know that mbstate_t 
>> must be 8 byte aligned.
>> 
>>> Maybe some other standards that FreeBSD is tied to are more specific, such
>>> as requiring that an N byte integer always aligns to some multiple of N (a
>>> waste on the 68010), with the alignment of any union or struct that
>>> contains it tracking that. But such rules force padding that may or may
>>> not be required to meet the language's more abstract criteria and such
>>> rules may not match the existing/in-use ABI.
>> 
>> It is all spelled out in the ARM EABI docs.
>> 
>>> So far as I can tell explicitly declared alignments may well be necessary. 
>>> If that one "popular package", say, formed an array of FILE copies then the 
>>> resultant alignments need not all match the ones produced by your example 
>>> code unless the FILE declaration forces the compiler to match, causing 
>>> sizeof(FILE) to track as well. FILE need not be the only such issue.
>> 
>> Arrays of FILEs aren’t an issue (except that they encode the size of FILE
>> into the app). It’s the specifically quirky way that libc does it that’s the
>> problem.
>> 
>>> My background and reference material are mostly tied to the languages --and
>>> so my notes tend to be limited to that much context.
>> 
>> Understood. While there may be issues with alignment still, tossing a big 
>> hammer at the problem because they might exist will likely mean they will 
>> persist far longer than fixing them one at a time. When we first ported to 
>> arm, there were maybe half a dozen places that needed fixing. I doubt 
>> there’s more now.
>> 
>> Can you try the patch in the above code review w/o the -f switch and let me 
>> know if it works for you?
>> 
>> Warner
> 
> buildworld/buildkernel has been started on amd64 for a rpi2 target. That and 
> install kernel/world and starting up a port rebuild on the rpi2 and waiting 
> for it means it will be a few hours even if I start the next thing just as 
> each prior thing finishes. I may give up and go to sleep first.
> 
> As for presumptions: I'll take your word on the expected status of things.
> I've no clue. But absent even the hearsay status information at the time, I
> did not presume that what was in front of me was all there is to worry about
> --nor did I try to go figure it all out on my own. I took a path to cover
> both possibilities for local-only vs. more-wide-spread (so long as that path
> did not force a split-up of some larger form of atomic action).
> 
> In my view "-mno-unaligned-access" is an even bigger hammer than I used. I 
> find no clang statement about what its ABI consequences would be, unlike for 
> what I did: What mix of more padding for alignment vs. more but smaller 
> accesses? But as I remember I've seen "-mno-unaligned-access" in use in ports 
> and the like so its consequences may be familiar material for some folks.
> 
> Absent any questions about ABI consequences "-mno-unaligned-access" does well 
> mark the expected SCTLR bit[1] status, far better than what I did. Again: I 
> was covering my ignorance while making any significant 
> investigation/debugging as unlikely as I could.
> 
> 
>> Original material:
>> 
>>> On Dec 25, 2015, at 7:24 AM, Mark Millard <[email protected]> wrote:
>>> 
>>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 
>>> 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the 
>>> crashes during the toolchain activity: no more misaligned accesses in 
>>> libc's _fseeko or elsewhere.]
>>> 
>>> On 2015-Dec-25, at 12:31 AM, Mark Millard <[email protected]> wrote:
>>> 
>>>> On 2015-Dec-24, at 10:39 PM, Mark Millard <[email protected]> wrote:
>>>> 
>>>>> [I do not know if this partial crash analysis related to on-arm 
>>>>> clang-associated activity is good enough and appropriate to submit or 
>>>>> not.]
>>>>> 
>>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below 
>>>>> came from pkg install activity instead of port building. Used as-is.
>>>>> 
>>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), 
>>>>> /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the 
>>>>> following suggests an alignment error for the type of instructions that 
>>>>> memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code 
>>>>> used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to 
>>>>> check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>>> 
>>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>>>> 
>>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru 
>>>>>> .libs/libgnuintl.a  bindtextdom.o dcgettext.o dgettext.o gettext.o 
>>>>>> finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o 
>>>>>> l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o 
>>>>>> ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o 
>>>>>> relocatable.o langprefs.o localename.o log.o printf.o setlocale.o 
>>>>>> version.o xsize.o osdep.o intl-compat.o
>>>>>> Bus error (core dumped)
>>>>>> *** [libgnuintl.la] Error code 138
>>>>> 
>>>>> It failed in _fseeko doing a memset that turned into uses of
>>>>> "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0
>>>>> that ended in 0xa4, so was not aligned to an 8-byte boundary. From what I
>>>>> read, such "VSTn (multiple n-element structures)" instructions that have
>>>>> .64 require 8-byte alignment. The evidence of the code and register value
>>>>> follows.
>>>>> 
>>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar 
>>>>>> /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>>> . . .
>>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, 
>>>>>> whence=<value optimized out>, ltest=<value optimized out>) at 
>>>>>> /usr/src/lib/libc/stdio/fseek.c:299
>>>>>> 299              memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>>> . . .
>>>>>> (gdb) x/24i 0x2033adb0
>>>>>> 0x2033adb0 <_fseeko+836>:        vmov.i32        q8, #0  ; 0x00000000
>>>>>> 0x2033adb4 <_fseeko+840>:        movw    r1, #65503      ; 0xffdf
>>>>>> 0x2033adb8 <_fseeko+844>:        stm     r4, {r0, r7}
>>>>>> 0x2033adbc <_fseeko+848>:        ldrh    r0, [r4, #12]
>>>>>> 0x2033adc0 <_fseeko+852>:        and     r0, r0, r1
>>>>>> 0x2033adc4 <_fseeko+856>:        strh    r0, [r4, #12]
>>>>>> 0x2033adc8 <_fseeko+860>:        add     r0, r4, #216    ; 0xd8
>>>>>> 0x2033adcc <_fseeko+864>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033add0 <_fseeko+868>:        add     r0, r4, #200    ; 0xc8
>>>>>> 0x2033add4 <_fseeko+872>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033add8 <_fseeko+876>:        add     r0, r4, #184    ; 0xb8
>>>>>> 0x2033addc <_fseeko+880>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ade0 <_fseeko+884>:        add     r0, r4, #168    ; 0xa8
>>>>>> 0x2033ade4 <_fseeko+888>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ade8 <_fseeko+892>:        add     r0, r4, #152    ; 0x98
>>>>>> 0x2033adec <_fseeko+896>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033adf0 <_fseeko+900>:        add     r0, r4, #136    ; 0x88
>>>>>> 0x2033adf4 <_fseeko+904>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033adf8 <_fseeko+908>:        add     r0, r4, #120    ; 0x78
>>>>>> 0x2033adfc <_fseeko+912>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ae00 <_fseeko+916>:        add     r0, r4, #104    ; 0x68
>>>>>> 0x2033ae04 <_fseeko+920>:        vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ae08 <_fseeko+924>:        b       0x2033b070 <_fseeko+1540>
>>>>>> 0x2033ae0c <_fseeko+928>:        cmp     r5, #0  ; 0x0
>>>>>> (gdb) info all-registers
>>>>>> r0             0x20651ea4        543497892
>>>>>> r1             0xffdf    65503
>>>>>> r2             0x0       0
>>>>>> r3             0x0       0
>>>>>> r4             0x20651dcc        543497676
>>>>>> r5             0x0       0
>>>>>> r6             0x0       0
>>>>>> r7             0x0       0
>>>>>> r8             0x20359df4        540384756
>>>>>> r9             0x0       0
>>>>>> r10            0x0       0
>>>>>> r11            0xbfbfb948        -1077954232
>>>>>> r12            0x2037b208        540520968
>>>>>> sp             0xbfbfb898        -1077954408
>>>>>> lr             0x2035a004        540385284
>>>>>> pc             0x2033adcc        540257740
>>>>>> f0             0 (raw 0x000000000000000000000000)
>>>>>> f1             0 (raw 0x000000000000000000000000)
>>>>>> f2             0 (raw 0x000000000000000000000000)
>>>>>> f3             0 (raw 0x000000000000000000000000)
>>>>>> f4             0 (raw 0x000000000000000000000000)
>>>>>> f5             0 (raw 0x000000000000000000000000)
>>>>>> f6             0 (raw 0x000000000000000000000000)
>>>>>> f7             0 (raw 0x000000000000000000000000)
>>>>>> fps            0x0       0
>>>>>> cpsr           0x60000010        1610612752
>>>>> 
>>>>> The syntax in use for vst1.64 instructions does not explicitly have the 
>>>>> alignment notation. Presuming that the decoding is correct then from what 
>>>>> I read the following applies:
>>>>> 
>>>>>> Home > NEON and VFP Programming > NEON load and store element and 
>>>>>> structure instructions > Alignment restrictions in load and store, 
>>>>>> element and structure instructions
>>>>>> 
>>>>>> . . . When the alignment is not specified in the instruction, the 
>>>>>> alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>>>  •       if the A bit is 0, there are no alignment restrictions (except 
>>>>>> for strongly ordered or device memory, where accesses must be element 
>>>>>> aligned or the result is unpredictable)
>>>>>>  •       if the A bit is 1, accesses must be element aligned.
>>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>> 
>>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error 
>>>>> would have the context to happen because of the mis-alignment.
>>>>> 
>>>>> The following shows the make.conf context that explains how 
>>>>> /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>> 
>>>>>> # more /etc/make.conf
>>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>>> WITH_DEBUG=
>>>>>> WITH_DEBUG_FILES=
>>>>>> MALLOC_PRODUCTION=
>>>>>> #
>>>>>> TO_TYPE=armv6
>>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>>> .if ${.MAKE.LEVEL} == 0
>>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>>>>> -march=armv7a
>>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>>>>> -march=armv7a
>>>>>> .export CC
>>>>>> .export CXX
>>>>>> .export CPP
>>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>>> .export AS
>>>>>> .export AR
>>>>>> .export LD
>>>>>> .export NM
>>>>>> .export OBJCOPY
>>>>>> .export OBJDUMP
>>>>>> .export RANLIB
>>>>>> .export SIZE
>>>>>> .export STRINGS
>>>>>> .endif
>>>>> 
>>>>> 
>>>>> Other context:
>>>>> 
>>>>>> # freebsd-version -ku; uname -aKU
>>>>>> 11.0-CURRENT
>>>>>> 11.0-CURRENT
>>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 
>>>>>> 22:02:21 PST 2015     
>>>>>> root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 
>>>>>> 1100091 1100091
>>>>> 
>>>>> 
>>>>> 
>>>>> I will note that world and kernel are my own build of -r292413 (earlier 
>>>>> experiment) --a build made from an amd64 host context and put in place 
>>>>> via DESTDIR=. My expectation would be that the amd64 context would not be 
>>>>> likely to have similar alignment restrictions involved in its ar activity 
>>>>> (or other activity). That would explain how I got this far using such a 
>>>>> clang 3.7 related toolchain for targeting an rpi2 before finding such a 
>>>>> problem.
>>>> 
>>>> 
>>>> I realized re-reading all of the above that it seems to suggest that the
>>>> _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but
>>>> that was not my intent.
>>>> 
>>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>> 
>>>> Reading symbols from /lib/libc.so.7...Reading symbols from 
>>>> /usr/lib/debug//lib/libc.so.7.debug...done.
>>>> done.
>>>> Loaded symbols for /lib/libc.so.7
>>>> 
>>>> 
>>>> head/sys/sys/_types.h has:
>>>> 
>>>> /*
>>>> * mbstate_t is an opaque object to keep conversion state during multibyte
>>>> * stream conversions.
>>>> */
>>>> typedef union {
>>>>  char            __mbstate8[128];
>>>>  __int64_t       _mbstateL;      /* for alignment */
>>>> } __mbstate_t;
>>>> 
>>>> suggesting an implicit alignment of the union to whatever the 
>>>> implementation defines for __int64_t --which need not be 8 byte alignment 
>>>> (in the abstract, general case). But 8 byte alignment is a possibility as 
>>>> well (in the abstract).
>>>> 
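To make the point about the union concrete, here is a small sketch using a copy of its shape under a hypothetical name (demo_mbstate_t; demo_holder is likewise made up). It only shows that the union takes its alignment from the int64_t member and that a containing struct places it at a matching offset. The actual number is whatever the implementation defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the __mbstate_t quoted above, under a hypothetical name. */
typedef union {
	char    bytes[128];
	int64_t align_hint;     /* present only to raise the alignment */
} demo_mbstate_t;

/* Hypothetical container standing in for a struct that embeds the union. */
struct demo_holder {
	char           tag;
	demo_mbstate_t st;
};

/* Alignment of the union as the compiler sees it. */
static size_t
demo_union_align(void)
{
	return (_Alignof(demo_mbstate_t));
}

/* Offset of the embedded union from the start of the container. */
static size_t
demo_member_offset(void)
{
	return (offsetof(struct demo_holder, st));
}
```

The offset is always a multiple of the union's alignment, but that only yields an aligned member address if the container itself starts at a suitably aligned address.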
>>>> But printing *fp in gdb for the fp argument to _fseeko reports the same 
>>>> not-8-byte aligned address for __mbstate8 that was in r0:
>>>> 
>>>>> (gdb) bt
>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, 
>>>>> whence=<value optimized out>, ltest=<value optimized out>) at 
>>>>> /usr/src/lib/libc/stdio/fseek.c:299
>>>>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) 
>>>>> at /usr/src/lib/libc/stdio/fseek.c:82
>>>>> #2  0x00016138 in ?? ()
>>>>> (gdb) print fp
>>>>> $2 = (FILE *) 0x20651dcc
>>>>> (gdb) print *fp
>>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf 
>>>>> = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 
>>>>> 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 
>>>>> 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 
>>>>> 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, 
>>>>> _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 
>>>>> = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>> 
>>>> The overall FILE struct containing the _mbstate field is also not 8-byte 
>>>> aligned. But the offset from the start of the FILE struct to __mbstate8 is 
>>>> a multiple of 8 bytes.
>>>> 
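The address arithmetic behind that observation can be checked mechanically. A small sketch, using only the values from the gdb output quoted above (the helper and macro names are made up for illustration):

```c
#include <stdint.h>

/* Values copied from the gdb output quoted above. */
#define FP_ADDR      UINT32_C(0x20651dcc)  /* fp, also r4 */
#define MBSTATE_ADDR UINT32_C(0x20651e34)  /* fp->_mbstate.__mbstate8 */
#define FAULT_ADDR   UINT32_C(0x20651ea4)  /* r0 at the faulting vst1.64 */

/* Remainder of addr relative to an 8-byte boundary; 0 means aligned. */
static uint32_t
demo_mod8(uint32_t addr)
{
	return (addr & 7u);
}

/* Offset of the _mbstate member from the start of the FILE object. */
static uint32_t
demo_mbstate_offset(void)
{
	return (MBSTATE_ADDR - FP_ADDR);
}
```

The member offset is 104 (0x68), a multiple of 8, while fp itself is 4 past an 8-byte boundary, so every address inside _mbstate that the memset touches inherits that remainder of 4. The faulting r0 is fp + 216 (0xd8), matching the "add r0, r4, #216" just before the store.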
>>>> It is my interpretation that there is nothing here to justify the memset 
>>>> implementation combination:
>>>> 
>>>> SCTLR bit[1]==1
>>>> 
>>>> mixed with
>>>> 
>>>> vst1.64 instructions
>>>> 
>>>> I.e.: one or both need to change unless some way of forcing 8-byte
>>>> alignment is introduced.
>>>> 
>>>> I have not managed to track down anything that would indicate FreeBSD's 
>>>> intent for SCTLR bit[1]. I do not even know if it is required by the 
>>>> design to be constant (once initialized).
>>> 
>>> 
>>> I have (so far) removed the build tool crashes based on adding 
>>> -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>> 
>>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks 
>>> like:
>>> 
>>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> FROM_TYPE=amd64
>>>> TOOLS_FROM_TYPE=x86_64
>>>> VERSION_CONTEXT=11.0
>>>> #
>>>> KERNCONF=RPI2-NODBG
>>>> TARGET=arm
>>>> .if ${.MAKE.LEVEL} == 0
>>>> TARGET_ARCH=${TO_TYPE}
>>>> .export TARGET_ARCH
>>>> .endif
>>>> #
>>>> WITHOUT_CROSS_COMPILER=
>>>> #
>>>> # For WITH_BOOT= . . .
>>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation 
>>>> R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a 
>>>> shared object; recompile with -fPIC
>>>> WITHOUT_BOOT=
>>>> #
>>>> WITH_FAST_DEPEND=
>>>> WITH_LIBCPLUSPLUS=
>>>> WITH_CLANG=
>>>> WITH_CLANG_IS_CC=
>>>> WITH_CLANG_FULL=
>>>> WITH_LLDB=
>>>> WITH_CLANG_EXTRAS=
>>>> #
>>>> WITHOUT_LIB32=
>>>> WITHOUT_GCC=
>>>> WITHOUT_GNUCXX=
>>>> #
>>>> NO_WERROR=
>>>> MALLOC_PRODUCTION=
>>>> #CFLAGS+= -DELF_VERBOSE
>>>> #
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> #
>>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
>>>> #
>>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>>> X_COMPILER_TYPE=clang
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>>> -fmax-type-align=4
>>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>>> -march=armv7a -fmax-type-align=4
>>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>>> -march=armv7a -fmax-type-align=4
>>>> .export XCC
>>>> .export XCXX
>>>> .export XCPP
>>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export XAS
>>>> .export XAR
>>>> .export XLD
>>>> .export XNM
>>>> .export XOBJCOPY
>>>> .export XOBJDUMP
>>>> .export XRANLIB
>>>> .export XSIZE
>>>> .export XSTRINGS
>>>> .endif
>>>> #
>>>> # Host compiler stuff:
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>> 
>>> make.conf for during the on-rpi2 port builds now looks like:
>>> 
>>>> $ more /etc/make.conf
>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> MALLOC_PRODUCTION=
>>>> #
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>>> -fmax-type-align=4
>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>>> -fmax-type-align=4
>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>>> -march=armv7a -fmax-type-align=4
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>> 
>>> 
>>> 
>>> ===
>>> Mark Millard
>>> markmi at dsl-only.net
>>> 
>>> 
>>> 
> 
> 
> 
