[The __packed removal patch alone was sufficient to eliminate
the hang that I originally reported for the print/texinfo
build in poudriere.]
On 2018-Oct-27, at 4:33 PM, Mark Millard <marklmi at yahoo.com> wrote:

> [Some of this discussion occurred off list. The point here
> is not specific to the hang that I originally reported.]
>
> On 2018-Oct-27, at 3:03 PM, Mark Millard <marklmi at yahoo.com> wrote:
>
>> Mikaël Urankar is being quoted below:
>>
>>> . . .
>>>
>>>> There are bugs in qemu that can cause such deadlock, you can try these
>>>> 2 patches:
>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/9424a5ffde4de2768ab6baa45fdbe0dbb56a7371
>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/d6f65a7f07d280b6906d499d8e465d4d2026c52b
>>
>> Back to me:
>>
>>> I'll try those later. Thanks. (I need to get back to sleep.)
>>>
>>> It was interesting that attach/detach to the ld process
>>> caused it to progress. The rest of the build completed
>>> just fine. But that one spot consistently hung up before
>>> trying gdb to look at the back trace.
>>>
>>
>> Looking at the qemu code related to the 2nd patch: the
>> structure of the field copies (via __get_user) seems
>> very sensitive to the ABI rules for the target and
>> how things align and such, given that the structure
>> description and code are host code. __packed vs. not
>> is possibly not sufficient control to always make things
>> match right across all the potential combinations of
>> host and target from what I can see.
>>
>> Lack of __packed may prove sufficient for my specific
>> context (amd64 host and armv7 target) but it seems
>> non-obvious what to do in general.
>>
>> There would also seem to be big endian vs. little endian
>> issues on the individual __get_user styles of copies
>> when the host and target do not match for a multi-byte
>> numeric encoding.
>
> Well, I get the following for:
>
> #include "/usr/include/sys/event.h" // kevent
> #include <stddef.h> // offsetof
> #include <stdio.h> // printf
>
> int
> main()
> {
>     printf("%lu\n", (unsigned long) sizeof(struct kevent));
>     printf("ident %lu\n", (unsigned long) offsetof(struct kevent, ident));
>     printf("filter %lu\n", (unsigned long) offsetof(struct kevent, filter));
>     printf("flags %lu\n", (unsigned long) offsetof(struct kevent, flags));
>     printf("fflags %lu\n", (unsigned long) offsetof(struct kevent, fflags));
>     printf("data %lu\n", (unsigned long) offsetof(struct kevent, data));
>     printf("udata %lu\n", (unsigned long) offsetof(struct kevent, udata));
>     printf("ext %lu\n", (unsigned long) offsetof(struct kevent, ext));
>     return 0;
> }
>
> (This code avoided warnings for type mismatches with the
> printf strings and such.)
>
> amd64 native [host of qemu use] (comments hand added):
>
> # ./a.out
> 64
> ident 0
> filter 8 // NOTE!
> flags 10 // NOTE!
> fflags 12 // NOTE!
> data 16
> udata 24
> ext 32
>
> (The above is not particularly important but I
> include it for completeness.)
>
> armv7 native [target in qemu use] (comments hand added):
>
> # ./a.out
> 64 // NOTE vs. below!
> ident 0
> filter 4 // NOTE vs. above!
> flags 6 // NOTE vs. above!
> fflags 8 // NOTE vs. above!
> data 16 // NOTE vs. below!
> udata 24 // NOTE vs. below!
> ext 32 // NOTE vs. below!
>
> /usr/include/sys/event.h lacks __packed in both cases.
>
> With __packed in qemu-arm-static's source code
> for target_freebsd_kevent I confirm that via
> gdb for the qemu-arm-static:
>
> p/d sizeof(struct target_freebsd_kevent)
> p/d &((struct target_freebsd_kevent *)0)->ident
> p/d &((struct target_freebsd_kevent *)0)->filter
> p/d &((struct target_freebsd_kevent *)0)->flags
> p/d &((struct target_freebsd_kevent *)0)->fflags
> p/d &((struct target_freebsd_kevent *)0)->data
> p/d &((struct target_freebsd_kevent *)0)->udata
> p/d &((struct target_freebsd_kevent *)0)->ext
>
> reports as the 2nd patch's problem-report
> material reports (56,0,4,6,8,12,20,24): not
> even the right size.
>
> I also confirm that removing __packed in qemu's
> code and rebuilding and then checking with gdb
> reported a match to the above armv7 native report
> (64,0,4,6,8,16,24,32).
>
> I have not verified __packed used vs. not for any
> other combination of host and target platforms.

Removing the 2 examples of __packed, including the 1 for
target_freebsd_kevent, as in Mikaël Urankar's 2nd listed patch,
was sufficient to avoid the hang that I originally reported.
(Technically FreeBSD 11 is not involved and so one of the
__packed removals is not relevant to my example.)

I have not applied Mikaël Urankar's first listed patch at all.
It did not prove necessary for my context.

Again: the only tested context is amd64 -> armv7 (host -> target)
under a head -r339076 based build. (So still 12.)

I'm doing a larger amd64 -> armv7 rebuild (around 210 ports overall)
that originally included the problematical hang and a full-bootstrap
build of lang/gcc8 (so extensive emulation use after the clang-based
stages). Prior to the patch, all smaller attempts also hung at the
same place for print/texinfo. But I'll only report if this larger
test has a problem.
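
For illustration only: a minimal sketch of one way host code might
pin a target layout down independently of the host's ABI rules, by
using fixed-width types, explicit alignment, and compile-time checks
against the armv7 numbers reported above. This is not qemu's code.
The struct names, the CHECK_OFFSET macro and the print function are
hypothetical, and it assumes a C11 compiler with the GCC/Clang
packed attribute (which is what __packed expands to on FreeBSD).

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* fixed-width integer types */
#include <stdio.h>    /* printf */

/*
 * Hypothetical host-side mirror of armv7's struct kevent.
 * Pointer-sized fields are written as 32-bit integers, and the
 * 64-bit fields are given 8-byte alignment explicitly, so the
 * layout does not depend on how the host aligns 64-bit integers.
 */
struct target_kevent_sketch {
    uint32_t ident;                 /* uintptr_t is 32 bits on armv7 */
    int16_t  filter;
    uint16_t flags;
    uint32_t fflags;
    _Alignas(8) int64_t  data;      /* padded from offset 12 to 16 */
    uint32_t udata;                 /* void * is 32 bits on armv7 */
    _Alignas(8) uint64_t ext[4];    /* padded from offset 28 to 32 */
};

/* The same fields with packing, for comparison with the gdb report. */
struct target_kevent_packed_sketch {
    uint32_t ident;
    int16_t  filter;
    uint16_t flags;
    uint32_t fflags;
    int64_t  data;
    uint32_t udata;
    uint64_t ext[4];
} __attribute__((__packed__));

/* Fail the build if a field is not where the target expects it. */
#define CHECK_OFFSET(type, field, expected) \
    _Static_assert(offsetof(struct type, field) == (expected), \
        #type "." #field " is at an unexpected offset")

/* Offsets and size from the armv7 native run: 64,0,4,6,8,16,24,32. */
_Static_assert(sizeof(struct target_kevent_sketch) == 64,
    "target_kevent_sketch has an unexpected size");
CHECK_OFFSET(target_kevent_sketch, ident, 0);
CHECK_OFFSET(target_kevent_sketch, filter, 4);
CHECK_OFFSET(target_kevent_sketch, flags, 6);
CHECK_OFFSET(target_kevent_sketch, fflags, 8);
CHECK_OFFSET(target_kevent_sketch, data, 16);
CHECK_OFFSET(target_kevent_sketch, udata, 24);
CHECK_OFFSET(target_kevent_sketch, ext, 32);

/* Packed layout as reported via gdb: 56,0,4,6,8,12,20,24. */
_Static_assert(sizeof(struct target_kevent_packed_sketch) == 56,
    "target_kevent_packed_sketch has an unexpected size");
CHECK_OFFSET(target_kevent_packed_sketch, data, 12);
CHECK_OFFSET(target_kevent_packed_sketch, udata, 20);
CHECK_OFFSET(target_kevent_packed_sketch, ext, 24);

/* Print the unpacked mirror's layout in the format used above. */
void
print_target_kevent_sketch_layout(void)
{
    printf("%zu\n", sizeof(struct target_kevent_sketch));
    printf("ident %zu\n", offsetof(struct target_kevent_sketch, ident));
    printf("filter %zu\n", offsetof(struct target_kevent_sketch, filter));
    printf("flags %zu\n", offsetof(struct target_kevent_sketch, flags));
    printf("fflags %zu\n", offsetof(struct target_kevent_sketch, fflags));
    printf("data %zu\n", offsetof(struct target_kevent_sketch, data));
    printf("udata %zu\n", offsetof(struct target_kevent_sketch, udata));
    printf("ext %zu\n", offsetof(struct target_kevent_sketch, ext));
}
```

The explicit _Alignas(8) is the part aimed at the "other
combinations" concern: without it, a host that aligns 64-bit
integers to 4 bytes would place data at 12 rather than 16. The
sketch covers only layout. It does nothing about the big endian
vs. little endian question for the individual field copies.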

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

_______________________________________________
freebsd-toolchain@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain