On Thu, Jun 27, 2019 at 5:26 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, Jun 27, 2019 at 2:23 PM Jan Beulich <jbeul...@suse.com> wrote:
> >
> > >>> On 27.06.19 at 14:00, <ubiz...@gmail.com> wrote:
> > > On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich <jbeul...@suse.com> wrote:
> > >>
> > >> >>> On 27.06.19 at 13:09, <ubiz...@gmail.com> wrote:
> > >> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <jbeul...@suse.com> wrote:
> > >> >>
> > >> >> Without these constraints asm() can't make use of mask registers.
> > >> >
> > >> > asm should be deprecated. We have intrinsics for this purpose.
> > >>
> > >> While maybe not explicitly applicable here, the intrinsics aren't
> > >> (afaict) providing full flexibility. In particular (just as example)
> > >> I haven't found a way to use embedded broadcast with the
> > >> intrinsics, but I can easily do so with asm().
> > >>
> > >> Furthermore there are other reasons to use asm() - things like
> > >> the Linux kernel are full of it for a reason. And once one has
> > >> to use asm(), the resulting code typically is easier to follow if
> > >> one doesn't further intermix it with uses of builtins.
> > >>
> > >> And finally, if asm() was indeed meant to be deprecated, how
> > >> come it pretty recently got extended to allow for "inline"?
> > >
> > > I didn't mean that asm() in general should be deprecated, but for SSE
> > > and other vector extensions, where intrinsics are available,
> > > intrinsics should be used instead. There was exactly zero requests to
> > > use new asm constraints, it looks that people are satisfied with
> > > intrinsics approach (which is also future-proof, etc).
> >
> > So what about my embedded broadcast example then? "Zero
> > requests" is clearly not exactly right. It simply didn't occur to me
> > (until I noticed the @internal here) that I should raise such a
> > request, rather than just using asm(). Subsequently I did then
> > notice "Yh" going away, complicating things further ...
>
> There was some work by HJ a while ago that implemented automatic
> generation of embedded broadcasts. Perhaps he can answer the question.

I implemented broadcast for some commonly used instructions.
Adding broadcast to all will take time, especially when there is no
user demand.

> Uros.
>
> > I'd also like to note that the choice of types on some of the builtins
> > makes it rather cumbersome to use them. Especially for scalar
> > operations I've found myself better resorting to asm(). See
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c
> > (most of the changes submitted (not so) recently have been
> > coming from the work of putting together this and its sibling
> > tests for the Xen Project instruction emulator).
> >
> > Jan
> >
> >

--
H.J.