On Fri, Sep 22, 2017 at 07:27:29AM -0700, Y Song wrote: > On Fri, Sep 22, 2017 at 7:11 AM, Y Song <ys114...@gmail.com> wrote: > > On Fri, Sep 22, 2017 at 6:46 AM, Edward Cree <ec...@solarflare.com> wrote: > >> On 22/09/17 00:11, Y Song wrote: > >>> On Thu, Sep 21, 2017 at 12:58 PM, Edward Cree <ec...@solarflare.com> > >>> wrote: > >>>> On 21/09/17 20:44, Alexei Starovoitov wrote: > >>>>> On Thu, Sep 21, 2017 at 09:29:33PM +0200, Daniel Borkmann wrote: > >>>>>> More intuitive, but agree on the from_be/le. Maybe we should > >>>>>> just drop the "to_" prefix altogether, and leave the rest as is since > >>>>>> it's not surrounded by braces, it's also not a cast but rather an op. > >>>> That works for me. > >>>>> 'be16 r4' is ambiguous regarding upper bits. > >>>>> > >>>>> what about my earlier suggestion: > >>>>> r4 = (be16) (u16) r4 > >>>>> r4 = (le64) (u64) r4 > >>>>> > >>>>> It will be pretty clear what instruction is doing (that upper bits > >>>>> become zero). > >>>> Trouble with that is that's very *not* what C will do with those casts > >>>> and it doesn't really capture the bidirectional/symmetry thing. The > >>>> closest I could see with that is something like `r4 = (be16/u16) r4`, > >>>> but that's quite an ugly mongrel. > >>>> I think Daniel's idea of `be16`, `le32` etc one-arg opcodes is the > >>>> cleanest and clearest. Should it be > >>>> r4 = be16 r4 > >>>> or just > >>>> be16 r4 > >>>> ? Personally I incline towards the latter, but admit it doesn't really > >>>> match the syntax of other opcodes. > >>> I did some quick prototyping in llvm to make sure we have a syntax > >>> llvm is happy. Apparently, llvm does not like the syntax > >>> r4 = be16 r4 or r4 = (be16) (u16) r4. > >>> > >>> In llvm:utils/TableGen/AsmMatcherEmitter.cpp: > >>> > >>> // Verify that any operand is only mentioned once. > >> Wait, how do you deal with (totally legal) r4 += r4? > >> Or r4 = *(r4 +0)? > >> Even jumps can have src_reg == dst_reg, though it doesn't seem useful. > > > > We are talking about dag node here. The above "r4", although using the same > > register, will be different dag nodes. So it will be okay. > > > > The "r4 = be16 r4" tries to use the *same* dag node as both source and > > destination > > in the asm output which is prohibited. > > With second thought, we may allow "r4 = be16 r4" by using different dag nodes. > (I need to do experiment for this.) But we do have constraints that > the two "r4" must > be the same register. "r5 = be16 r4" is not allowed. So from that > perspective, referencing > "r4" only once is a good idea and less confusing.
looks like we're converging on "be16/be32/be64/le16/le32/le64 #register" for BPF_END. I guess it can live with that. I would prefer more C like syntax to match the rest, but llvm parsing point is a strong one. For BPG_NEG I prefer to do it in C syntax like interpreter does: ALU_NEG: DST = (u32) -DST; ALU64_NEG: DST = -DST; Yonghong, does it mean that asmparser will equally suffer?