On Wed, Oct 21, 2020 at 11:26:13PM +0900, Masami Hiramatsu wrote: > Hmm, I meant someone might think it can be used for filtering the > instruction something like, > > insn_init(insn, buf, buflen, 1); > ret = insn_get_length(insn); > if (!ret) { > /* OK, this is safe */ > patch_text(buf, trampoline); > } > > No, we need another validator for such usage.
Well, I think calling insn_get_length() should give you only the *length* of the insn and nothing else - I mean that is what the function is called. And I believe current use is wrong. Examples: arch/x86/kernel/kprobes/core.c: insn_get_length(&insn); /* * Another debugging subsystem might insert this breakpoint. * In that case, we can't recover it. */ if (insn.opcode.bytes[0] == INT3_INSN_OPCODE) So this has called get_length but it is far from clear that after that call, the opcode bytes in insn.opcode.bytes are there. What that should do instead IMO is this: insn_get_opcode(&insn); and *then* the return value can tell you whether the opcode bytes were parsed properly or not. See what I mean? That's even documented that way: /** * insn_get_opcode - collect opcode(s) * @insn: &struct insn containing instruction * * Populates @insn->opcode, updates @insn->next_byte to point past the * opcode byte(s), and set @insn->attr (except for groups). Similarly here: static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt) ... insn_get_length(&ctxt->insn); ret = ctxt->insn.immediate.got ? ES_OK : ES_DECODE_FAILED; that thing wants to decode the insn but it is looking whether it parsed an *immediate*?! I'm not saying this is necessarily wrong - just the naming nomenclature and the API should be properly defined when you call a function of the insn decoder, what you are guaranteed to get and what a caller can assume after that. And then the proper functions be called. In the kprobes/core.c example above, it does a little further: ddr += insn.length; which, IMO, it should be either preceeded by a call to insn_get_length() - yes, this time we want the insn length or, the code should call a decoding function which gives you *both* length* and opcode bytes. insn_decode_insn() or whatever. And *that* should be documented in that function's kernel-doc section. And so on... Does that make more sense? Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette