On Thu, 22 Oct 2020 10:58:32 -0700 Andy Lutomirski <l...@kernel.org> wrote:
> On Thu, Oct 22, 2020 at 6:21 AM Masami Hiramatsu <mhira...@kernel.org> wrote:
> >
> > On Thu, 22 Oct 2020 11:30:44 +0200
> > Borislav Petkov <b...@alien8.de> wrote:
> >
> > > On Thu, Oct 22, 2020 at 04:31:00PM +0900, Masami Hiramatsu wrote:
> > > > No, insn_get_length() implies it decodes whole of the instruction.
> > > > (yeah, we need an alias of that, something like insn_get_complete())
> > >
> > > That's exactly what I'm trying to point out: the whole API is not
> > > entirely wrong - it just needs a better naming and documentation. Now,
> > > the implication that getting the length of the insn will give you a full
> > > decode is a totally internal detail which users don't need and have to
> > > know.
> >
> > Ok, what names would you like to suggest? insn_get_complete()?
> >
> > > > I need insn.length too. Of course we can split it into 2 calls. But
> > > > as I said, since the insn_get_length() implies it decodes all other
> > > > parts, I just called it once.
> > >
> > > Yes, I have noticed that and wrote about it further on. The intent was
> > > to show that the API needs work.
> > >
> > > > Hm, it is better to call insn_get_immediate() if it doesn't use length
> > > > later.
> > >
> > > Ok, so you see the problem. This thing wants to decode the whole insn -
> > > that's what the function is called. But it reads like it does something
> > > else.
> > >
> > > > Would you mean we'd better have something like
> > > > insn_get_until_immediate() ?
> > > >
> > > > Since the x86 instruction is CISC, we can not decode intermediate
> > > > parts. The APIs follows that. If you are confused, I'm sorry about that.
> > >
> > > No, I'm not confused - again, I'd like for the API to be properly
> > > defined and callers should not have to care which parts of the insn they
> > > need to decode in order to get something else they actually need.
> >
> > Sorry, I can not get what you point. We already have those APIs,
> >
> > extern void insn_init(struct insn *insn, const void *kaddr, int buf_len,
> >                       int x86_64);
> > extern void insn_get_prefixes(struct insn *insn);
> > extern void insn_get_opcode(struct insn *insn);
> > extern void insn_get_modrm(struct insn *insn);
> > extern void insn_get_sib(struct insn *insn);
> > extern void insn_get_displacement(struct insn *insn);
> > extern void insn_get_immediate(struct insn *insn);
> > extern void insn_get_length(struct insn *insn);
> >
> > As I agreed, that we may need an alias of insn_get_length(). But it seems
> > clear to me, if you need insn.immediate, you must call insn_get_immediate().
>
> I'm guessing that the confusion here is that the kernel instruction
> decoder was originally designed to be used to decode kernel
> instructions, which are generally trusted to be valid, but that it's
> starting to be used to decode user code and such as well.

Hmm, right...

> Masami, could we perhaps have an extra API like:
>
> extern int insn_decode_fully(struct insn *insn);
>
> that decodes the *entire* instruction, returns success if the decoder
> thinks the instruction is valid, and returns an error if the decoder
> thinks it's invalid?  We would use this when decoding arbitrary bytes
> when we're not really sure that there's a valid instruction there.
> For user code emulation, we don't really care much about performance
> -- the overhead of getting #GP in the first place is much, much higher
> than the overhead of decoding more of the instruction than needed.

OK, would you think we also better to integrate it with insn_init()?

> Ideally we would solve another little problem at the same time.  Right
> now, we are quite sloppy about how we fetch the instruction bytes, and
> it might be nice to fix this.  It would be nice if we could have a
> special error code saying "more bytes are needed".  So
> insn_decode_fully() would return 0 (or the length) on a successful
> decode, -EINVAL if the bytes are not a valid instruction, and -EAGAIN
> (or something more appropriate)

Maybe -ERANGE?

> if the bytes are a valid *prefix* of
> an instruction but more bytes are needed.  Then the caller would do:
>
> len = min(15, remaining bytes in page);
> fetch len bytes;
> insn_init();
> ret = insn_decode_fully();
> if (ret == -EAGAIN) {
>   fetch remaining 15 - len bytes;
>   insn_init();
>   ret = insn_decode_fully();
> }
>
> It's a bit impolite to potentially cause page faults on the page after
> a short instruction, but it's also not so good to fail to decode a
> long instruction that happens to cross a page boundary.

OK. Borislav, would you handle it? I think you already started.

Thank you,

--
Masami Hiramatsu <mhira...@kernel.org>
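
For illustration only, the fetch-then-retry pattern quoted above could be
sketched in plain userspace C roughly as follows. This is not the kernel
decoder and not a proposed implementation: toy_decode_fully() and
toy_fetch_and_decode() are hypothetical names, the "instruction" encoding
is a toy one (the first byte simply holds the total length), and a flat
byte array with an assumed 4096-byte page size stands in for user memory.
The only things taken from the thread are the return convention (length,
-EINVAL, -EAGAIN) and the 15-byte limit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_INSN_SIZE 15    /* the 15-byte limit used in the thread */
#define TOY_PAGE_SIZE     4096  /* assumed page size for this sketch */

/*
 * Hypothetical stand-in for the proposed insn_decode_fully().
 * Toy encoding, NOT x86: byte 0 is the total instruction length (1..15).
 * Returns the length on success, -EINVAL if byte 0 is not a valid length,
 * and -EAGAIN if the buffer is a valid prefix but too short.
 */
static int toy_decode_fully(const unsigned char *buf, size_t buf_len)
{
	if (buf_len == 0)
		return -EAGAIN;
	if (buf[0] == 0 || buf[0] > TOY_MAX_INSN_SIZE)
		return -EINVAL;
	if (buf_len < buf[0])
		return -EAGAIN;
	return buf[0];
}

/*
 * First fetch only up to the end of the current page; touch the next
 * page only if the decoder reports that more bytes are needed.
 * *fetched reports how many bytes were read in total.
 */
static int toy_fetch_and_decode(const unsigned char *mem, size_t mem_size,
				size_t ip, size_t *fetched)
{
	unsigned char buf[TOY_MAX_INSN_SIZE];
	size_t avail, in_page, len, want;
	int ret;

	assert(ip < mem_size);

	avail = mem_size - ip;
	in_page = TOY_PAGE_SIZE - (ip % TOY_PAGE_SIZE);

	/* len = min(15, remaining bytes in page), clamped to what exists */
	len = in_page < TOY_MAX_INSN_SIZE ? in_page : TOY_MAX_INSN_SIZE;
	if (len > avail)
		len = avail;

	memcpy(buf, mem + ip, len);
	*fetched = len;
	ret = toy_decode_fully(buf, len);

	/* Up to 15 bytes in total, or fewer if memory ends first */
	want = avail < TOY_MAX_INSN_SIZE ? avail : TOY_MAX_INSN_SIZE;
	if (ret == -EAGAIN && len < want) {
		/* fetch the remaining bytes from the following page */
		memcpy(buf + len, mem + ip + len, want - len);
		*fetched = want;
		ret = toy_decode_fully(buf, want);
	}
	return ret;
}
```

With this shape, a short instruction that ends before the page boundary
never causes a read from the following page, while a long instruction that
straddles the boundary still decodes after the second fetch. In real kernel
code the memcpy() calls would be user-memory fetches that can themselves
fail, which this sketch does not model.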