On Fri, 23 Oct 2020 17:12:49 -0700 Andy Lutomirski <l...@kernel.org> wrote:
> On Fri, Oct 23, 2020 at 4:27 PM Borislav Petkov <b...@alien8.de> wrote: > > > > On Fri, Oct 23, 2020 at 07:47:04PM +0900, Masami Hiramatsu wrote: > > > Thanks! I look forward to it. > > > > Ok, here's a first stab, it is a single big diff and totally untested > > but it should show what I mean. I've made some notes while converting, > > as I went along. > > > > Have a look at insn_decode() and its call sites: they are almost trivial > > now because caller needs simply to do: > > > > if (insn_decode(insn, buffer, ...)) > > > > and not care about any helper functions. > > > > For some of the call sites it still makes sense to do a piecemeal insn > > decoding and I've left them this way but they can be converted too, if > > one wants. > > > > In any case, just have a look please and lemme know if that looks OKish. > > I'll do the actual splitting and testing afterwards. > > > > And what Andy wants can't be done with the decoder because it already > > gets a fixed size buffer and length - it doesn't do the fetching. The > > caller does. > > > > What you wanna do: > > > > > len = min(15, remaining bytes in page); > > > fetch len bytes; > > > insn_init(); > > > ret = insn_decode_fully(); > > > > <--- you can't always know here whether the insn is valid if you don't > > have all the bytes. But you can always fetch *all* bytes and then give > > it to the decoder for checking. > > > > Also, this doesn't make any sense: try insn decode on a subset of bytes > > and then if it fails, try it on the whole set of bytes. Why even try the > > subset - it will almost always fail. > > I disagree. A real CPU does exactly what I'm describing. If I stick > 0xcc at the end of a page and a make the next page not-present, I get > #BP, not #PF. But if I stick 0x0F at the end of a page and mark the > next page not-present, I get #PF. If we're trying to decode an > instruction in user memory, we can kludge it by trying to fetch 15 > bytes and handling -EFAULT by fetching fewer bytes, but that's gross > and doesn't really have the right semantics. What we actually want is > to fetch up to the page boundary and try to decode it. If it's a > valid instruction or if it's definitely invalid, we're done. > Otherwise we fetch across the page boundary. > > Eventually we should wrap this whole mess up in an insn_decode_user() > helper that does the right thing. And we can then make that helper > extra fancy by getting PKRU and EPT-hacker-execute-only right, whereas > we currently get these cases wrong. +1. To handle the user-space (untrusted) instruction, we need to take more care about page boundary and presense. Also less side-effect is perferrable. Thank you, -- Masami Hiramatsu <mhira...@kernel.org>