Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

Nadav Amit Fri, 22 Feb 2019 14:08:43 -0800

> On Feb 22, 2019, at 1:43 PM, Jann Horn <[email protected]> wrote:
> 
> (adding some people from the text_poke series to the thread, removing stable@)
> 
> On Fri, Feb 22, 2019 at 8:55 PM Andy Lutomirski <[email protected]> wrote:
>>> On Feb 22, 2019, at 11:34 AM, Alexei Starovoitov 
>>> <[email protected]> wrote:
>>>> On Fri, Feb 22, 2019 at 02:30:26PM -0500, Steven Rostedt wrote:
>>>> On Fri, 22 Feb 2019 11:27:05 -0800
>>>> Alexei Starovoitov <[email protected]> wrote:
>>>> 
>>>>>> On Fri, Feb 22, 2019 at 09:43:14AM -0800, Linus Torvalds wrote:
>>>>>> 
>>>>>> Then we should still probably fix up "__probe_kernel_read()" to not
>>>>>> allow user accesses. The easiest way to do that is actually likely to
>>>>>> use the "unsafe_get_user()" functions *without* doing a
>>>>>> uaccess_begin(), which will mean that modern CPU's will simply fault
>>>>>> on a kernel access to user space.
>>>>> 
>>>>> On bpf side the bpf_probe_read() helper just calls probe_kernel_read()
>>>>> and users pass both user and kernel addresses into it and expect
>>>>> that the helper will actually try to read from that address.
>>>>> 
>>>>> If __probe_kernel_read will suddenly start failing on all user addresses
>>>>> it will break the expectations.
>>>>> How do we solve it in bpf_probe_read?
>>>>> Call probe_kernel_read and if that fails call unsafe_get_user byte-by-byte
>>>>> in the loop?
>>>>> That's doable, but people already complain that bpf_probe_read() is slow
>>>>> and shows up in their perf report.
>>>> 
>>>> We're changing kprobes to add a specific flag to say that we want to
>>>> differentiate between kernel or user reads. Can this be done with
>>>> bpf_probe_read()? If it's showing up in perf report, I doubt a single
>>> 
>>> so you're saying you will break existing kprobe scripts?
>>> I don't think it's a good idea.
>>> It's not acceptable to break bpf_probe_read uapi.
>> 
>> If so, the uapi is wrong: a long-sized number does not reliably identify an 
>> address if you don’t separately know whether it’s a user or kernel address. 
>> s390x and 4G:4G x86_32 are the notable exceptions. I have lobbied for RISC-V 
>> and future x86_64 to join the crowd.  I don’t know whether I’ll win this 
>> fight, but the uapi will probably have to change for at least s390x.
>> 
>> What to do about existing scripts is a different question.
> 
> This lack of logical separation between user and kernel addresses
> might interact interestingly with the text_poke series, specifically
> "[PATCH v3 05/20] x86/alternative: Initialize temporary mm for
> patching" 
> (https://lore.kernel.org/lkml/20190221234451.17632-6-rick.p.edgecombe@intel.com/)
> and "[PATCH v3 06/20] x86/alternative: Use temporary mm for text
> poking" 
> (https://lore.kernel.org/lkml/20190221234451.17632-7-rick.p.edgecombe@intel.com/),
> right? If someone manages to get a tracing BPF program to trigger in a
> task that has switched to the patching mm, could they use
> bpf_probe_write_user() - which uses probe_kernel_write() after
> checking that KERNEL_DS isn't active and that access_ok() passes - to
> overwrite kernel text that is mapped writable in the patching mm?
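
To make the concern above concrete, here is a rough userspace model of the
two checks described for bpf_probe_write_user(). It is only a sketch: every
name in it (model_access_ok, model_write_allowed, model_hits_window,
MODEL_USER_LIMIT) is made up for illustration and is not kernel API, and it
assumes the writable alias of kernel text sits at a user-range address in
the patching mm. The point is that a range-only check cannot tell what is
actually mapped at the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user/kernel split, for illustration only. */
#define MODEL_USER_LIMIT ((uint64_t)1 << 47)

/*
 * Range-only check in the spirit of access_ok(): it looks at the numeric
 * range and knows nothing about what the current mm maps there.
 */
static bool model_access_ok(uint64_t addr, uint64_t size)
{
	return size <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - size;
}

/*
 * The two gates from the description: refuse when KERNEL_DS is active,
 * otherwise allow whatever passes the range check.
 */
static bool model_write_allowed(bool kernel_ds_active, uint64_t addr,
				uint64_t size)
{
	if (kernel_ds_active)
		return false;
	return model_access_ok(addr, size);
}

/*
 * True if [addr, addr + size) overlaps a (hypothetical) writable alias
 * window [win_start, win_start + win_len) in the currently loaded mm.
 */
static bool model_hits_window(uint64_t win_start, uint64_t win_len,
			      uint64_t addr, uint64_t size)
{
	assert(win_len > 0 && size > 0);
	return addr < win_start + win_len && win_start < addr + size;
}
```

In this model a write is both "allowed" and "hits the window" whenever the
alias lies below MODEL_USER_LIMIT, which is the situation the question asks
about.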


Yes, this is a good point. I guess text_poke() should be marked
“__kprobes” and should open-code memcpy, so that nothing probe-able runs
while the patching mm is active.

Does it sound reasonable?
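
As a rough illustration of what "open-coding memcpy" could look like, here
is a minimal userspace sketch. The helper name model_poke_copy is
hypothetical and this is not the actual patch. In the kernel the function
would additionally carry an annotation such as __kprobes, which is left out
here so that the sketch builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy len bytes one at a time without calling
 * memcpy(), so no separate (probe-able) function is entered during the
 * copy. The volatile destination keeps the compiler from recognizing the
 * loop and turning it back into a memcpy() call.
 */
static inline void model_poke_copy(void *dst, const void *src, size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	assert(len == 0 || (dst != NULL && src != NULL));

	while (len--)
		*d++ = *s++;
}
```

A bytewise loop is slower than an optimized memcpy(), but the amount of
text patched at a time is small, so that trade-off is probably acceptable
here.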
