Le 26/09/2023 à 00:50, Song Liu a écrit : > On Fri, Sep 8, 2023 at 6:28 AM Hari Bathini <hbath...@linux.ibm.com> wrote: >> >> patch_instruction() entails setting up pte, patching the instruction, >> clearing the pte and flushing the tlb. If multiple instructions need >> to be patched, every instruction would have to go through the above >> drill unnecessarily. Instead, introduce function patch_instructions() >> that sets up the pte, clears the pte and flushes the tlb only once per >> page range of instructions to be patched. This adds a slight overhead >> to patch_instruction() call while improving the patching time for >> scenarios where more than one instruction needs to be patched. >> >> Signed-off-by: Hari Bathini <hbath...@linux.ibm.com> > > I didn't see this one when I reviewed 1/5. Please ignore that comment.
If I remember correctry, patch 1 introduces a huge performance degradation, which gets then improved with this patch. As I said before, I'd expect patch 4 to go first then get bpf_arch_text_copy() be implemented with patch_instructions() directly. Christophe > > [...] > >> @@ -307,11 +312,22 @@ static int __do_patch_instruction_mm(u32 *addr, >> ppc_inst_t instr) >> >> orig_mm = start_using_temp_mm(patching_mm); >> >> - err = __patch_instruction(addr, instr, patch_addr); >> + while (len > 0) { >> + instr = ppc_inst_read(code); >> + ilen = ppc_inst_len(instr); >> + err = __patch_instruction(addr, instr, patch_addr); > > It appears we are still repeating a lot of work here. For example, with > fill_insn == true, we don't need to repeat ppc_inst_read(). > > Can we do this with a memcpy or memset like functions? > >> + /* hwsync performed by __patch_instruction (sync) if >> successful */ >> + if (err) { >> + mb(); /* sync */ >> + break; >> + } >> >> - /* hwsync performed by __patch_instruction (sync) if successful */ >> - if (err) >> - mb(); /* sync */ >> + len -= ilen; >> + patch_addr = patch_addr + ilen; >> + addr = (void *)addr + ilen; >> + if (!fill_insn) >> + code = code + ilen; > > It took me a while to figure out what "fill_insn" means. Maybe call it > "repeat_input" or something? > > Thanks, > Song > >> + } >> >> /* context synchronisation performed by __patch_instruction (isync >> or exception) */ >> stop_using_temp_mm(patching_mm, orig_mm); >> @@ -328,16 +344,21 @@ static int __do_patch_instruction_mm(u32 *addr, >> ppc_inst_t instr) >> return err; >> } >>