On Mon, 2022-09-26 at 14:33 +0000, Christophe Leroy wrote: > > +#define patch_memory(addr, val) \ > > +({ \ > > + BUILD_BUG_ON(!__native_word(val)); \ > > + __patch_memory(addr, (unsigned long) val, sizeof(val)); \ > > +}) > > Can you do a static __always_inline function instead of a macro here > ?
I didn't before because it doesn't allow using the type as a parameter. I considered these forms patch_memory(addr, val, 8); patch_memory(addr, val, void*); patch_memory(addr, val); // size taken from val type And thought the third was the nicest to use. Though coming back to this, I hadn't considered patch_memory(addr, val, sizeof(void*)) which would still allow a type to decide the size, and not be a macro. I've got an example implementation further down that also addresses the size check issue. > > +static int __always_inline ___patch_memory(void *patch_addr, > > + unsigned long data, > > + void *prog_addr, > > + size_t size) > > Is it really needed in the .c file ? I would expect GCC to take the > right decision by itself. I thought it'd be better to always inline it given it's only used generically in do_patch_memory and __do_patch_memory, which both get inlined into __patch_memory. But it does end up generating two copies due to the different contexts it's called in, so probably not worth it. Removed for v3. (raw_patch_instruction gets an optimised inline of ___patch_memory either way) > A BUILD_BUG() would be better here I think. BUILD_BUG() as the default case always triggers though, I assume because the constant used for size is too far away. How about static __always_inline int patch_memory(void *addr, unsigned long val, size_t size) { int __patch_memory(void *dest, unsigned long src, size_t size); BUILD_BUG_ON_MSG(!(size == sizeof(char) || size == sizeof(short) || size == sizeof(int) || size == sizeof(long)), "Unsupported size for patch_memory"); return __patch_memory(addr, val, size); } Declaring the __patch_memory function inside of patch_memory enforces that you can't accidentally call __patch_memory without going through this or the *patch_instruction entry points (which hardcode the size). > > + } > > > > - __put_kernel_nofault(patch_addr, &val, u32, > > failed); > > - } else { > > - u64 val = ppc_inst_as_ulong(instr); > > + dcbst(patch_addr); > > + dcbst(patch_addr + size - 1); /* Last byte of data may > > cross a cacheline */ > > Or the second byte of data may cross a cacheline ... It might, but unless we are assuming data cachelines smaller than the native word size it will either be in the first or last byte's cacheline. Whereas the last byte might be in it's own cacheline. As justification the comment's misleading though, how about reducing it to "data may cross a cacheline" and leaving the reason for flushing the last byte implicit? > > -static int __do_patch_instruction(u32 *addr, ppc_inst_t instr) > > +static int __always_inline __do_patch_memory(void *dest, unsigned > > long src, size_t size) > > { > > Whaou, do we really want all this to be __always_inline ? Did you > check > the text size increase ? These ones are redundant because GCC will already inline them, they were just part of experimenting inlining ___patch_memory. Will remove for v3. The text size doesn't increase though because the call hierarchy is just a linear chain of __patch_memory -> do_patch_memory -> __do_patch_memory The entry point __patch_memory is not inlined.