DavidSpickett added a comment.

First thing to note is that WriteRegister also behaves this way, but there it 
is more appropriate because it updates only part of the buffer before writing 
it out in its entirety. Useful to know where the pattern came from though.

You would need roughly the following per `WriteXYZ`:

  -      error = WriteTLS();
  +      error = WriteTLS(src);
  
  -Status NativeRegisterContextLinux_arm64::WriteTLS() {
  +Status NativeRegisterContextLinux_arm64::WriteTLS(const void* 
src/*=nullptr*/) {
  
  -  ioVec.iov_base = GetTLSBuffer();
  +  ioVec.iov_base = src ? const_cast<void*>(src) : GetTLSBuffer();
  
  -  Status WriteTLS();
  +  Status WriteTLS(const void* src=nullptr);

We can assume that the buffer is the same as the data to be written back if 
it's something static like TLS. For SVE/SME, we would have resized our buffer 
first, so it holds there too.

The added complexity isn't that much but I think it adds more thinking time for 
a developer than it potentially saves in copying time. I already feel like the 
separate `xyz_is_valid` is enough to think about and having a potential second 
data source just adds to that load.

From the header docs it seems I was right that this is used primarily for 
expression evaluation:

  // These two functions are used to implement "push" and "pop" of register
  // states.  They are used primarily for expression evaluation, where we need
  // to push a new state (storing the old one in data_sp) and then restoring
  // the original state by passing the data_sp we got from ReadAllRegisters to
  // WriteAllRegisterValues.

Which you would be doing a lot of in a formatter for example, but you'd get 
better savings implementing a more efficient packet format to do all that at 
once, I guess.

QSaveRegisterState / QRestoreRegisterState packets call it as part of 
expression evaluation, though in theory it's not always that. That's an lldb 
extension anyway so we're in control of it at least. In theory this could be 
used to restore state that is not just the previous state but I don't know how 
you'd trigger that from lldb.

The other use is `NativeProcessLinux::Syscall` which is sufficiently rare we 
can ignore that.

I did do a very rough benchmark where I printed the same expression 2000 times, 
so each one is doing a save/restore. Once with the code in this review right 
now, and again with this potential optimisation added to GPR/FPR/TLS (I'm on a 
Mountain Jade machine without SVE). Caveat shared machine, made up benchmark, 
etc. but all runs of both hovered between 16 and 17 seconds. Neither seemed to 
be consistently lower or higher than the other. Doesn't mean this isn't a 
speedup in isolation but if it is, it's dwarfed by the syscalls and packets 
sent back and forth.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156687/new/

https://reviews.llvm.org/D156687

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to