labath added a comment.
At this point, I think I (finally) have a good understanding of both how this
patch works and interacts with the rest of the world. I have one more batch of
comments, but hopefully none are too controversial, and I really do hope this
is the last iteration.
================
Comment at: lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp:289
+
+ lldb_private::RegisterInfo *new_reg_info_p = reg_info_ref.data();
+
----------------
I think all uses of `new_reg_info_p` could just be replaced by `reg_info_ref`
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:63-66
+ m_sve_note_payload.resize(m_sveregset.GetByteSize());
+ ::memcpy(GetSVEBuffer(), m_sveregset.GetDataStart(),
+ m_sveregset.GetByteSize());
+ ::memcpy(&m_sve_header, m_sveregset.GetDataStart(), sizeof(m_sve_header));
----------------
What's up with this copying? We already have the data in the `m_sveregset`
DataExtractor. What's the reason for copying it into the `m_sve_note_payload`
vector? Also, `m_sve_header` seems like it could just be a `reinterpret_cast`ed
pointer into that buffer instead of a copy. (Maybe not even a pointer, just a
utility function which performs the cast when called).
Actually, when I think about casts and data extractors, I am reminded of
endianness. This will access those fields using host endianness, which is most
likely not what we want to do. So, probably the best/simplest solution would be
to indeed declare a `user_sve_header` struct, but don't populate it via memcpy,
but rather via the appropriate DataExtractor extraction methods. Since the only
field of the user_sve_header used outside this function is the `vl` field,
perhaps the struct could be a local variable and only the vector length would
be persisted. That would be consistent with how the `flags` field is decoded
and stored into the `m_sve_state` field.
(If the struct fields are always little-endian (if that's true, I thought arm
has BE variants) then you could also stick to the `reinterpret_cast` idea, but
change the types of the struct fields to `llvm::support::ulittleXX_t` to read
them as LE independently of the host.)
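A minimal sketch of the endian-aware decoding suggested above, written without any LLDB types so it stands alone. The `SVEHeader` struct, `ByteOrder` enum, `ReadUnsigned` and `DecodeSVEHeader` are hypothetical names for illustration only; the field layout is assumed to match Linux's `struct user_sve_header`. In the patch itself the DataExtractor extraction methods would play the role of `ReadUnsigned`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Field layout assumed to mirror Linux's struct user_sve_header.
struct SVEHeader {
  uint32_t size;
  uint32_t max_size;
  uint16_t vl;
  uint16_t max_vl;
  uint16_t flags;
  uint16_t reserved;
};

// Byte order of the target that produced the core file (not of the host).
enum class ByteOrder { Little, Big };

// Assembles a `width`-byte unsigned integer starting at `offset`, most
// significant byte first, so the result does not depend on host endianness.
inline uint64_t ReadUnsigned(const uint8_t *data, size_t offset, size_t width,
                             ByteOrder order) {
  uint64_t value = 0;
  for (size_t i = 0; i < width; ++i) {
    size_t idx = (order == ByteOrder::Little) ? (width - 1 - i) : i;
    value = (value << 8) | data[offset + idx];
  }
  return value;
}

// Decodes the header field by field instead of memcpy-ing the raw bytes.
// Returns false if the buffer is too small to hold a header.
inline bool DecodeSVEHeader(const uint8_t *data, size_t size, ByteOrder order,
                            SVEHeader &out) {
  constexpr size_t kHeaderBytes = 16;
  if (size < kHeaderBytes)
    return false;
  out.size = static_cast<uint32_t>(ReadUnsigned(data, 0, 4, order));
  out.max_size = static_cast<uint32_t>(ReadUnsigned(data, 4, 4, order));
  out.vl = static_cast<uint16_t>(ReadUnsigned(data, 8, 2, order));
  out.max_vl = static_cast<uint16_t>(ReadUnsigned(data, 10, 2, order));
  out.flags = static_cast<uint16_t>(ReadUnsigned(data, 12, 2, order));
  out.reserved = static_cast<uint16_t>(ReadUnsigned(data, 14, 2, order));
  return true;
}
```

With something like this, the header can stay a local variable and only `vl` (and the state derived from `flags`) needs to be stored in the register context.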
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:84-86
+ const uint32_t reg = reg_info->kinds[lldb::eRegisterKindLLDB];
+ if (reg == LLDB_INVALID_REGNUM)
+ return false;
----------------
This is already checked by ReadRegister and the false -> uint32_t conversion is
dodgy. An assertion would completely suffice here.
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:140
+ offset -= GetGPRSize();
+ if (IsFPR(reg) && offset < m_fpregset.GetByteSize()) {
+ value.SetFromMemoryData(reg_info, m_fpregset.GetDataStart() + offset,
----------------
This `IsFPR` check is now redundant.
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:152-153
+ sve_reg_num = reg_info->value_regs[0];
+ else if (reg == GetRegNumFPCR() || reg == GetRegNumFPSR())
+ sve_reg_num = reg;
+ if (sve_reg_num != LLDB_INVALID_REGNUM) {
----------------
These two registers are special-cased both here and in the
`CalculateSVEOffset`. Given that both functions also do a switch over the sve
"states", it makes following the code quite challenging. What if we moved the
handling of these registers completely up-front, and removed their handling
from `CalculateSVEOffset` completely?
I'm thinking of something like:
```
if (reg == GetRegNumFPCR() && m_sve_state != SVEState::Disabled) {
  // Or maybe just inline GetFPCROffset here
  src = GetSVEBuffer() + GetFPCROffset();
} else if (reg == GetRegNumFPSR() && m_sve_state != SVEState::Disabled) {
  src = ...;
} else if (IsFPR(reg)) {
  // as usual, only you can assume that FP[CS]R are already handled if SVE is
  // enabled
}
```
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:161-162
+ } else if (IsSVEVG(reg)) {
+ sve_vg = GetSVERegVG();
+ src = (uint8_t *)&sve_vg;
+ } else if (IsSVE(reg)) {
----------------
This also looks endian-incorrect. Maybe `value = GetSVERegVG(); return true;` ?
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:164
+ } else if (IsSVE(reg)) {
+ if (m_sve_state != SVEState::Disabled) {
+ if (m_sve_state == SVEState::FPSIMD) {
----------------
A switch over possible `m_sve_state` values would likely be cleaner.
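A small standalone sketch of the switch shape, assuming a three-valued state enum. `SVEState::Full` is an assumed name for the "SVE registers live" state, and `StoredZRegSize` is a hypothetical helper that exists only for this illustration; the 16-byte size in the FPSIMD state follows from the payload being a `user_fpsimd_state` with 128-bit V registers.

```cpp
#include <cassert>
#include <cstdint>

// Disabled and FPSIMD appear in the patch; Full is an assumed name.
enum class SVEState { Disabled, FPSIMD, Full };

// Hypothetical helper: number of bytes the note payload holds for one Z
// register in each state. vq is the vector length in 128-bit quadwords.
inline uint32_t StoredZRegSize(SVEState state, uint32_t vq) {
  switch (state) {
  case SVEState::Disabled:
    return 0; // no SVE note, nothing to read
  case SVEState::FPSIMD:
    return 16; // only the low 128 bits (the V register) are stored
  case SVEState::Full:
    return vq * 16; // the full scalable register is stored
  }
  return 0; // not reached for valid enum values
}
```

Because every enumerator gets its own case and there is no `default`, compilers that warn on unhandled enumerators will flag any state added later.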
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:174
+ ::memcpy(sve_reg_non_live.data(),
+ (const uint8_t *)GetSVEBuffer() + offset, 16);
+ }
----------------
I guess it would be better to do the cast inside GetSVEBuffer. (and please make
that a c++ reinterpret_cast).
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:180
+ assert(offset < m_sveregset.GetByteSize());
+ src = (uint8_t *)GetSVEBuffer() + offset;
+ }
----------------
it seems that `src` should be a const pointer.
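For the two comments above, a minimal sketch of an accessor that performs the cast once and hands out a const byte pointer. `NotePayload` is a hypothetical stand-in for the register context and its storage type is an assumption; only the name `GetSVEBuffer` comes from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the class that owns the SVE note bytes.
class NotePayload {
public:
  explicit NotePayload(std::vector<char> bytes) : m_bytes(std::move(bytes)) {}

  // The cast lives here, so callers never need a C-style cast and cannot
  // accidentally write through the returned pointer.
  const uint8_t *GetSVEBuffer() const {
    return reinterpret_cast<const uint8_t *>(m_bytes.data());
  }

private:
  std::vector<char> m_bytes; // assumed storage type
};
```

A call site then reduces to `const uint8_t *src = payload.GetSVEBuffer() + offset;`.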
================
Comment at:
lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:82-106
+uint32_t
+RegisterContextCorePOSIX_arm64::CalculateSVEOffset(uint32_t reg_num) const {
+ uint32_t sve_offset = 0;
+ if (m_sve_state == SVE_STATE::SVE_STATE_FPSIMD) {
+ if (IsSVEZ(reg_num))
+ sve_offset = (reg_num - GetRegNumSVEZ0()) * 16;
+ else if (reg_num == GetRegNumFPSR())
----------------
omjavaid wrote:
> labath wrote:
> > omjavaid wrote:
> > > labath wrote:
> > > > omjavaid wrote:
> > > > > labath wrote:
> > > > > > omjavaid wrote:
> > > > > > > labath wrote:
> > > > > > > > > I'm confused by this function. If I understand the SVE_PT
> > > > > > > > macros and the logic in
> > > > > > > > `RegisterInfoPOSIX_arm64::ConfigureVectorRegisterInfos`
> > > > > > > > correctly, then they both seem to encode the same information.
> > > > > > > > And it seems to me that this function should just be
> > > > > > > > `reg_infos[reg_num].offset - some_constant`, which is the same
> > > > > > > > thing that we're doing for the arm FP registers when SVE is
> > > > > > > > disabled, and also for most other architectures too.
> > > > > > > >
> > > > > > > > Why is that not the case? Am I missing something? If they are
> > > > > > > > not encoding the same thing, could they be made to encode the
> > > > > > > > same thing?
> > > > > > > This function calculates offset of a particular register in core
> > > > > > > note data. SVE data in core dump is similar to what PTrace emits
> > > > > > > and offsets into this data is not linear. SVE macros are used to
> > > > > > > access those offsets based on register numbers and currently
> > > > > > > selected vector length.
> > > > > > > Also for the purpose of ease we have linear offsets in SVE
> > > > > > > register infos and it helps us simplify register data once it
> > > > > > > makes way to GDBRemoteRegisterContext on the client side.
> > > > > > Could you give an example of the non-linearity of the core dump
> > > > > > data? (Some registers, and their respective core file and
> > > > > > gdb-remote offsets)
> > > > > In case of core file we create a buffer m_sveregset which stores SVE
> > > > > core note information
> > > > > m_sveregset =
> > > > > getRegset(notes,
> > > > > m_register_info_up->GetTargetArchitecture().GetTriple(),
> > > > > AARCH64_SVE_Desc);
> > > > >
> > > > > > At this point we do not know what the vector length was or at what
> > > > > > offsets in the data our registers are located. We read the top bytes of
> > > > > > size user_sve_header and read the vector length. Based on this information
> > > > > > we configure vector length in Register infos. While the SVE payload
> > > > > > starts with user_sve_header then there are some alignment bytes
> > > > > > followed by vector length based Z registers followed by P and FFR,
> > > > > > then there are some more alignment bytes followed by FPCR and FPSR.
> > > > > Macros provided by Linux help us jump to the desired offset by
> > > > > providing register number and vq into the core note or Ptrace payload.
> > > > >
> > > > > In case of client side storage we store GPRs at linear offset
> > > > > followed by Vector Granule register. Then there are SVE registers Z,
> > > > > P, FFR, FPSR and FPCR. Offsets of V, D and S registers in FPR regset
> > > > > overlap with corresponding first bytes of Z registers and will be
> > > > > same as corresponding Z register. While both FP/SVE FPSR share same
> > > > > register offset, size and register number.
> > > > >
> > > > > Here is an excerpt from
> > > > > https://github.com/torvalds/linux/blob/master/Documentation/arm64/sve.rst
> > > > >
> > > > > SVE_PT_REGS_FPSIMD
> > > > > SVE registers are not live (GETREGSET) or are to be made non-live
> > > > > (SETREGSET).
> > > > > The payload is of type struct user_fpsimd_state, with the same
> > > > > meaning as for NT_PRFPREG, starting at offset SVE_PT_FPSIMD_OFFSET
> > > > > from the start of user_sve_header.
> > > > > Extra data might be appended in the future: the size of the payload
> > > > > should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags).
> > > > > vq should be obtained using sve_vq_from_vl(vl).
> > > > >
> > > > > or
> > > > >
> > > > > SVE_PT_REGS_SVE
> > > > > SVE registers are live (GETREGSET) or are to be made live (SETREGSET).
> > > > > The payload contains the SVE register data, starting at offset
> > > > > SVE_PT_SVE_OFFSET from the start of user_sve_header, and with size
> > > > > SVE_PT_SVE_SIZE(vq, flags);
> > > > Given this
> > > > > SVE payload starts with ... followed by vector length based Z
> > > > > registers followed by P and FFR,
> > > > and this
> > > > > In case of client side storage we store GPRs ... Then there are SVE
> > > > > registers Z, P, FFR, FPSR and FPCR
> > > > I would expect that for each of the Z, P and FFR registers, the
> > > > expression `offset_in_core(reg) - offset_in_gdb_remote(reg)` is always
> > > > the same constant (and is basically equal to
> > > > `SVE_PT_SVE_ZREG_OFFSET(vq, 0) - reg_info[Z0].byte_offset`). So we
> > > > could just add/subtract that constant to the gdb-remote byte_offset
> > > > field instead of painstakingly decomposing the register number only for
> > > > the linux macros to reconstruct it back again. Is that not so?
> > > The standard never talks about Z, P and FFR being contiguous; that is what
> > > I learnt by reading the macros. The standard states this:
> > >
> > > If the registers are present, the remainder of the record has a
> > > vl-dependent size and layout. Macros SVE_SIG_* are defined [1] to
> > > facilitate access to the members.
> > > Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant
> > > layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from
> > > the start of the register's representation in memory.
> > > If the SVE context is too big to fit in sigcontext.__reserved[], then
> > > extra space is allocated on the stack, an extra_context record is written
> > > in __reserved[] referencing this space. sve_context is then written in
> > > the extra space. Refer to [1] for further details about this mechanism.
> > >
> > > I understand what you are talking about but given the macros were
> > > specifically provided and above line about register record was vague and
> > > I thought best is to follow the macros for offset calculation although
> > > other way around is simpler but may be slightly unreliable.
> > >
> > > I suggest to keep this as it is unless there is strong reason apart from
> > > slight performance penalty. This resembles with GDB implementation which
> > > was done by ARM and I am only following that as reference. May be we can
> > > revise this in future when the feature becomes more mainstream.
> > >
> > I'm not worried about the performance penalty (most of these abstractions
> > can be optimized away anyway). I'm more worried about the maintainability
> > penalty incurred by a non-standard and fairly complicated solution.
> >
> > The way I see it, it doesn't really matter how exactly the specification
> > describes these things. Having the macros is nice, but it's not their names
> > that become a part of the ABI -- their implementation does. So they (linux,
> > arm, whoever) cannot change the layout of the these registers without
> > breaking all existing applications (and core files) any more than they can
> > change the layout of the general purpose registers (which we also access by
> > offset, without any fancy macros). In fact, even if they did change the
> > layout, given that we have a copy of these headers, we wouldn't
> > automatically inherit those changes anyway, but would have to take some
> > manual action. And since we'd probably want to maintain some sort of
> > backwards compatibility, we couldn't just replace the new macro
> > definitions, but would have to do some sort of versioning (just today I got
> > reminded that lldb contains versioned layouts of various internal structs
> > used by the ObjC runtime).
> >
> > So, for that reason I am more concerned about consistency with other lldb
> > register contexts than I am with consistency with gdb.
> >
> > Also, I bet gdb doesn't have an equivalent of our RegisterInfo struct.
> > Without that, using these macros would be obviously better, but given that
> > we have that, and we already use it to access other registers, ditching
> > that for macros does not seem like a win to me.
> >
> > Another way to look at this would be to compare this to e.g. ELF reading
> > code in llvm. We cannot just include <linux/elf.h> as we want this to be
> > cross-platform. However, we also don't just copy the header blindly into
> > our code base. We don't depart from it needlessly, but we do make sure that
> > it plays well with the rest of llvm -- `#defines` are changed to enums, we
> > add templates to select between 64/32 bit versions, change some
> > platform-specific names so that different platforms may co-exist, etc.
> I understand your point here and by referring to Linux, Arm or GDB
> implementation I only wanted to make sure that we are implementing the
> architecture and platform specific parts correctly. The choice of using
> Linux/SVE specific macros in elf-core context was not a blind decision. I am
> not advocating for following a particular implementations and I would not
> have pulled in these Linux specific macros into elf-core implementation if
> the SVE core note layout was independent of Linux platform specifics like
> sig_context and user_sve_header. Offset to the start of register data in
> NT_ARM_SVE is also calculated after doing alignment correction based on VQ
> and size of sig_context. Given that SVE is currently supported by Linux only,
> NT_ARM_SVE is Linux only core note and core dump generated makes use of these
> macro definitions I had no choice but to include them for extraction of data
> from NT_ARM_SVE section. If I had not followed these macros I still would
> have written something similar my self for register data extraction from
> NT_ARM_SVE elf core note.
>
> I am going to post another update where I ll try to minimize use of these
> macros in offset calculation.
Thanks for your patience. I think this looks better. The more I understand how
these work, the more I realize that they will need fairly special handling.
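To make the constant-offset idea from the quoted thread concrete, here is a rough sketch under stated assumptions. The core-note offsets are modelled on my reading of the Linux `SVE_PT_SVE_*` macros (a 16-byte aligned header, then Z0..Z31, P0..P15 and FFR packed back to back). The register-info layout, `z0_base`, and every function name below are hypothetical. FPSR and FPCR are deliberately left out, since the note inserts alignment padding before them.

```cpp
#include <cassert>
#include <cstdint>

// Assumed start of SVE register data in the note: user_sve_header rounded up
// to 16 bytes.
constexpr uint32_t kSVEDataStart = 16;

// Core-note offsets; vq is the vector length in 128-bit quadwords.
constexpr uint32_t CoreZOffset(uint32_t vq, uint32_t n) {
  return kSVEDataStart + n * vq * 16;
}
constexpr uint32_t CorePOffset(uint32_t vq, uint32_t n) {
  return CoreZOffset(vq, 32) + n * vq * 2;
}
constexpr uint32_t CoreFFROffset(uint32_t vq) { return CorePOffset(vq, 16); }

// Hypothetical linear register-info layout: Z, P and FFR in the same order,
// with Z0 at byte offset z0_base.
constexpr uint32_t InfoZOffset(uint32_t z0_base, uint32_t vq, uint32_t n) {
  return z0_base + n * vq * 16;
}
constexpr uint32_t InfoPOffset(uint32_t z0_base, uint32_t vq, uint32_t n) {
  return InfoZOffset(z0_base, vq, 32) + n * vq * 2;
}
constexpr uint32_t InfoFFROffset(uint32_t z0_base, uint32_t vq) {
  return InfoPOffset(z0_base, vq, 16);
}

// The whole translation is one constant shift, independent of the register.
constexpr uint32_t ToCoreOffset(uint32_t info_offset, uint32_t z0_base) {
  return info_offset - z0_base + kSVEDataStart;
}
```

If these assumptions hold, the same shift applies to every Z, P and FFR register, which is why only FPSR and FPCR would need dedicated handling.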
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D77047/new/
https://reviews.llvm.org/D77047
_______________________________________________
lldb-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits