labath added a comment. At this point, I think I (finally) have a good understanding of both how this patch works and interacts with the rest of the world. I have one more batch of comments, but hopefully none are too controversial, and I really do hope this is the last iteration.
================ Comment at: lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp:289 + + lldb_private::RegisterInfo *new_reg_info_p = reg_info_ref.data(); + ---------------- I think all uses of `new_reg_info_p` could just be replaced by `reg_info_ref` ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:63-66 + m_sve_note_payload.resize(m_sveregset.GetByteSize()); + ::memcpy(GetSVEBuffer(), m_sveregset.GetDataStart(), + m_sveregset.GetByteSize()); + ::memcpy(&m_sve_header, m_sveregset.GetDataStart(), sizeof(m_sve_header)); ---------------- What's up with this copying? We already have the data in the `m_sveregset` DataExtractor. What's the reason for copying it into the `m_sve_note_payload` vector? Also, `m_sve_header` seems like it could just be a `reinterpret_cast`ed pointer into that buffer instead of a copy. (Maybe not even a pointer, just a utility function which performs the cast when called). Actually, when I think about casts and data extractors, I am reminded of endianness. This will access those fields using host endianness, which is most likely not what we want to do. So, probably the best/simplest solution would be to indeed declare a `user_sve_header` struct, but don't populate it via memcpy, but rather via the appropriate DataExtractor extraction methods. Since the only field of the user_sve_header used outside this function is the `vl` field, perhaps the struct could be a local variable and only the vector length would be persisted. That would be consistent with how the `flags` field is decoded and stored into the `m_sve_state` field. (If the struct fields are always little-endian (if that's true, I thought arm has BE variants) then you could also stick to the `reinterpret_cast` idea, but change the types of the struct fields to `llvm::support::ulittleXX_t` to read them as LE independently of the host.) ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:84-86 + const uint32_t reg = reg_info->kinds[lldb::eRegisterKindLLDB]; + if (reg == LLDB_INVALID_REGNUM) + return false; ---------------- This is already checked by ReadRegister and the false -> uint32_t conversion is dodgy. An assertion would completely suffice here. ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:140 + offset -= GetGPRSize(); + if (IsFPR(reg) && offset < m_fpregset.GetByteSize()) { + value.SetFromMemoryData(reg_info, m_fpregset.GetDataStart() + offset, ---------------- This `IsFPR` check is now redundant. ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:152-153 + sve_reg_num = reg_info->value_regs[0]; + else if (reg == GetRegNumFPCR() || reg == GetRegNumFPSR()) + sve_reg_num = reg; + if (sve_reg_num != LLDB_INVALID_REGNUM) { ---------------- These two registers are special-cased both here and in the `CalculateSVEOffset`. Given that both functions also do a switch over the sve "states", it makes following the code quite challenging. What if we moved the handling of these registers completely up-front, and removed their handling from `CalculateSVEOffset` completely? I'm thinking of something like: ``` if (reg == GetRegNumFPCR() && m_sve_state != SVEState::Disabled) { src = GetSVEBuffer() + GetFPCROffset(); // Or maybe just inline GetFPCROffset here } else if (reg == GetRegNumFPSR() && m_sve_state != SVEState::Disabled) { src = ...; } if (IsFPR(reg)) { // as usual, only you can assume that FP[CS]R are already handled if SVE is enabled } ``` ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:161-162 + } else if (IsSVEVG(reg)) { + sve_vg = GetSVERegVG(); + src = (uint8_t *)&sve_vg; + } else if (IsSVE(reg)) { ---------------- This also looks endian-incorrect. Maybe `value = GetSVERegVG(); return true;` ? ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:164 + } else if (IsSVE(reg)) { + if (m_sve_state != SVEState::Disabled) { + if (m_sve_state == SVEState::FPSIMD) { ---------------- A switch over possible `m_sve_state` values would likely be cleaner. ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:174 + ::memcpy(sve_reg_non_live.data(), + (const uint8_t *)GetSVEBuffer() + offset, 16); + } ---------------- I guess it would be better to do the cast inside GetSVEBuffer. (and please make that a c++ reinterpret_cast). ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:180 + assert(offset < m_sveregset.GetByteSize()); + src = (uint8_t *)GetSVEBuffer() + offset; + } ---------------- it seems that `src` should be a const pointer. ================ Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:82-106 +uint32_t +RegisterContextCorePOSIX_arm64::CalculateSVEOffset(uint32_t reg_num) const { + uint32_t sve_offset = 0; + if (m_sve_state == SVE_STATE::SVE_STATE_FPSIMD) { + if (IsSVEZ(reg_num)) + sve_offset = (reg_num - GetRegNumSVEZ0()) * 16; + else if (reg_num == GetRegNumFPSR()) ---------------- omjavaid wrote: > labath wrote: > > omjavaid wrote: > > > labath wrote: > > > > omjavaid wrote: > > > > > labath wrote: > > > > > > omjavaid wrote: > > > > > > > labath wrote: > > > > > > > > I'm confused by this function. If I understant the SVE_PT > > > > > > > > macros and the logic in > > > > > > > > `RegisterInfoPOSIX_arm64::ConfigureVectorRegisterInfos` > > > > > > > > correctly, then they both seem to encode the same information. > > > > > > > > And it seems to me that this function should just be > > > > > > > > `reg_infos[reg_num].offset - some_constant`, which is the same > > > > > > > > thing that we're doing for the arm FP registers when SVE is > > > > > > > > disabled, and also for most other architectures too. > > > > > > > > > > > > > > > > Why is that not the case? Am I missing something? If they are > > > > > > > > not encoding the same thing, could they be made to encode the > > > > > > > > same thing? > > > > > > > This function calculates offset of a particular register in core > > > > > > > note data. SVE data in core dump is similar to what PTrace emits > > > > > > > and offsets into this data is not linear. SVE macros are used to > > > > > > > access those offsets based on register numbers and currently > > > > > > > selected vector length. > > > > > > > Also for the purpose of ease we have linear offsets in SVE > > > > > > > register infos and it helps us simplify register data once it > > > > > > > makes way to GDBRemoteRegisterContext on the client side. > > > > > > Could you give an example of the non-linearity of the core dump > > > > > > data? (Some registers, and their respective core file and > > > > > > gdb-remote offsets) > > > > > In case of core file we create a buffer m_sveregset which stores SVE > > > > > core note information > > > > > m_sveregset = > > > > > getRegset(notes, > > > > > m_register_info_up->GetTargetArchitecture().GetTriple(), > > > > > AARCH64_SVE_Desc); > > > > > > > > > > At this point we do not know what was the vector length and at what > > > > > offsets in the data our registers are located. We read top bytes of > > > > > size use_sve_header and read vector length. Based on this information > > > > > we configure vector length in Register infos. While the SVE payload > > > > > starts with user_sve_header then there are some allignment bytes > > > > > followed by vector length based Z registers followed by P and FFR, > > > > > then there are some more allginment bytes followd by FPCR and FPSR. > > > > > Macros provided by Linux help us jump to the desired offset by > > > > > providing register number and vq into the core note or Ptrace payload. > > > > > > > > > > In case of client side storage we store GPRs at linear offset > > > > > followed by Vector Granule register. Then there are SVE registers Z, > > > > > P, FFR, FPSR and FPCR. Offsets of V, D and S registers in FPR regset > > > > > overlap with corresponding first bytes of Z registers and will be > > > > > same as corresponding Z register. While both FP/SVE FPSR share same > > > > > register offset, size and register number. > > > > > > > > > > Here is an excerpt from > > > > > https://github.com/torvalds/linux/blob/master/Documentation/arm64/sve.rst > > > > > > > > > > SVE_PT_REGS_FPSIMD > > > > > SVE registers are not live (GETREGSET) or are to be made non-live > > > > > (SETREGSET). > > > > > The payload is of type struct user_fpsimd_state, with the same > > > > > meaning as for NT_PRFPREG, starting at offset SVE_PT_FPSIMD_OFFSET > > > > > from the start of user_sve_header. > > > > > Extra data might be appended in the future: the size of the payload > > > > > should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags). > > > > > vq should be obtained using sve_vq_from_vl(vl). > > > > > > > > > > or > > > > > > > > > > SVE_PT_REGS_SVE > > > > > SVE registers are live (GETREGSET) or are to be made live (SETREGSET). > > > > > The payload contains the SVE register data, starting at offset > > > > > SVE_PT_SVE_OFFSET from the start of user_sve_header, and with size > > > > > SVE_PT_SVE_SIZE(vq, flags); > > > > Given this > > > > > SVE payload starts with ... followed by vector length based Z > > > > > registers followed by P and FFR, > > > > and this > > > > > In case of client side storage we store GPRs ... Then there are SVE > > > > > registers Z, P, FFR, FPSR and FPCR > > > > I would expect that for each of the Z, P and FFR registers, the > > > > expression `offset_in_core(reg) - offset_in_gdb_remote(reg)` is always > > > > the same constant (and is basically equal to > > > > `SVE_PT_SVE_ZREG_OFFSET(vq, 0) - reg_info[Z0].byte_offset`). So we > > > > could just add/subtract that constant to the gdb-remote byte_offset > > > > field instead of painstakingly decomposing the register number only for > > > > the linux macros to reconstruct it back again. Is that not so? > > > The standard never talks about Z, P and FFR being contagious that is what > > > I learnt by reading macros. There standard states this: > > > > > > If the registers are present, the remainder of the record has a > > > vl-dependent size and layout. Macros SVE_SIG_* are defined [1] to > > > facilitate access to the members. > > > Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant > > > layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from > > > the start of the register's representation in memory. > > > If the SVE context is too big to fit in sigcontext.__reserved[], then > > > extra space is allocated on the stack, an extra_context record is written > > > in __reserved[] referencing this space. sve_context is then written in > > > the extra space. Refer to [1] for further details about this mechanism. > > > > > > I understand what you are talking about but given the macros were > > > specifically provided and above line about register record was vague and > > > I thought best is to follow the macros for offset calculation although > > > other way around is simpler but may be slightly unreliable. > > > > > > I suggest to keep this as it is unless there is strong reason apart from > > > slight performance penalty. This resembles with GDB implementation which > > > was done by ARM and I am only following that as reference. May be we can > > > revise this in future when the feature becomes more mainstream. > > > > > I'm not worried about the performance penalty (most of these abstractions > > can be optimized away anyway). I'm more worried about the maintainability > > penalty incurred by a non-standard and fairly complicated solution. > > > > The way I see it, it doesn't really matter how exactly the specification > > describes these things. Having the macros is nice, but it's not their names > > that become a part of the ABI -- their implementation does. So they (linux, > > arm, whoever) cannot change the layout of the these registers without > > breaking all existing applications (and core files) any more than they can > > change the layout of the general purpose registers (which we also access by > > offset, without any fancy macros). In fact, even if they did change the > > layout, given that we have a copy of these headers, we wouldn't > > automatically inherit those changes anyway, but would have to take some > > manual action. And since we'd probably want to maintain some sort of > > backwards compatibility, we couldn't just replace the new macro > > definitions, but would have to do some sort of versioning (just today I got > > reminded that lldb contains versioned layouts of various internal structs > > used by the ObjC runtime). > > > > So, for that reason I am more concerned about consistency with other lldb > > register contexts than I am with consistency with gdb. > > > > Also, I bet gdb doesn't have an equivalent of our RegisterInfo struct. > > Without that, using these macros would be obviously better, but given that > > we have that, and we already use it to access other registers, ditching > > that for macros does not seem like a win to me. > > > > Another way to look at this would be to compare this to e.g. ELF reading > > code in llvm. We cannot just include <linux/elf.h> as we want this to be > > cross-platform. However, we also don't just copy the header blindly into > > our code base. We don't depart from it needlessly, but we do make sure that > > it plays well with the rest of llvm -- `#defines` are changed to enums, we > > add templates to select between 64/32 bit versions, change some > > platform-specific names so that different platforms may co-exist, etc. > I understand your point here and by referring to Linux, Arm or GDB > implementation I only wanted to make sure that we are implementing the > architecture and platform specific parts correctly. The choice of using > Linux/SVE specific macros in elf-core context was not a blind decision. I am > not advocating for following a particular implementations and I would not > have pulled in these Linux specific macros into elf-core implementation if > the SVE core note layout was independent of Linux platform specifics like > sig_context and user_sve_header. Offset to the start of register data in > NT_ARM_SVE is also calculated after doing alignment correction based on VQ > and size of sig_context. Given that SVE is currently supported by Linux only, > NT_ARM_SVE is Linux only core note and core dump generated makes use of these > macro definitions I had no choice but to include them for extraction of data > from NT_ARM_SVE section. If I had not followed these macros I still would > have written something similar my self for register data extraction from > NT_ARM_SVE elf core note. > > I am going to post another update where I ll try to minimize use of these > macros in offset calculation. Thanks for your patience. I think this looks better. The more I understand how these work, the more I realize that will need fairly special handling.. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77047/new/ https://reviews.llvm.org/D77047 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits