On Thu, 1 Sep 2022 17:20:33 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

>> While dumping all registers (and doing a findpc on each), the following 
>> exception occurred for r8:
>> 
>> 
>>  r8: 0x000000750e4fdffc
>> Error: java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds 
>> for length 4096
>> java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds for 
>> length 4096
>> at jdk.hotspot.agent/sun.jvm.hotspot.debugger.Page.getLong(Page.java:182)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getLong(PageCache.java:100)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:364)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:312)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:71)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.utilities.PointerFinder.find(PointerFinder.java:58)
>> at 
>> jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.printThreadContextOn(JavaThread.java:493)
>> 
>> 
>> This exception is the result of using PointerFinder ("findpc") on invalid 
>> address that starts on the last 32-bit word of a page. "page" in this case 
>> is referring to a page in SA's PageCache, which are always 4k in size. Note 
>> findpc is frequently done using an invalid address. In this test case it is 
>> being called on each x64 register, which could contain anything. findpc 
>> relies on getting some sort of AddressException when this happens, and will 
>> then say the address points to something that is unknown. However, in the 
>> case of the address pointing to the last 32-bot word of a page, it results 
>> in an ArrayIndexOutOfBoundsException when the page cache tries to read past 
>> the end of the page. This is happening in Page.getLong(), which you can see 
>> in the stack trace.
>> 
>> The main culprit here is some weakening of the alignment checks being done. 
>> The alignment check should have resulted in an UnalignedAddressException 
>> long before we ever got to Page.getLong(). However, we have the following 
>> override, which is allowing the unaligned address to pass the alignment 
>> check.
>> 
>> 
>>            public void checkAlignment(long address, long alignment) {
>>              // Need to override default checkAlignment because we need to
>>              // relax alignment constraints on Bsd/x86
>>              if ( (address % alignment != 0)
>>                 &&(alignment != 8 || address % 4 != 0)) {
>>                 throw new UnalignedAddressException(
>>                         "Trying to read at address: "
>>                       + addressValueToString(address)
>>                       + " with alignment: " + alignment,
>>                         address);
>>              }
>>            }
>>         };
>> 
>> 
>> This allows a pointer to a 64-bit value to only be 32-bit aligned. But 
>> there's more to it than that. I modified ClhsdbFindPC.java to fetch a tid (a 
>> thread address), and did a findpc on the address OR'd with 0xffc. `findpc` 
>> uses PointerFinder. This forced a read of a 64-bit value that starts in the 
>> last 32-bits of a page. It passed on linux-x64 but failed on windowx-x64 in 
>> the same manner described in the description of this CR. The difference 
>> between the two implementations is that windows relies on the default 
>> implementation of DebuggerBase.readCInteger() whereas linux has an override. 
>> DebuggerBase.readCInteger() does the following:
>> 
>> 
>>   public long readCInteger(long address, long numBytes, boolean isUnsigned) {
>>     utils.checkAlignment(address, numBytes);
>>     if (useFastAccessors) {
>>       if (isUnsigned) {
>>         switch((int) numBytes) {
>>         case 1: return cache.getByte(address) & 0xFF;
>>         case 2: return cache.getShort(address, bigEndian) & 0xFFFF;
>>         case 4: return cache.getInt(address, bigEndian) & 0xFFFFFFFFL;
>>         case 8: return cache.getLong(address, bigEndian);
>> ...
>> 
>> 
>> There is an alignment check here, but it is the "relaxed" override shown 
>> above, which allows 64-bit addresses on 32-bit boundaries. The override in 
>> LinuxDebuggerLocal is:
>> 
>> 
>>     /** Need to override this to relax alignment checks on x86. */
>>     public long readCInteger(long address, long numBytes, boolean isUnsigned)
>>         throws UnmappedAddressException, UnalignedAddressException {
>>         // Only slightly relaxed semantics -- this is a hack, but is
>>         // necessary on x86 where it seems the compiler is
>>         // putting some global 64-bit data on 32-bit boundaries
>>         if (numBytes == 8) {
>>             utils.checkAlignment(address, 4);
>>         } else {
>>             utils.checkAlignment(address, numBytes);
>>         }
>>         byte[] data = readBytes(address, numBytes);
>>         return utils.dataToCInteger(data, isUnsigned);
>> 
>> 
>> Although there is a relaxed alignment check here also, the code that reads 
>> from the address does not assume all the bytes are on the same page. It uses 
>> readBytes() intead of cache.getLong(), so it won't run into the PageCache 
>> issue with reading a 64-bit value that starts in the last 32-bit word of a 
>> page.
>> 
>> I think the introduction of these relaxed alignment checks has a muddled 
>> history, probably made more muddled by ports cloning code that maybe wasn't 
>> necessary. Probably also initial fixes (the relaxed alignment check) seemed 
>> to work at first, but later the PageCache issue was discovered, leading to 
>> readBytes() workaround in the LinuxDebuggerLocal.readCInteger() override, 
>> but was not also done on other ports (so we got this CR for windows).
>> 
>> For 64-bit support it's clear this easing of the 64-bit alignment is not 
>> needed, and getting rid of it would result in the proper 
>> UnalignedAddressException being thrown. The question is whether it is still 
>> needed for 32-bit x86, and if so is it needed on all ports.
>> 
>> I can't test linux-x86, so I can't tell if it still allows 64-bit values on 
>> 32-bit aligned addresses, so for now I'll assume it does. So the approach 
>> being taken is whenever an address of a 64-bit type points to the last 
>> 32-bit word of a page, use readBytes() to get the 64-bit value one byte at a 
>> time. It still uses the page cache in the end, but it doesn't try to get all 
>> 8 bytes from the same page. Note for 64-bit systems, the conditoin will 
>> never arise because of the removal of the relaxed alignment check. Instead 
>> there will be an UnalignedAddressException at an early point when the 
>> alignment check is made.
>> 
>> One windfall of this change is now we always read 64-bit values from the 
>> page cache in a way that is much more efficient than reading a byte at a 
>> time. This has resulted in about a 25% performance improvement in a test I 
>> have that does a heap dump.
>
> Chris Plummer has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains four commits:
> 
>  - Merge
>  - Fix jcheck error
>  - Undo some temp test changes.
>  - Fix 64-bit alignment requirements

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/RemoteDebuggerClient.java
 line 46:

> 44:   private RemoteDebugger remoteDebugger;
> 45:   private RemoteThreadFactory threadFactory;
> 46:   private boolean unalignedAccessesOkay = false;

After getting rid of the `readCInteger` and `readJLong` overrides below, 
`unalignedAccessesOkay` is no longer needed. The `DebuggerBase` superclass 
understands the alignment requirements.

-------------

PR: https://git.openjdk.org/jdk/pull/10090

Reply via email to