stderr" [v2]

Chris Plummer Thu, 01 Sep 2022 10:21:06 -0700

> While dumping all registers (and doing a findpc on each), the following 
> exception occurred for r8:
> 
> 
>  r8: 0x000000750e4fdffc
> Error: java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds for 
> length 4096
> java.lang.ArrayIndexOutOfBoundsException: Index 4099 out of bounds for length 
> 4096
> at jdk.hotspot.agent/sun.jvm.hotspot.debugger.Page.getLong(Page.java:182)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.PageCache.getLong(PageCache.java:100)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readCInteger(DebuggerBase.java:364)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:462)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.readAddress(WindbgDebuggerLocal.java:312)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.debugger.windbg.WindbgAddress.getAddressAt(WindbgAddress.java:71)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.utilities.PointerFinder.find(PointerFinder.java:58)
> at 
> jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.printThreadContextOn(JavaThread.java:493)
> 
> 
> This exception is the result of using PointerFinder ("findpc") on invalid 
> address that starts on the last 32-bit word of a page. "page" in this case is 
> referring to a page in SA's PageCache, which are always 4k in size. Note 
> findpc is frequently done using an invalid address. In this test case it is 
> being called on each x64 register, which could contain anything. findpc 
> relies on getting some sort of AddressException when this happens, and will 
> then say the address points to something that is unknown. However, in the 
> case of the address pointing to the last 32-bot word of a page, it results in 
> an ArrayIndexOutOfBoundsException when the page cache tries to read past the 
> end of the page. This is happening in Page.getLong(), which you can see in 
> the stack trace.
> 
> The main culprit here is some weakening of the alignment checks being done. 
> The alignment check should have resulted in an UnalignedAddressException long 
> before we ever got to Page.getLong(). However, we have the following 
> override, which is allowing the unaligned address to pass the alignment check.
> 
> 
>            public void checkAlignment(long address, long alignment) {
>              // Need to override default checkAlignment because we need to
>              // relax alignment constraints on Bsd/x86
>              if ( (address % alignment != 0)
>                 &&(alignment != 8 || address % 4 != 0)) {
>                 throw new UnalignedAddressException(
>                         "Trying to read at address: "
>                       + addressValueToString(address)
>                       + " with alignment: " + alignment,
>                         address);
>              }
>            }
>         };
> 
> 
> This allows a pointer to a 64-bit value to only be 32-bit aligned. But 
> there's more to it than that. I modified ClhsdbFindPC.java to fetch a tid (a 
> thread address), and did a findpc on the address OR'd with 0xffc. `findpc` 
> uses PointerFinder. This forced a read of a 64-bit value that starts in the 
> last 32-bits of a page. It passed on linux-x64 but failed on windowx-x64 in 
> the same manner described in the description of this CR. The difference 
> between the two implementations is that windows relies on the default 
> implementation of DebuggerBase.readCInteger() whereas linux has an override. 
> DebuggerBase.readCInteger() does the following:
> 
> 
>   public long readCInteger(long address, long numBytes, boolean isUnsigned) {
>     utils.checkAlignment(address, numBytes);
>     if (useFastAccessors) {
>       if (isUnsigned) {
>         switch((int) numBytes) {
>         case 1: return cache.getByte(address) & 0xFF;
>         case 2: return cache.getShort(address, bigEndian) & 0xFFFF;
>         case 4: return cache.getInt(address, bigEndian) & 0xFFFFFFFFL;
>         case 8: return cache.getLong(address, bigEndian);
> ...
> 
> 
> There is an alignment check here, but it is the "relaxed" override shown 
> above, which allows 64-bit addresses on 32-bit boundaries. The override in 
> LinuxDebuggerLocal is:
> 
> 
>     /** Need to override this to relax alignment checks on x86. */
>     public long readCInteger(long address, long numBytes, boolean isUnsigned)
>         throws UnmappedAddressException, UnalignedAddressException {
>         // Only slightly relaxed semantics -- this is a hack, but is
>         // necessary on x86 where it seems the compiler is
>         // putting some global 64-bit data on 32-bit boundaries
>         if (numBytes == 8) {
>             utils.checkAlignment(address, 4);
>         } else {
>             utils.checkAlignment(address, numBytes);
>         }
>         byte[] data = readBytes(address, numBytes);
>         return utils.dataToCInteger(data, isUnsigned);
> 
> 
> Although there is a relaxed alignment check here also, the code that reads 
> from the address does not assume all the bytes are on the same page. It uses 
> readBytes() intead of cache.getLong(), so it won't run into the PageCache 
> issue with reading a 64-bit value that starts in the last 32-bit word of a 
> page.
> 
> I think the introduction of these relaxed alignment checks has a muddled 
> history, probably made more muddled by ports cloning code that maybe wasn't 
> necessary. Probably also initial fixes (the relaxed alignment check) seemed 
> to work at first, but later the PageCache issue was discovered, leading to 
> readBytes() workaround in the LinuxDebuggerLocal.readCInteger() override, but 
> was not also done on other ports (so we got this CR for windows).
> 
> For 64-bit support it's clear this easing of the 64-bit alignment is not 
> needed, and getting rid of it would result in the proper 
> UnalignedAddressException being thrown. The question is whether it is still 
> needed for 32-bit x86, and if so is it needed on all ports.
> 
> I can't test linux-x86, so I can't tell if it still allows 64-bit values on 
> 32-bit aligned addresses, so for now I'll assume it does. So the approach 
> being taken is whenever an address of a 64-bit type points to the last 32-bit 
> word of a page, use readBytes() to get the 64-bit value one byte at a time. 
> It still uses the page cache in the end, but it doesn't try to get all 8 
> bytes from the same page. Note for 64-bit systems, the conditoin will never 
> arise because of the removal of the relaxed alignment check. Instead there 
> will be an UnalignedAddressException at an early point when the alignment 
> check is made.
> 
> One windfall of this change is now we always read 64-bit values from the page 
> cache in a way that is much more efficient than reading a byte at a time. 
> This has resulted in about a 25% performance improvement in a test I have 
> that does a heap dump.


Chris Plummer has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains four commits:

 - Merge
 - Fix jcheck error
 - Undo some temp test changes.
 - Fix 64-bit alignment requirements

-------------

Changes: https://git.openjdk.org/jdk/pull/10090/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=10090&range=01
  Stats: 227 lines in 11 files changed: 52 ins; 154 del; 21 mod
  Patch: https://git.openjdk.org/jdk/pull/10090.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/10090/head:pull/10090

PR: https://git.openjdk.org/jdk/pull/10090

Re: RFR: 8292201: serviceability/sa/ClhsdbThreadContext.java fails with "'Thread "Common-Cleaner"' missing from stdout/stderr" [v2]

Reply via email to