On Thu, 12 Jan 2023 02:14:10 GMT, Archie L. Cobbs <d...@openjdk.org> wrote:
>>> >>> D'oh, you're right. But if you made `returnMe()` static or private then the >>> argument would still hold. >>> >>> > And, if the method is static, same story - you are passing ref3 somewhere >>> > else, and ref3 potentially contains this. >>> >>> Not true... static methods are safe to "invoke" because they can't be >>> overridden. When the analyzer "invokes" the method, it will see that all it >>> does with its parameter is return it. >>> >> >> So, for static methods, it could go down two ways: either we don't even look >> at referenced method bodies, give up and just declare "sorry, escaping". Or, >> if we look into method bodies, and see that the relationship between inner >> and outer parameter is as simple, it's just like assignment again: >> >> ref1 -> ref2 -> ref3 -> x -> ref4 >> >>> But I don't really have an analysis of all the trade-offs. Really, at a >>> certain point I stumbled on what I thought was a fairly straightforward way >>> to achieve a good false negative rate and a very low false positive rate, >>> without a lot of work. And frankly at that point I stopped caring about the >>> question you're asking, although I agree it's certainly interesting from an >>> academic point of view (I'm a practical person). >> >> The reason I'm asking these questions is that I'm trying to understand which >> ingredients went into the analysis and why, in order to try and build a >> mental problem of what the problem that needs to be solved is, what are the >> constraints, etc. I'm also a practical person :-) and I often find it easier >> to understand a piece of code if I know some of the reasoning that went >> behind it, or what's the behavior supposed to be. >> >> I do not know, off-hands, whether there is a simpler solution - I was mostly >> probing for war stories of the kind "I tried X and I got Y", and I apologize >> if that came off the wrong way. > >> So, for static methods, it could go down two ways: either we don't even look >> at referenced method bodies, give up and just declare "sorry, escaping". Or, >> if we look into method bodies, and see that the relationship between inner >> and outer parameter is as simple, it's just like assignment again: > > Right - and just to reconfirm, it's only the non-overridable methods in the > same compilation unit that are "invoked" (i.e., analyzed & tracked). Any > method that (might) exist in another compilation unit is off limits, so we > have to assume the worst case and declare a leak if we are passing it any > 'this' references. > >> in order to try and build a mental problem of what the problem that needs >> to be solved is > > Gotcha. > > Let's be clear about what exactly we're worrying about. It's this situation: > You have a class `B extends A`, where `B` and `A` are in different > compilation units, and during the `super()` call that constructor `B()` makes > to its superclass constructor `A()`, some code in class `B` gets executed > (really, it's some field in `B` getting accessed that actually matters) prior > to `A()` returning. > > So the basic goal is to analyze `A` constructors and watch 'this' references > until if/when they go to someplace where we can no longer watch them. At that > point we have to assume that some code in `B` might get invoked and so we > generate a warning. > > OK, so where can a 'this' reference go? > > Well, here are all of the possible places a Java reference can exist: > 1. On the Java stack > (a) The current 'this' instance > (b) A method parameter > (c) A local variable > (d) A temporary value that is part of the current expression being > evaluated > (e) The return value from a method that just returned > (f) A caught exception > 1. In a field of some object... > (a) A normal field > (b) An outer 'this' instance > (c) Other synthetic field (e.g., captured free variable) > 1. As an element in a reference array > 1. In native code as a native reference > > Those are the only possibilities AFAIK. > > So one way to locate yourself on the spectrum from "simple" to "complex" is > to answer this question: Which of those are you going to try to keep track > of, and for each one, how hard are you going to try? > > The `this-escape` analysis being proposed tracks 1(a-e) and 2(b) pretty > closely (it tries hard), and it adds a very course tracking of 2(a-c), and 3 > using the notion of indirect references (it doesn't try very hard). We do not > track 1(f) or 4. > > So to think about the overall problem, imagine how you might or might not > address all of those cases. > * On the Java stack > (a) The current 'this' instance > (b) A method parameter > (c) A local variable > (d) A temporary value that is part of the current expression being evaluated > (e) The return value from a method that just returned > (f) A caught exception > > * In a field of some object... > (a) A normal field > (b) An outer 'this' instance > (c) Other synthetic field (e.g., captured free variable) > > * As an element in a reference array > > * In native code as a native reference Thanks for the classification. This is helpful. I'm not sure what you mean by (1f). You mean `this` can be embedded in an exception being thrown? Is that different from (2)? Also, it seems to me that (3) is a special case of (2) - in the sense that you can imagine a reference array as a big object that has many fields as there are elements - so the same reasoning applies. When looking at (2d) and (2e) the fact that Javac's AST is not in SSA form means that, yes, we need to track _expressions_ not just variables (e.g. for chains of methods calls, ternary operators and such). ------------- PR: https://git.openjdk.org/jdk/pull/11874