> On Oct 3, 2016, at 1:37 PM, Andrew Trick <atr...@apple.com> wrote:
>> On Oct 3, 2016, at 12:10 PM, Alexis via swift-dev <swift-dev@swift.org> wrote:
>> 
>> When I first started reading this proposal, my primary objection was going 
>> to be that SSA doesn’t seem to jibe well with the idea of values 
>> becoming (in)valid at some point in the future (due to moves). You can see 
>> this in the definite-init pass, which works entirely with addresses to 
>> handle the idea of a value that becomes assigned “eventually”. But if SIL’s 
>> notion of SSA can be extended to handle these problems, this sounds great!
>> 
>> It’s not clear to me how this would work though. How does an optimization 
>> pass reason about an SSA register becoming invalid because it was moved out 
>> of? Or rather: in what ways do the interesting properties of SSA survive 
>> passes needing to handle this? Is this a standard extension that’s been 
>> researched/implemented before?
> 
> I think that’s why John claimed that we need to enforce “pseudo-linearity” 
> for SIL values. Moving out of an SSA value must be the last use of that value. 
> SIL will enforce this single-consumer property throughout.

Precisely.  SSA is already subject to a number of non-trivial formation rules, 
such as the rule that you can only refer to dominating values.  And SIL already 
adds a number of rules to that, such as the rule that alloc_stacks have to be 
statically nested.  I don't think pseudo-linearity is a problematic extension 
to that set of rules.
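
For example, with copy_value as proposed and a hypothetical move instruction,
the single-consumer rule makes the last line below ill-formed:

  %x = copy_value %a
  %y = move %x          // consumes %x; this must be the last use of %x
  %z = copy_value %x    // ill-formed under pseudo-linearity: %x was consumed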

It is, of course, an expressivity limitation on SIL — there will always be 
cases where you could *prove* correctness but it's not *structurally valid*.  
But that's true of the SSA dominance restriction as well.

John.

> 
> -Andy
> 
>> 
>>> On Oct 1, 2016, at 4:32 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:
>>> 
>>> Andy Trick and I had this conversation Thursday, and I thought I'd capture 
>>> it here.
>>> 
>>> The Problem
>>> 
>>> It's a longstanding complaint that SIL uses drastically different code 
>>> patterns for the same sequence of operations based on whether the types of 
>>> the values involved are loadable or "address-only".  For example, suppose I 
>>> have Swift code like this:
>>>   let x, y : T
>>>   let z = x + y
>>> 
>>> If T is a loadable type, this will generate SIL somewhat like this:
>>>   // %x and %y are values of type $T
>>>   %lhs = copy_value %x
>>>   %rhs = copy_value %y
>>>   %operator = function_ref T.+
>>>   %result = apply %operator(%lhs, %rhs)
>>>   %z = %result
>>> 
>>> (copy_value doesn't currently exist, but it's easier to understand, and as 
>>> it happens we're thinking of adding it back.)
>>> 
>>> If T is an address-only type, this will generate SIL somewhat like this:
>>>   // %x and %y are values of type $*T
>>>   %z = alloc_stack $T
>>>   %lhs = alloc_stack $T
>>>   copy_addr %x to [initialization] %lhs
>>>   %rhs = alloc_stack $T
>>>   copy_addr %y to [initialization] %rhs
>>>   %operator = function_ref T.+
>>>   apply %operator(%z, %lhs, %rhs)
>>>   dealloc_stack %rhs
>>>   dealloc_stack %lhs
>>> 
>>> Notably, we're explicitly modeling the memory in which values are stored, 
>>> which is both very verbose and — more importantly — loses any interesting 
>>> SSA properties for tracking actual values around.  And this has a bunch of 
>>> secondary effects where various high-level operations like dynamic casts 
>>> end up needing two different instructions based on whether the value is 
>>> stored in memory.  This is pretty dumb, and it's a major contributor to the 
>>> reality that generic code is currently very poorly optimized.
>>> 
>>> It does, however, have some significant advantages: since the memory 
>>> allocation is explicit, it's quite natural to express optimizations that 
>>> e.g. hoist or sink those allocations, and the next level of lowering 
>>> (IRGen) can be very simple.
>>> 
>>> Addresses and address-only types
>>> 
>>> Swift is an imperative language, and imperative languages make formal 
>>> memory locations part of the high-level semantics.  DI and SSA formation 
>>> allow us to eliminate formal memory locations for most local variables, but 
>>> class properties, global variables, escaped local variables, and pointers 
>>> are all fundamentally un-SSA-able.  Therefore we will always have some 
>>> concept of an address.
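>>> 
>>> For example, a local variable that escapes into a closure needs a real 
>>> memory location no matter what (Swift-level illustration):
>>> 
>>>   func counter() -> () -> Int {
>>>     var n = 0                      // captured and mutated by the closure,
>>>     return { n += 1; return n }    // so it cannot be SSA'd away
>>>   }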
>>> 
>>> But most values of address-only type aren't really being stored in that 
>>> sort of formal memory location; we're just representing them that way in 
>>> SIL.  Why do we do this?  What is it about a type that makes it 
>>> address-only?
>>> 
>>> Well, some types are inherently "memory-pinned": something about their 
>>> representation only makes sense, or is only implementable, if the value is 
>>> always kept in memory:
>>>   - The representation of the value may involve interior pointers, as with 
>>> LLVM's SmallVector.  This isn't currently a thing in Swift, but it's a 
>>> possibility someday with the import of non-POD C++ types.
>>>   - The address of the value may need to be registered elsewhere, as with 
>>> weak references.
>>>   - The value may allow internal mutation even from a shared reference, 
>>> like a C++ class with a mutable field or a Rust atomic type; you can see 
>>> weak references as analogous to this.
>>> Such types are necessarily address-only at a low level.
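>>> 
>>> Sketching the weak-reference case at the Swift level:
>>> 
>>>   class Target {}
>>>   struct Observer {
>>>     weak var target: Target?   // per the rule above, the address of this
>>>                                 // reference may be registered with the runtime
>>>   }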
>>> 
>>> Other types are address-only by abstraction: their representation isn't 
>>> (fully) known, and so they have to be kept in memory (1) in case that 
>>> representation includes another address-only value and (2) because there's 
>>> no way to store an unbounded amount of data except in memory anyway.
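>>> 
>>> For example, inside a generic function the layout of T is unknown, so its 
>>> values have to live in memory (Swift-level illustration):
>>> 
>>>   func firstOf<T>(_ a: T, _ b: T) -> T {
>>>     return a    // T's size and layout are unknown here,
>>>   }             // so a and b are passed and stored indirectly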
>>> 
>>> But this sense of address-only types that we're describing is really a 
>>> property of the final generated code.  It's necessary for LLVM IR 
>>> generation to know about it, but it's not obviously necessary for SIL to 
>>> know about it beyond the implications of the inherently memory-pinned cases 
>>> above:
>>>   - it is not generally safe to assume that the stored properties of a 
>>> non-POD C++ type remain invariant across moves
>>>   - weak references *do* represent the same value across moves, but of 
>>> course that value can change dynamically at any time anyway, per the rules 
>>> of weak references
>>>   - mutable fields can be modified even by a shared borrowed reference, and 
>>> so (if they are modeled in SIL at all, rather than just leaving the 
>>> type opaque) there must be some way to project a mutable address from a 
>>> shared borrow, and so on.
>>> 
>>> Our current representation of address-only types arises directly from the 
>>> low-level operations that are required, which does admit some interesting 
>>> optimizations on its own, but the disadvantages of having to support wildly 
>>> divergent code paths and completely give up SSA use/def chains are 
>>> crippling.  What would SIL look like if we abandoned this entirely?
>>> 
>>> All types as SIL scalars
>>> 
>>> The fundamental issue here is that IR-level lowering does need to place 
>>> memory-pinned values into memory.  Suppose we take a value, use it as an 
>>> enum case payload, inject that enum into an Any, and pass that to a 
>>> function.  In a future world where we consistently SSA all local values, 
>>> this ideally looks something like this:
>>> 
>>>   // %value is a $T
>>>   %enum = enum #MyEnum.foo, %value : $T
>>>   %any = existential $Any, %enum
>>>   %fn = function_ref @bar
>>>   apply %fn(%any)
>>> 
>>> If all of these types were loadable, lowering this to IR wouldn't impose any 
>>> abstraction costs, because we could just manipulate it as a bit-pattern in 
>>> memory, which LLVM (really, any reasonable compiler infrastructure) should 
>>> be quite good at analyzing and optimizing.  But if all of these types are 
>>> address-only, that's not obviously true.  Let's look at the details of 
>>> lowering to memory.
>>> 
>>> Because all types are movable — even in a world with Rust-style ownership — 
>>> there's a fairly straightforward lowering that works for all possible 
>>> functions: every address-only SIL value gets its own allocation which is 
>>> deallocated along the dominance frontier of the value.  Indirect arguments 
>>> use the passed-in buffer.  Basic block arguments are copied from the 
>>> allocation for the branch argument value to the allocation for the 
>>> corresponding destination block parameter.  Indirect return values are 
>>> copied from the allocation for the return value to the pointer passed in.
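>>> 
>>> Concretely, the naive lowering of the earlier example would give each value 
>>> its own slot and copy between them (a sketch; the address-forms of enum and 
>>> existential here are illustrative, not proposed instructions):
>>> 
>>>   %enum.addr = alloc_stack $MyEnum
>>>   enum_addr #MyEnum.foo, %value.addr to %enum.addr  // %enum gets its own slot
>>>   %any.addr = alloc_stack $Any
>>>   existential_addr $Any, %enum.addr to %any.addr    // copied into %any's slot
>>>   %fn = function_ref @bar
>>>   apply %fn(%any.addr)
>>>   dealloc_stack %any.addr
>>>   dealloc_stack %enum.addr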
>>> 
>>> This admits some obvious optimizations.  If "owned" SIL values are 
>>> pseudo-linear — i.e. along every path from their definition point, it is 
>>> statically evident that they are (optionally) borrowed multiple times, 
>>> consumed exactly once, and then never used again — then the copies can 
>>> instead be moves.  Standard register-allocation algorithms can recognize 
>>> when basic block parameters can use the same location as their arguments.  
>>> SIL only permits a single return instruction, so there's no reason not to 
>>> allocate returned values directly into the return slot.  These 
>>> optimizations will tend to eliminate a lot of moves.
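>>> 
>>> For example, when the branch is the last use of its argument, the 
>>> destination parameter can simply reuse the argument's slot (a sketch):
>>> 
>>>   br cont(%v : $T)      // %v is consumed by the branch, so...
>>> cont(%param : $T):
>>>   ...                   // ...%param can be allocated into %v's slot: no copy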
>>> 
>>> However, there's another problem, which is injections.  Consider the 
>>> example above.  %any is an opaque existential, and initializing it formally 
>>> involves allocating a buffer within the existential and moving the argument 
>>> into that buffer.  Ideally, we would allocate that buffer first and then 
>>> simply allocate %enum in-place into that buffer.  In this simple example, 
>>> that's easy.  But if the control flow were more complex, detecting that 
>>> this is possible becomes significantly more challenging, as does ensuring 
>>> that the buffer is properly cleaned up along all paths.  For example, 
>>> suppose that %value were computed by calling a throwing function:
>>> 
>>>   try_apply %someFunction() normal cont, unwind handler
>>> cont(%value : $T):
>>>   %enum = enum #MyEnum.foo, %value : $T
>>>   %any = existential $Any, %enum
>>>   %fn = function_ref @bar
>>>   apply %fn(%any)
>>> handler(%error : $Error):
>>>   throw %error
>>> 
>>> Naive allocation here is going to introduce a lot of moves.  Optimally, we 
>>> would receive the return value from %someFunction directly in the payload 
>>> of %enum, which we want to build directly into the allocated existential 
>>> buffer of %any.  But to do this, we actually need to allocate that 
>>> existential buffer before executing the try_apply; and if the try_apply 
>>> throws, we need to deallocate that existential buffer in the handler block. 
>>>  The need to retroactively insert this kind of clean-up code adds a lot of 
>>> complexity to this allocation approach.  Moreover, it's quite possible that 
>>> complex intermediate control flow — for example, if there's a loop somewhere 
>>> between the definition of a value and its consuming use — will tend to 
>>> block this kind of analysis and cause more unnecessary moves.
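>>> 
>>> To make that concrete, the optimal lowering of the example above would look 
>>> something like this (a sketch; the existential-buffer instructions are 
>>> illustrative):
>>> 
>>>   %any.addr = alloc_existential $Any          // allocated *before* the call
>>>   %payload.addr = enum_payload_addr %any.addr, #MyEnum.foo
>>>   try_apply %someFunction(%payload.addr) normal cont, unwind handler
>>> cont:
>>>   %fn = function_ref @bar
>>>   apply %fn(%any.addr)
>>> handler(%error : $Error):
>>>   dealloc_existential %any.addr               // the retroactively inserted cleanup
>>>   throw %error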
>>> 
>>> In contrast, the current SIL-generation scheme tends to be aware of at 
>>> least the immediate local uses of values and therefore emits a lot of these 
>>> kinds of initialization "in-place" already.  But that said, it does proceed 
>>> from a simple recursive examination of the AST, which means it will miss 
>>> examples that are just slightly more opaque, like binding a return value as 
>>> a let and only then returning it.  That kind of thing is currently left to 
>>> the optimizer for no particularly good reason.
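>>> 
>>> For example (Swift-level):
>>> 
>>>   func make<T>(_ build: () -> T) -> T {
>>>     let result = build()   // currently emitted into a temporary,
>>>     return result          // then copied into the return slot
>>>   }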
>>> 
>>> Summary
>>> 
>>> Overall, I think representing local address-only values as SIL scalars with 
>>> full SSA use/def chains is really promising, and I do think we can write an 
>>> allocation pass that does an acceptable job eliminating unnecessary moves.  
>>> In order to actually do this, though, I think we need two things:
>>>   - We need SIL to be "pseudo-linear" as discussed above.  We really don't 
>>> want the allocation pass to have to worry about keeping values alive past 
>>> consumptions, and thus potentially having to insert copies instead of moves.
>>>   - We need the allocation pass to handle exits before initialization (like 
>>> with try_apply) and other sorts of interfering control flow.  It will not 
>>> be acceptable for this to be a block-local analysis with only a trivial 
>>> amount of cross-block argument merging.
>>> 
>>> John.