I like the idea; it makes more sense to me than our current model (which feels 
more like a plain callback than a continuation to me). Some things that 
occurred to me when reading this:

- This seems like it'll be much simpler to check for invalid concurrent access 
to the same location (inout violations) in debug builds. (I don't think it's 
actually any more or less possible, but it is simpler.)

- This will look funny but work fine for multiple inout parameters: foo(&x, &y)

- OTOH, this does break up basic blocks in the caller context, making LLVM 
analysis less effective. Since the materializeForSet calls were already either 
inlined or opaque, though, we're probably fine there.

- Does this help us with the nested dictionary CoW problem? `foo["bar"]["baz"] 
+= 1`

Jordan



> On Oct 31, 2016, at 12:22, Joe Groff via swift-dev <swift-dev@swift.org> 
> wrote:
> 
> We currently abstract over mutable property accesses using what I’ll call a 
> continuation-based model–the materializeForSet accessor is called before an 
> inout access, and returns a continuation callback that must be called when 
> the inout access is completed. I think a nested-function-based model, if 
> designed correctly, has the potential to be more efficient and easier to 
> implement.
> 
> As background for anyone intimately familiar with the implementation, Swift’s 
> language model for mutable properties requires a reconciliation step after a 
> mutation occurs. For example, in an inout access like this:
> 
> foo(&x.y.z)
> the y or z property may be computed, requiring a getter call before the call 
> to foo and a setter call afterward, or either property may have willSet or 
> didSet observers that fire after foo finishes. If the base x is a class, 
> protocol, or generic value, the implementation of y can`t be statically 
> known, so we need an abstract interface for mutating the property, projecting 
> out the value at the start of the access and committing the mutated value 
> back at the end. Ideally, the interface should accommodate efficient access 
> to both stored and computed properties, avoiding unnecessary copying of 
> stored properties to avoid inducing CoW copies or, in the near future, 
> violating the constraints of move-only types or the borrow model. There are 
> two broad approaches to doing this:
> 
> Continuation-based access
> 
> The accessor that begins the formal access can return a continuation closure 
> to call when the access is completed. This is the approach Swift currently 
> uses with its materializeForSet accessor pattern. A property under this model 
> compiles to something like this C code:
> 
> struct X {
>   // getter for Y
>   Y (*getY)(const X *self);
>   // setter for Y
>   void (*setY)(X *self, Y newValue);
>   // materializeForSet for Y
>   struct MaterializeForSetReturn {
>     // The projected field
>     Y *y;
>     // The function that finishes the access, or null
>     void (*finish)(X *self, void *buffer);
>   }; 
>   struct MaterializeForSetReturn (*materalizeForSettingY)(X *self, void 
> *buffer);
> };
> 
> void callFooOnXDotYDotZ(X *x) {
>   // To compile:
>   //   foo(&x.y.z)
>   // 
>   // - call the materializeForSet accessor to begin the access to y, giving 
> it a buffer
>   //   to store a computed value or capture state to pass to the end of the 
> access
>   char alignas(Y) bufferY[sizeof(Y) + 3*sizeof(void*)];
>   auto materializedY = x->materializeForSettingY(x, bufferY);
>   // - call materializeForSet to begin access to z
>   char alignas(Z) bufferZ[sizeof(Z) + 3*sizeof(void*)];
>   autom materializedZ = 
> materializedY.y->materializeForSettingZ(materializedY.y, bufferZ);
>   // - perform the access
>   foo(materializedZ.z);
>   // - finish the accesses, if we were given a finishing callback
>   if (materializedZ.finish)
>     materializedZ.finish(materializedY.y, bufferZ);
>   if (materializedY.finish)
>     materializedY.finish(x, bufferY);
> }
> A stored property can be exposed through this interface by trivially 
> returning the projected address of the subobject, with no continuation:
> 
> struct MaterializeForSetReturn
> materializeForSettingStoredY(X *self, void *buffer) {
>   return (struct MaterializeForSetReturn){&self->y, NULL};
> }
> whereas a computed property can call the getter, store the computed value 
> into the buffer, and return a continuation function that calls the setter:
> 
> struct MaterializeForSetReturn
> materializeForSettingComputedY(X *self, void *buffer) {
>   Y oldValue = getY(self);
>   memcpy(buffer, &oldValue, sizeof(Y));
>   return (struct MaterializeForSetReturn){(Y *)buffer, 
> finishSettingComputedY};
> }
> void finishSettingComputedY(X *self, void *buffer) {
>   Y newValue;
>   memcpy(&newValue, buffer, sizeof(Y));
>   setY(self, newValue);
> }
> A benefit of this approach is that it maintains the integrity of the source 
> function in the compiled code. On the other hand, in order for the property 
> implementation to transfer state from the start to the completion of the 
> access, the caller must preallocate some stack space; if it’s too much, the 
> space is wasted, and if it’s not enough, the property implementation must use 
> malloc to get more space for itself. (It’s theoretically possible to avoid 
> this with some clever stack pointer accounting.) There’s also a code size 
> impact on the caller, which needs to bracket the inout access with a call on 
> each side. The underlying CPU has to execute three or five branches per 
> segment of the access path (calling and returning from materializeForSet, 
> testing for null, and potentially calling and returning from the continuation 
> if there is one).
> 
> Nested functions
> 
> Alternatively, we can implement the formal access pattern as a series of 
> nested functions. A property under this model compiles to something like this 
> C code:
> 
> struct X {
>   // getter for Y
>   Y (*getY)(const X *self);
>   // setter for Y
>   void (*setY)(X *self, Y newValue);
>   // mutator for Y
>   void *(*mutateY)(X *self, void *context, void *(*innerMutation)(Y *y, void 
> *context));
> };
> 
> void callFooOnXDotYDotZ(X *x) {
>   // To compile:
>   //   foo(&x.y)
>   // - call the mutate accessor for y, with the inner access as a
>   //   function whose pointer gets passed in
>   x->mutateY(x, NULL, callFooOnXDotY_inner_1);
> }
> void callFooOnXDotYDotZ_inner_1(Y *y, void *context) {
>   // - call the mutate accessor for z
>   y->mutateZ(y, context, callFooOnXDotY_inner_2);
> }
> void callFooOnXDotYDotZ_inner_2(Z *z, void *context) {
>   foo(z);
> }
> A stored property can implement mutate by projecting the physical address of 
> the subobject, tail-calling the subsequent inner function:
> 
> void *mutateStoredY(X *self, void *context, void *(*innerMutation)(Y *y, void 
> *context)) {
>   /*tail*/ return innerMutation(&self->y, context);
> }
> and a computed property can bracket the inner mutation in getter and setter 
> calls:
> 
> void *mutateComputedY(X *self, void *context, void *(*innerMutation)(Y *y, 
> void *context)) {
>   Y value = getY(self);
>   void *result = innerMutation(&value, context);
>   setY(self, value);
>   return result;
> }
> This makes things easier for the property implementation, which can easily 
> save as much state as it needs to on the stack instead of relying on a 
> preallocated buffer handed down from the caller. There’s one fewer branch 
> necessary per segment of the access path compared to the continuation-based 
> model, either three if the accessor has no reconciliation necessary and can 
> tail-call the inner function, or four if it does a normal call. On the other 
> hand, the caller function has to be broken up into many smaller subfunctions, 
> potentially one for every segment of an access path.
> 
> Letting nested functions walk the access path
> 
> We can reduce the number of nested functions that are needed for a nested 
> access path by designing a convention that allows each mutator to directly 
> call the mutator for the next segment of access. For example, the caller 
> could construct the entire access path as an array of pointers to mutation 
> functions:
> 
> typedef void *(*MutatorFunction)(void *base, MutatorFunction *accessPath, 
> void *context);
> void callFooOnXDotYDotZDotW(X *x) {
>   // foo(&x.y.z.w)
>   MutatorFunction accessPath[] = {
>     (MutatorFunction)mutateZ,
>     (MutatorFunction)mutateW,
>     (MutatorFunction)callFooOnXDotYDotZDotW_inner
>   };
>   mutateY(x, accessPath, NULL);
> }
> Each property accessor function then loads the next step of the access out of 
> the path, and advances the pointer forward for the nested access:
> 
> void *mutateY(X *self, MutatorFunction *accessPath, void *context) {
>   // Project out y
>   Y *y = self->y;
>   // Call the next step of the access path, passing the remainder of the 
> access path
>   return (*accessPath)(y, accessPath + 1, context);
> }
> And the inner function at the end of the access path ignores the access path 
> parameter and performs the access:
> 
> void *callFooOnXDotYDotZDotW_inner(W *w, MutatorFunction *_ignored, void 
> *context) {
>   foo(w);
> }
> In this model, only one inner function needs to be broken out of the caller 
> for the innermost inout access, at least for simple property projections. 
> There are only two branches needed per segment, or one for segments that can 
> tail-call the inner access. Parameterized segments such as subscripts can be 
> supported by storing the parameters in the access path buffer as well, and 
> having the subscript accessor advance past the parameters when it consumes 
> them. For example:
> 
> void callFooOnXDotYSubIDotZ(X *x, intptr_t i) {
>   // foo(&x.y[i].z)
>   uintptr_t accessPath[] = {
>     (uintptr_t)mutateYSubscript,
>     (uintptr_t)i,
>     (uintptr_t)mutateZ,
>     (uintptr_t)callFooOnXDotYSubIDotZ_inner
>   };
>   
>   mutateY(x, accessPath, NULL);
> }
> 
> void *mutateYSubscript(Y *y, uintptr_t *accessPath, void *context) {
>   // Get the subscript index out of the access path
>   intptr_t i = (intptr_t)*accessPath++;
>   // Get the next step of the access out of the access path
>   MutatorFunction next = (MutatorFunction)*accessPath++;
>   
>   return next(&y->elements[i], accessPath, context);
> }
> On the other hand, this does push the inner access functions and subscript 
> arguments onto the stack, which is a small amount of overhead. Furthermore, 
> the caller function has to push any state it needs to pass from outside the 
> access to inside on the stack as well, if it isn’t able to cram it otherwise 
> into a single context pointer argument.
> 
> Machine-level calling convention tricks
> 
> To be able to feed a return value from the inner access back to the outer 
> caller, nested mutators would need to preserve all of the return registers 
> potentially used by the inner access on the way out of the access. To 
> minimize the impact on the outer caller, the convention for mutators could 
> also specify that they need to preserve the contents of callee-save registers 
> when they call down to nested accesses. This should allow as much state to be 
> kept in registers between the outer caller and inner access as if they were 
> naturally in the same function, with at most a function call in between.
> 
> With these sorts of optimizations, I think the nested function approach has 
> the potential to be competitive with, or even more efficient than, our 
> current continuation-based materializeForSet design.
> 
> -Joe
> _______________________________________________
> swift-dev mailing list
> swift-dev@swift.org
> https://lists.swift.org/mailman/listinfo/swift-dev

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Reply via email to