Re: [swift-dev] Continuation-based versus nested-function-based materializeForSet

Slava Pestov via swift-dev Tue, 01 Nov 2016 21:24:07 -0700

> On Nov 1, 2016, at 11:00 AM, Jordan Rose via swift-dev <swift-dev@swift.org> 
> wrote:
> 
> - Does this help us with the nested dictionary CoW problem? 
> `foo["bar"]["baz"] += 1`


My understanding is that this problem arises because we don’t have ‘optional 
addressors’. A dictionary lookup might return nil. If addressors had a way to 
return an optional wrapping a pointer, and optional evaluation supported this, 
we could do in-place updates of this kind. For Apple people: I have a radar 
that I think describes this problem (rdar://17509619).

Slava

> 
> Jordan
> 
> 
> 
>> On Oct 31, 2016, at 12:22, Joe Groff via swift-dev <swift-dev@swift.org 
>> <mailto:swift-dev@swift.org>> wrote:
>> 
>> We currently abstract over mutable property accesses using what I’ll call a 
>> continuation-based model–the materializeForSet accessor is called before an 
>> inout access, and returns a continuation callback that must be called when 
>> the inout access is completed. I think a nested-function-based model, if 
>> designed correctly, has the potential to be more efficient and easier to 
>> implement.
>> 
>> As background for anyone intimately familiar with the implementation, 
>> Swift’s language model for mutable properties requires a reconciliation step 
>> after a mutation occurs. For example, in an inout access like this:
>> 
>> foo(&x.y.z)
>> the y or z property may be computed, requiring a getter call before the call 
>> to foo and a setter call afterward, or either property may have willSet or 
>> didSet observers that fire after foo finishes. If the base x is a class, 
>> protocol, or generic value, the implementation of y can`t be statically 
>> known, so we need an abstract interface for mutating the property, 
>> projecting out the value at the start of the access and committing the 
>> mutated value back at the end. Ideally, the interface should accommodate 
>> efficient access to both stored and computed properties, avoiding 
>> unnecessary copying of stored properties to avoid inducing CoW copies or, in 
>> the near future, violating the constraints of move-only types or the borrow 
>> model. There are two broad approaches to doing this:
>> 
>> Continuation-based access
>> 
>> The accessor that begins the formal access can return a continuation closure 
>> to call when the access is completed. This is the approach Swift currently 
>> uses with its materializeForSet accessor pattern. A property under this 
>> model compiles to something like this C code:
>> 
>> struct X {
>>   // getter for Y
>>   Y (*getY)(const X *self);
>>   // setter for Y
>>   void (*setY)(X *self, Y newValue);
>>   // materializeForSet for Y
>>   struct MaterializeForSetReturn {
>>     // The projected field
>>     Y *y;
>>     // The function that finishes the access, or null
>>     void (*finish)(X *self, void *buffer);
>>   }; 
>>   struct MaterializeForSetReturn (*materalizeForSettingY)(X *self, void 
>> *buffer);
>> };
>> 
>> void callFooOnXDotYDotZ(X *x) {
>>   // To compile:
>>   //   foo(&x.y.z)
>>   // 
>>   // - call the materializeForSet accessor to begin the access to y, giving 
>> it a buffer
>>   //   to store a computed value or capture state to pass to the end of the 
>> access
>>   char alignas(Y) bufferY[sizeof(Y) + 3*sizeof(void*)];
>>   auto materializedY = x->materializeForSettingY(x, bufferY);
>>   // - call materializeForSet to begin access to z
>>   char alignas(Z) bufferZ[sizeof(Z) + 3*sizeof(void*)];
>>   autom materializedZ = 
>> materializedY.y->materializeForSettingZ(materializedY.y, bufferZ);
>>   // - perform the access
>>   foo(materializedZ.z);
>>   // - finish the accesses, if we were given a finishing callback
>>   if (materializedZ.finish)
>>     materializedZ.finish(materializedY.y, bufferZ);
>>   if (materializedY.finish)
>>     materializedY.finish(x, bufferY);
>> }
>> A stored property can be exposed through this interface by trivially 
>> returning the projected address of the subobject, with no continuation:
>> 
>> struct MaterializeForSetReturn
>> materializeForSettingStoredY(X *self, void *buffer) {
>>   return (struct MaterializeForSetReturn){&self->y, NULL};
>> }
>> whereas a computed property can call the getter, store the computed value 
>> into the buffer, and return a continuation function that calls the setter:
>> 
>> struct MaterializeForSetReturn
>> materializeForSettingComputedY(X *self, void *buffer) {
>>   Y oldValue = getY(self);
>>   memcpy(buffer, &oldValue, sizeof(Y));
>>   return (struct MaterializeForSetReturn){(Y *)buffer, 
>> finishSettingComputedY};
>> }
>> void finishSettingComputedY(X *self, void *buffer) {
>>   Y newValue;
>>   memcpy(&newValue, buffer, sizeof(Y));
>>   setY(self, newValue);
>> }
>> A benefit of this approach is that it maintains the integrity of the source 
>> function in the compiled code. On the other hand, in order for the property 
>> implementation to transfer state from the start to the completion of the 
>> access, the caller must preallocate some stack space; if it’s too much, the 
>> space is wasted, and if it’s not enough, the property implementation must 
>> use malloc to get more space for itself. (It’s theoretically possible to 
>> avoid this with some clever stack pointer accounting.) There’s also a code 
>> size impact on the caller, which needs to bracket the inout access with a 
>> call on each side. The underlying CPU has to execute three or five branches 
>> per segment of the access path (calling and returning from 
>> materializeForSet, testing for null, and potentially calling and returning 
>> from the continuation if there is one).
>> 
>> Nested functions
>> 
>> Alternatively, we can implement the formal access pattern as a series of 
>> nested functions. A property under this model compiles to something like 
>> this C code:
>> 
>> struct X {
>>   // getter for Y
>>   Y (*getY)(const X *self);
>>   // setter for Y
>>   void (*setY)(X *self, Y newValue);
>>   // mutator for Y
>>   void *(*mutateY)(X *self, void *context, void *(*innerMutation)(Y *y, void 
>> *context));
>> };
>> 
>> void callFooOnXDotYDotZ(X *x) {
>>   // To compile:
>>   //   foo(&x.y)
>>   // - call the mutate accessor for y, with the inner access as a
>>   //   function whose pointer gets passed in
>>   x->mutateY(x, NULL, callFooOnXDotY_inner_1);
>> }
>> void callFooOnXDotYDotZ_inner_1(Y *y, void *context) {
>>   // - call the mutate accessor for z
>>   y->mutateZ(y, context, callFooOnXDotY_inner_2);
>> }
>> void callFooOnXDotYDotZ_inner_2(Z *z, void *context) {
>>   foo(z);
>> }
>> A stored property can implement mutate by projecting the physical address of 
>> the subobject, tail-calling the subsequent inner function:
>> 
>> void *mutateStoredY(X *self, void *context, void *(*innerMutation)(Y *y, 
>> void *context)) {
>>   /*tail*/ return innerMutation(&self->y, context);
>> }
>> and a computed property can bracket the inner mutation in getter and setter 
>> calls:
>> 
>> void *mutateComputedY(X *self, void *context, void *(*innerMutation)(Y *y, 
>> void *context)) {
>>   Y value = getY(self);
>>   void *result = innerMutation(&value, context);
>>   setY(self, value);
>>   return result;
>> }
>> This makes things easier for the property implementation, which can easily 
>> save as much state as it needs to on the stack instead of relying on a 
>> preallocated buffer handed down from the caller. There’s one fewer branch 
>> necessary per segment of the access path compared to the continuation-based 
>> model, either three if the accessor has no reconciliation necessary and can 
>> tail-call the inner function, or four if it does a normal call. On the other 
>> hand, the caller function has to be broken up into many smaller 
>> subfunctions, potentially one for every segment of an access path.
>> 
>> Letting nested functions walk the access path
>> 
>> We can reduce the number of nested functions that are needed for a nested 
>> access path by designing a convention that allows each mutator to directly 
>> call the mutator for the next segment of access. For example, the caller 
>> could construct the entire access path as an array of pointers to mutation 
>> functions:
>> 
>> typedef void *(*MutatorFunction)(void *base, MutatorFunction *accessPath, 
>> void *context);
>> void callFooOnXDotYDotZDotW(X *x) {
>>   // foo(&x.y.z.w)
>>   MutatorFunction accessPath[] = {
>>     (MutatorFunction)mutateZ,
>>     (MutatorFunction)mutateW,
>>     (MutatorFunction)callFooOnXDotYDotZDotW_inner
>>   };
>>   mutateY(x, accessPath, NULL);
>> }
>> Each property accessor function then loads the next step of the access out 
>> of the path, and advances the pointer forward for the nested access:
>> 
>> void *mutateY(X *self, MutatorFunction *accessPath, void *context) {
>>   // Project out y
>>   Y *y = self->y;
>>   // Call the next step of the access path, passing the remainder of the 
>> access path
>>   return (*accessPath)(y, accessPath + 1, context);
>> }
>> And the inner function at the end of the access path ignores the access path 
>> parameter and performs the access:
>> 
>> void *callFooOnXDotYDotZDotW_inner(W *w, MutatorFunction *_ignored, void 
>> *context) {
>>   foo(w);
>> }
>> In this model, only one inner function needs to be broken out of the caller 
>> for the innermost inout access, at least for simple property projections. 
>> There are only two branches needed per segment, or one for segments that can 
>> tail-call the inner access. Parameterized segments such as subscripts can be 
>> supported by storing the parameters in the access path buffer as well, and 
>> having the subscript accessor advance past the parameters when it consumes 
>> them. For example:
>> 
>> void callFooOnXDotYSubIDotZ(X *x, intptr_t i) {
>>   // foo(&x.y[i].z)
>>   uintptr_t accessPath[] = {
>>     (uintptr_t)mutateYSubscript,
>>     (uintptr_t)i,
>>     (uintptr_t)mutateZ,
>>     (uintptr_t)callFooOnXDotYSubIDotZ_inner
>>   };
>>   
>>   mutateY(x, accessPath, NULL);
>> }
>> 
>> void *mutateYSubscript(Y *y, uintptr_t *accessPath, void *context) {
>>   // Get the subscript index out of the access path
>>   intptr_t i = (intptr_t)*accessPath++;
>>   // Get the next step of the access out of the access path
>>   MutatorFunction next = (MutatorFunction)*accessPath++;
>>   
>>   return next(&y->elements[i], accessPath, context);
>> }
>> On the other hand, this does push the inner access functions and subscript 
>> arguments onto the stack, which is a small amount of overhead. Furthermore, 
>> the caller function has to push any state it needs to pass from outside the 
>> access to inside on the stack as well, if it isn’t able to cram it otherwise 
>> into a single context pointer argument.
>> 
>> Machine-level calling convention tricks
>> 
>> To be able to feed a return value from the inner access back to the outer 
>> caller, nested mutators would need to preserve all of the return registers 
>> potentially used by the inner access on the way out of the access. To 
>> minimize the impact on the outer caller, the convention for mutators could 
>> also specify that they need to preserve the contents of callee-save 
>> registers when they call down to nested accesses. This should allow as much 
>> state to be kept in registers between the outer caller and inner access as 
>> if they were naturally in the same function, with at most a function call in 
>> between.
>> 
>> With these sorts of optimizations, I think the nested function approach has 
>> the potential to be competitive with, or even more efficient than, our 
>> current continuation-based materializeForSet design.
>> 
>> -Joe
>> _______________________________________________
>> swift-dev mailing list
>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-dev
> 
> _______________________________________________
> swift-dev mailing list
> swift-dev@swift.org
> https://lists.swift.org/mailman/listinfo/swift-dev

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Re: [swift-dev] Continuation-based versus nested-function-based materializeForSet

Reply via email to