HI, On Mon, 2012-07-23 at 16:58 +0200, OBones wrote: > Jonas Maebe wrote: > > On 23 Jul 2012, at 10:58, OBones wrote: > > > >> leledumbo wrote: > >>> I look at the generated code and in the direct one there's additional > >>> overhead of decrementing the reference counter on each iteration. > >> I see it too now (I forgot about the -a option). > >> I can understand why there is a call to the decrementer outside the loop > >> when using the variable, but I don't understand why it pushes it > >> completely out of the loop. I mean, isn't there a reference count problem > >> here? > > Reference counted data types are returned by reference in a location passed > > to the function by the caller. The compiler here passes the address of S to > > the function, so that when assigning something to the function result > > inside the function, S' reference count gets decreased. > > > > The fact that when the result is returned in a temp, this temp also gets > > finalized on the caller side before the call is just a code generator > > inefficiency. I've changed that in trunk. > Thanks for the explanation and the fix. > > Because the finalization happened too early, those memory allocations > and deallocations were very costly and I found the direct code to be 30 > times slower. > Doing it this way has the advantage of being inherently thread safe, but > considering the performance penalty, I have moved to doing it this way: > > var > Buffer: array [0..1024*1024-1] of WideChar; > BufferLock: NativeInt; > > function GetSomeString(Index: Integer): UnicodeString; > begin > while BufferLock > 0 do > Sleep(1); > > InterlockedIncrement(BufferLock); > try > CallToAnAPIThatWritesBackAWideString(@Buffer[0], Length(Buffer) - 1); > Result := PWideChar(@Buffer[0]); > finally > InterlockedDecrement(BufferLock); > end; > end; > > This works, with an equivalent penalty on both methods and is threadsafe > (I believe).
This code is not thread safe at all. A thread switch after the while loop and before the increment will not prevent progress on other threads, so multiple threads can enter the "critical section". Use EnterCriticalSection/LeaveCriticalSection from the RTL if you need that. If you could somewhat decrease the size of your buffer to something reasonable - do you really expect to translate a string with 1M chars? - use the stack. Actually even a 2M data object on the stack might be reasonable. Further, in any case you'll need some code that works in "all" cases anyway - what if your string does exceed 1M chars after all? I mean, on a 1M string, heap allocation will probably be the least of your performance worries. This will completely avoid use of any synchronization primitive. E.g. function GetSomeString(index : integer) : UnicodeString; const bufsize = 1024; // as you wish var buffer : array[0..bufsize-1] of WideChar; pbuffer : PWideChar; islongstring : boolean; lengthofstring : integer; begin lengthofstring := length(stringfromindex(integer)); islongstring := lengthofstring > bufsize; if (islongstring) then pbuffer := getmem(lengthofstring * sizeof(WideChar)); else pbuffer := @buffer; CallToAnAPIThatWritesBackAWideString(stringfromindex(integer), pbuffer, lengthofstring); if (islongstring) then freemem(pbuffer); result := pbuffer; end; You may want to improve the code further by e.g. checking whether the memory allocation has been successful or not; or factor out the case when the string is big into a separate method to improve readability. Thomas _______________________________________________ fpc-pascal maillist - fpc-pascal@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-pascal