On Wed, 17 Sep 2008, Mindaugas Kavaliauskas wrote:

Hi,

> I've spent some time (well not much, an hour...) to find out how tls works 
> on Windows.
> The original idea is based on undocumented (but de facto) fs segment. In 
> Win32 both 9x and NT fs segment register points to Win32 Thread information 
> block (TIB). Structure is defined at 
> http://en.wikipedia.org/wiki/Win32_Thread_Information_Block
> TIB could be used to access some thread specific data (including TLS 
> values) as an alternative to Win32 API calls. For example a single asm 
> instruction
>    mov eax,fs:[24h]
> could be used instead of the GetCurrentThreadId Win32 API call.
> To allow TIB extensions (or for other reasons), the TIB should not be 
> accessed directly through the fs segment. Instead, the address of the TIB 
> is read from fs:[18h], and this address is used to access the TIB through 
> the common data segment register ds. So GetCurrentThreadId in asm could be 
> implemented as:
>    mov eax,fs:[18h]
>    mov eax,[eax+24h]
>    ret
> Actually this is the exact code of GetCurrentThreadId from kernel32.dll.
> BCC's TLS is implemented as follows:
>   int __thread  a;
> would be compiled to:
>   call ___GetTls
>   mov  eax,[eax+some_offset]
> The function __GetTls itself is:
>   mov  eax,[tls_index]
>   mov  edx,fs:[2Ch]
>   mov  eax,[edx+eax*4]
>   ret
> Win32 API's TlsGetValue() is a little less optimal. It creates a stack frame 
> (push bp, etc.), checks that the tls_index value does not exceed the maximum 
> allowed value, and clears the last error value. This adds 7 more CPU 
> instructions.

Thank you very much for this information. I didn't know details of MS-Win
TLS implementation.
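
To make the quoted comparison concrete, here is a minimal portable C model
of the two lookup styles: the direct slot read done by __GetTls and the
checked read described for TlsGetValue(). It is only a sketch. ModelTib,
MODEL_TLS_SLOTS and all model_* names are hypothetical, the struct is not
the real TIB layout, and a plain global stands in for what the real code
reads through the fs segment.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_TLS_SLOTS 64   /* hypothetical slot count, chosen for this model only */

/* Simplified stand-in for the per-thread block; NOT the real Win32 TIB layout. */
typedef struct {
    void **tls_slots;          /* plays the role of the pointer read from fs:[2Ch] */
    unsigned long last_error;  /* plays the role of the thread's last-error value */
} ModelTib;

static void *model_slots[MODEL_TLS_SLOTS];
static ModelTib model_tib = { model_slots, 0 };

/* Stand-in for "the current thread's block"; real code gets it through fs. */
static ModelTib *model_current_tib(void)
{
    return &model_tib;
}

/* Store a value in a slot (assumes the caller passes a valid index). */
static void model_set_tls(unsigned index, void *value)
{
    assert(index < MODEL_TLS_SLOTS);
    model_current_tib()->tls_slots[index] = value;
}

/* Direct lookup, shaped like the quoted __GetTls: slots[index] and nothing else. */
static void *model_get_tls_direct(unsigned index)
{
    return model_current_tib()->tls_slots[index];
}

/* Checked lookup, shaped like the quoted description of TlsGetValue():
   it validates the index and resets the last-error value before reading. */
static void *model_get_tls_checked(unsigned index)
{
    ModelTib *tib = model_current_tib();
    if (index >= MODEL_TLS_SLOTS) {
        tib->last_error = 1;   /* arbitrary nonzero code used by this model */
        return NULL;
    }
    tib->last_error = 0;
    return tib->tls_slots[index];
}
```

Both readers return the same pointer for a valid index; the checked one only
adds the range test and the last-error reset, which matches the handful of
extra instructions mentioned in the quoted text.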

> These few additional instructions should not cause a big overhead, but my 
> test shows a big speed difference. speedtst.prg shows 100 seconds with 
> HB_USE_TLS and 170 seconds without HB_USE_TLS (commented out HB_USE_TLS 
> inside hbthread.h). The reason for such a big overhead is not clear enough for 

It's expected. Retrieving the hb_stack pointer is critical for HVM speed.
It's accessed many times for each PCODE. Just compare the speed difference
when you compile ST HVM with and without HB_NO_DEFAULT_STACK_MACROS.
And here the difference is much smaller.
I guess that if you inline this function then the speed of MT Harbour-BCC
programs will be greatly improved. If I understand the above correctly, it
should look like this (meta code which should be adapted to BCC so it can
be inlined):

   static __inline__ PHB_STACK hb_stack_ptr( void )
   {
      __asm__ (
         mov   eax,hb_stack_key
         mov   edx,fs:[2Ch]
         mov   eax,[edx+eax*4]
      );
   }

If we also want to use HB_STACK_MACROS then hb_stack_key in estack.h
should be changed to a public variable.

> me. Win32 API calls during the application load process are redirected to 
> dlls via an additional jmp instruction, but this does not explain such a 
> big overhead.

It's the most often used piece of code in HVM and it is critical for
performance. Just check how many hb_stack*() functions are used in
hvm.c and then check how many times each of them accesses the hb_stack
variable. Now you can also imagine why passing the hb_stack pointer
is important. In practice each HVM function needs it.
We can reduce the problem by introducing meta variables with the
HB_STACK pointer initialized at the beginning of each function.
In xHarbour it's done by the HB_THREAD_STUB macro. Using this trick
only inside estack.c should give a noticeable performance improvement
when HB_STACK_MACROS are disabled (the default when HB_USE_TLS is
not enabled). I'll probably implement this ASAP, but it can also be
used inside hvm.c code with HB_STACK_MACROS, though there it will
make the code uglier. If the results from the estack.c modification
are sufficient and the code does not look ugly then maybe
I'll do it. But if you can create an inlined version of the above
code for BCC (don't ask me how, I stopped seriously using BCC in
DOS days and some tricks will be necessary here) then maybe it
will give enough speed improvement for this compiler.
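
The "meta variable" idea can be sketched in portable C as below. This is
not Harbour or xHarbour code: DemoStack, demo_stack_ptr() and the other
demo_* names are hypothetical, and a counter stands in for the cost of the
real TLS lookup so the effect is visible without timing anything.

```c
#include <assert.h>

#define DEMO_STACK_SIZE 16   /* hypothetical capacity for this sketch */

typedef struct {
    int items[DEMO_STACK_SIZE];
    int top;
} DemoStack;

static DemoStack demo_stack;
static unsigned long demo_lookups;   /* counts simulated TLS lookups */

/* Stand-in for the expensive per-thread lookup of the stack pointer. */
static DemoStack *demo_stack_ptr(void)
{
    ++demo_lookups;
    return &demo_stack;
}

/* Style 1: every access repeats the lookup (four lookups per push). */
static int demo_push_uncached(int value)
{
    if (demo_stack_ptr()->top >= DEMO_STACK_SIZE)
        return 0;
    demo_stack_ptr()->items[demo_stack_ptr()->top] = value;
    demo_stack_ptr()->top++;
    return 1;
}

/* Style 2: the pointer is fetched once at function entry and reused,
   which is the role the text assigns to a meta variable. */
static int demo_push_cached(int value)
{
    DemoStack *stack = demo_stack_ptr();   /* the only lookup */
    if (stack->top >= DEMO_STACK_SIZE)
        return 0;
    stack->items[stack->top] = value;
    stack->top++;
    return 1;
}

/* Read the top item; also a single lookup (assumes a non-empty stack). */
static int demo_peek(void)
{
    DemoStack *stack = demo_stack_ptr();
    assert(stack->top > 0);
    return stack->items[stack->top - 1];
}
```

In this sketch one push costs four lookups in the first style and one in
the second, so the saving grows with the number of stack accesses inside
each function.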

best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour
