On Thu, 25 Sep 2008, RoddGraham wrote:

Hi Rodd,
Nice to see your messages again.

> As a MT Xbase++ user, I have knowledge of its features and shortcomings.
>
> 1) Xbase++ is MT, but only reliable on a single CPU core within the process.
> By default, the Xbase++ runtime sets single-core affinity with the OS. In
> fact you must make an undocumented setting to allow Xbase++ to use multiple
> cores, after which it will lock (live or dead I do not know) in a race
> condition. I speculate it is the fact that the developer likes to self-
> implement the thread concurrency controls at the metal rather than using the
> OS facilities. The undocumented setting changes the thread synchronization
> algorithm used from one optimized for single core which is incompatible in a
> multi-core environment. To the developer's credit, see item #3 for the
> reason why thread synchronization performance is a big issue in xBase
> dialects. My experience is that Xbase++ performance is unacceptable when
> attempting multi-core processing even if the lock issue were resolved.
> Probably why it is not a big priority for Alaska to fix.

Interesting. Such weak scalability is usually the result of intensive critical section usage. Over the last two years I have invested a lot of time in eliminating such code, so now only a few small elements are covered by critical sections, which should not reduce performance on real multi-CPU machines.

> 2) PUBLICs are global, PRIVATEs and WORKAREAs are thread local. IMO, you
> have flexibility in defining visibility scope as you add MT to harbour since
> pre-existing non-MT code does not have the concept of thread local nor does
> it have more than one thread. From my perspective, the Xbase++ scoping is
> correct as PUBLICs/PRIVATEs are rarely used in a professional architecture
> and only necessary when they must be accessed by runtime symbolic
> references.

In current Harbour code, when a thread is started the user can define what should happen to existing memvar variables (PUBLICs and PRIVATEs). It is possible to:
1. not pass any PUBLICs and PRIVATEs to the child thread;
2. copy PUBLICs and/or PRIVATEs to the child thread;
3. share PUBLICs and/or PRIVATEs with the child thread.

Of course each child thread creating other threads can do the same. When PRIVATEs are copied to or shared with a child thread, it sees them as PUBLICs.

> WORKAREAs could go either way, but it simplifies concurrency to
> file locking for native file RDDs to have them thread local. Of course with

In Harbour it will work in the same way. You can also move WAs between threads: the Xbase++-compatible functions dbRelease() and dbRequest() exist in Harbour. In the future I will also add support for cloning a workarea. Such a cloned WA will use a common set of file IO handles, buffers and file lock pool. I am also thinking about adding a pseudo-exclusive mode which will be exclusive with respect to other applications, while cloned WAs are synced internally without file locks.

> DBS based RDDs (such as ADS), the session layer may impose serialization
> when shared across threads which leads me to conclude that the entire data
> layer is thread local. It is because I use ADS and the session
> serialization that I do not pass WORKAREAs between threads in Xbase++.

The ADS RDD will have to be updated for MT mode. Not all things are safe. In particular there is a problem with some global settings passed to the ADS library, like _SET_DELETED. So far I haven't checked whether ACE has some additional support for MT programs, so I cannot say too much about it. Anyhow, I'd like to leave this to the ADSRDD developers/users; I do not use it.

> Additionally, you have to make a pass of your process state settings (ie.
> SET commands) and decide whether they make more sense globally or thread
> locally. When in doubt, I lean towards SETtings being thread local since
> they should inherit from the thread that created them which makes them
> behave like global settings if set appropriately in the first thread and
> never changed again.
> If the SETting can have inter-thread side effects, it
> should be thread local.

In Harbour all SETs are inherited from the parent thread and each thread uses its own copy of the SET state.

> 3) The single core affinity referred to in item #1 above results in improved
> process throughput in Xbase++. Unlike MT at the C level, the garbage
> collected, shared memory system of xBase languages becomes a choke point due
> to excessive serialization required to guarantee integrity. Every memory
> variable accessed in every expression must be serialized to ensure the
> validity of the access. Xbase++ uses the single core optimization that it
> is not possible for two threads to attempt serialization in the exact same
> clock cycle such that certain CPU primitives are guaranteed atomic to
> complete without context switch. Hence, they reduce the overheads
> associated with the traditional hardware level, multi core synchronization
> instructions. Memory serialization and concurrency is a well known issue
> for C developers, but it is exacerbated by the single shared memory
> architecture of the xBase language.

Harbour does not have such protection. I was thinking about it, but it would cost extremely much. On some architectures it would be a real performance killer if we called an OS synchronization function; it would be necessary to use some assembler hacks to reduce the cost. Such elements will have to be discussed here: full complex-item integrity protection bought at the cost of performance. I'm afraid it will be hard to reach a common decision. People familiar with MT programming will prefer missing protection and better performance, because for all shared variables it is always necessary to add one's own protection anyway, and they are used to doing it correctly. But for people who are just beginning their adventure with MT programming, each additional protection will be welcome. Anyhow, for the answer we will have to wait for users' experiences after working with the current code. Maybe we will choose some middle solution, e.g. a
synchronization reference item, so the user would be able to declare a variable as automatically synced, either in source code or at runtime by some function, e.g. hb_autoSyncVar( @var, <lOnOff> ).

> 4) While I am not an implementor of the underlying memory managers, I have
> contemplated how to overcome this problem in Xbase++. The best idea I have
> to address this problem is to create variables in thread local memory heaps
> that do not require thread synchronization to access. If a thread accesses
> a variable in another thread's heap, it would trigger a move event that would
> relocate the variable to a shared, global heap which requires
> synchronization on all accesses. Of course the move event would require
> synchronization with the thread that owns the local heap. I believe that
> this would work since the vast majority of variables are not shared between
> threads. The GC would only need to synchronize with one thread at a time to
> clean the thread local heaps and hopefully the shared, global heap would be
> small enough that the GC disruption would be limited due to a quick run and
> the possibility that the application threads might not access the global
> heap (ie. be processing from the thread local heap) during its GC. The
> downside of this architecture is that process memory may fragment quickly
> due to multiple heaps and exhaust Win32 limits. Of course Win64 breaking
> the 3GB process memory limit is just around the corner.

I will have to think about it longer to give you an answer. Right now I cannot say how it would work with long reference chains when some of the references are in a thread-local heap and others in the thread-global one. Moving a complex item from the local to the global heap also does not look easy; e.g. some subarrays can still be accessible through local variables, so we would have to mark all complex item nodes (it will be necessary to add circularity detection) as belonging to the common heap.
Otherwise they would still be accessible without protection through other thread-local variables. I also see some other problems which would have to be resolved. For now I do not find it an easy solution to implement.

> 5) Finally, the only thread synchronization feature of Xbase++ at the .PRG
> level is SYNC METHODs which serialize on the object instance (or class
> object for CLASS SYNC METHODs). This is a straightforward implementation
> that requires nothing more than including a serialization mutex in every
> class and instance object which is used by the respective SYNC METHODs. The
> only caveat is that inter-object deadlocks can occur. AFAIK, Xbase++ does
> not implement deadlock detection, but leaves it up to the app developer to
> 'don't do that'. Since any variable or routine can be implemented as a
> CLASS, I find the SYNC METHODs sufficient and elegant to satisfy MT
> development at the .PRG layer.

I'll add SYNC methods ASAP; it's a rather simple job. Already now users can create their own mutexes with runtime functions, and these mutexes are also condition variables and message queues, so you can easily implement a signal object in Harbour yourself. But I have one question. AFAIR in Xbase++ the active locks set by executed SYNC methods are released when a thread executes oSignal:wait, and then they are restored (locked again). Maybe I do not understand some conditions, but to me this is something that can create a deadlock if the signalled thread has to restore more than one lock, and it is out of the user's control. Is it documented what Xbase++ does in such a case?

Many thanks for your help.

best regards,
Przemek

_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour