Hi Przemek,

Thanks for this thoughtful conversation. Just a few things that sprang
to my mind regarding current MT issues in Harbour while reading both
Steffen's and your text:
- hbct windowing keeps the current window as app-global data. This
  makes it difficult (or even impossible?) to start a new thread which
  operates in its own windows, as the threads steal focus from each
  other. (Focus changes in the main app thread are difficult to
  protect with mutexes.)
- hbct has other .c parts which are not MT safe.
- Moving statics to "thread static" on the .c level may also be needed
  for other 3rd party code.

This leads to one of the NOTEs I've made in ChangeLog: Should Harbour
have a way to attach custom information to a thread? This may give the
path to solve some of these problems, so it seems to me at least.

[ Maybe I'm banging on open doors and the support is there, but I
couldn't find it yet. ]

Brgds,
Viktor

On Mon, Feb 23, 2009 at 1:37 PM, Przemyslaw Czerpak <dru...@acn.waw.pl> wrote:

> On Mon, 23 Feb 2009, Massimo Belgrano wrote:
>
> Hi,
>
> > Follow prev
> > Multithreading as the ability to execute code in different code paths
> > has been a feature of modern OSes for decades. The problem with MT is
> > that it adds another dimension of complexity to the developer's task.
> > While with single-threaded apps the developer needs only to think in a
> > more or less sequential way, with MT each execution path adds a new
> > dimension to the equation of program complexity. Development
> > languages such as Delphi, .NET (C#, VB) or Harbour and
> > xHarbour support MT, that's correct, but they do not remove the burden
> > of correctness from the programmer. It is the sole responsibility
> > of the programmer to ensure program correctness in two different
> > areas: data consistency and algorithm isolation.
>
> I agree,
>
> > The problem of data consistency occurs as soon as more than one thread
> > is accessing the same data - such as a simple string or an array.
> > Besides nuances in terms of single or multiple readers/writers, the
> > consistency of the data must be ensured, so developers are forced to
> > use mutex semaphores or other higher-level concepts such as monitors,
> > guards... to ensure data consistency.
>
> Yes, usually they are, though different languages give some additional
> protection mechanisms here, so it is not always necessary to use
> user-level synchronization.
>
> > Algorithm isolation is somewhat related to data consistency; it
> > becomes obvious that a linked list accessed from multiple threads must
> > be protected, otherwise dangling pointers occur. But what about a
> > table/relation of a database? The problem here is that concurrency
> > inside the process can be resolved - but this type of "isolation" does
> > break the semantics of the isolation principles which are already
> > provided by the underlying dbms (sql-isolation-levels, record or file
> > locks, transactions). Therefore algorithm isolation/correctness is a
> > completely different beast, as it is located at a very high semantic
> > level of the task.
>
> yes, it is.
>
> > Alaska Software has put an enormous amount of research effort into
> > that area and we have more than a decade of practical experience with
> > that area based on real world customers and real world applications.
> > From that point of view I would like to reiterate my initial statement:
> > "As of today there is still no tool available in the market which
> > provides that clean and easy to use way of multithreading".
>
> I was not making such "enormous amount of research efforts" ;-)
> I simply looked for a good balance between performance, basic
> protection and flexibility for programmers.
>
> > Let's start with xHarbour; its MT implementation is not well thought
> > out, as it provides MT features to the programmer without any model,
> > just the features.
> > xHarbour even allows the usage of a workarea from
> > different threads, which is a violation of fundamental dbms isolation
> > principles. In fact xHarbour is just a system language in the sense
> > of MT and does not really make life easier compared with other system
> > languages. Therefore there is no value in it besides being able to do
> > MT. Also keep in mind that, due to the historical burden of the VM and
> > RT core, the MT feature is implemented in a way that makes it
> > impossible to scale in future multi-core scenarios (see later note).
>
> I agree. Giving unprotected access to workareas is asking for
> trouble. It can create very serious problems (e.g. data corruption
> in tables) and gives nothing to programmers, because they have to use
> their own protection mechanisms to access the tables, so the final
> application has to be reduced to the same level as using
> dbRequest()/dbRelease() to lock/unlock the table. The only difference
> is that in such a model the programmer has to implement everything
> himself.
>
> > Harbour is better here because it follows more the principles of
> > Xbase++. While I am not sure if the Harbour people have decided to
> > adopt the Xbase++ model for compatibility reasons or not, I am glad to
> > see that they followed our model's point of view. The issue with
> > Harbour however is that it suffers from the shortcomings of its runtime
> > in general, the VM design and of course the way datatypes - the
> > blood of a language - are handled. It is still in a 1980 architectural
> > style centered around the original concept of how Clipper did it. This
> > is also true for xHarbour, so both suffer from the fact that MT was
> > added, I think, in 2007, while the VM and RT core is from 1999 -
> > without having MT in mind.
>
> Here I can agree only partially.
> 1-st Harbour does not follow the xbase++ model.
> With the exception of the
> xbase++ emulation level (xbase++ sync and thread classes, thread
> functions and sync methods) the whole code is the result of my own
> ideas. The only idea I partially borrowed is the dbRequest()/dbRelease()
> semantics. Personally I wanted to introduce many workarea holders
> (not only a single zero area zone) and dbDetach()/dbAttach() functions.
> Later I heard about the xbase++ implementation and I found the cargo
> codeblock attaching a very nice feature, so I implemented it, but
> internally it operates on workarea sets from my original idea and
> it's still possible to introduce support for multiple WA zones if
> we decide to add a .prg level API for it. In some cases it may be useful.
> Also the internal WA isolation in native RDDs is different. For POSIX
> systems it's necessary to introduce file handle sharing and this
> mechanism is already used, so now we can easily extend it, adding support
> for a pseudo exclusive mode (other threads will be able to access tables
> opened in exclusive mode, which is exclusive only for external programs)
> or add common to aliased WA caches.
> Of course Harbour also supports other xbase++ extensions, but they were
> added rather for compatibility with xbase++ for xbase++ users and
> internally they use the basic Harbour MT API.
>
> 2-nd this old API from 1980 is a real problem in some places and it
> probably will be good to change it. But I also do not find the xbase++
> API to be the only final solution. Harbour gives full protection for
> read access to complex items. The user has to protect only write access,
> and only if he wants to change exactly the same item, not a
> complex item member, e.g. this code:
>    aVal[ threadID() ] += aVal[ threadID() ] * 2 + 100
> is MT safe in Harbour even if the same aVal is used by many different
> threads. What matters is that each thread operates on different
> aVal items and aVal is not resized. Otherwise it may cause data corruption.
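The "one element per thread, no resize" rule is not specific to Harbour.
As a rough analogy only (this is not how the Harbour VM handles items),
here is a minimal C sketch with POSIX threads. The names NTHREADS, slots,
worker and run_slots are made up for this illustration:

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4              /* hypothetical number of worker threads */

long slots[NTHREADS];           /* shared array: fixed size, never resized */

/* Each worker gets a pointer to its own element and touches nothing else,
   so no mutex is needed for this particular access pattern. */
static void *worker(void *arg)
{
    long *slot = arg;

    assert(slot != NULL);
    *slot += *slot * 2 + 100;   /* same update as the .prg line quoted above */
    return NULL;
}

/* Starts the workers and waits for all of them.
   Returns 0 on success, -1 if a thread could not be created. */
int run_slots(void)
{
    pthread_t tid[NTHREADS];
    int started = 0;
    int rc = 0;

    for (int i = 0; i < NTHREADS; i++) {
        slots[i] = i;           /* set before the thread owning slot i starts */
        if (pthread_create(&tid[i], NULL, worker, &slots[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return rc;
}
```

Calling run_slots() leaves every slot i holding 3 * i + 100. As soon as two
threads may write the same element, or the array may be reallocated while
workers run, a mutex (or another synchronization primitive) is required.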
> But when complex items can be resized then we usually need additional
> protection also in xbase++, because user code makes many operations which
> have to be atomic in some logical sense, so in most cases there is
> only one difference here between Harbour and xbase++: in xbase++, with
> full internal protection and missing user protection, an RT error is
> generated. In Harbour it may cause internal data corruption. I agree
> that it's a very important difference, but in most such cases we are
> talking about wrong user code which needs additional user protection in
> both languages.
> And here we have one fundamental question:
>    What is the cost of internal protection for scalability?
> and whether we can or cannot accept it. My personal feeling is that the
> cost will be high, even very high, but I haven't made any tests myself,
> though some xbase++ users confirmed that it's a problem in xbase++.
> I'm really interested in some scalability tests of xbase++ and Harbour.
> They could give a few very important answers. If some xbase++ user can
> port tests/speedtst.prg to xbase++ then it will be very helpful.
>
> Of course it's possible that I missed something here, but I've never used
> xbase++ and I cannot see its source code, so I can only guess how some
> things are implemented in this language.
>
> > This is in fact one of the biggest differences between Xbase++ and the
> > "Harbours" from a pure architectural point of view: we designed a
> > runtime architecture from the beginning to be MT/MP and Distributed,
> > they designed a runtime based on the DOS Clipper blueprint.
> > In fact, I could argue on and on, specifically when it comes to
> > dedicated implementations of the Harbour runtime core or the Harbour
> > VM, but sharing these types of technical details is of course
> > definitely not what I am paid for -;) Anyway, allow me to make it
> > clear in general terms.
>
> See above. It's not as clear as you said.
> I think that you will find users who can say that the cost of
> scalability is definitely not what they would pay for. Especially
> when the missing user protection is also a problem for xbase++ and
> only the bad results are different. For sure an RT error is much better
> than internal data corruption, but how much can users pay for such
> functionality?
>
> > First, any feature/functionality of Xbase++ is reentrant; there is not
> > a single exception to this rule. Second, any datatype and storage type
> > is thread-safe regardless of its complexity, so there is no way to
> > crash an Xbase++ process using multithreading. Third, the runtime
> > guarantees that there is no possibility of a deadlock in terms of its
> > internal state regardless of what you are doing in different threads.
> > There is a clean isolation and inheritance relationship of settings
> > between different threads. In practical terms that means you can
> > output to the console from different threads without any additional
> > code, you can execute methods or access state of GUI (XbasePARTS)
> > objects from different threads, you can create a codeblock which
> > detaches a local variable and pass it to another thread, you are
> > performing file I/O or executing a remote procedure call and in the
> > meanwhile the async. garbage collector cleans up your memory - and
> > the list goes on... But in Xbase++ you can do all that without the
> > need to think about MT or ask a question such as "Is the ASort()
> > function thread safe?" or "Can I change the caption of a GUI control
> > from another thread?". That's all a given, no restrictions apply, the
> > runtime does it all automatically for you.
>
> Most of the above is also true in Harbour, with the exception of
> missing GUI components and obligatory internal item storage protection.
> But that is the subject of efficiency discussed above.
> Let's make some scalability tests and then we can decide if we want to
> pay the same cost as xbase++ users.
>
> > Anyway, I like Harbour more than xHarbour in terms of MT support.
> > However the crux is still there: no real architecture around the
> > product, leading to the fact that MT is supported from a technical
> > point of view but not from a 4GL one, therefore leading to a potential
> > unnecessary burden for the average programmer, and of course that was
> > and still is not the idea of Clipper as a tool.
>
> The only fundamental difference between Harbour and xbase++ in the
> above is obligatory internal item protection. At least that is what is
> visible to me now, and as I said, the cost of such functionality may
> not be acceptable for users. But let's make some real tests to see how
> big a problem it creates in real life.
>
> > Btw, the same is true for VO or so; they left the idea of the language
> > and moved to something more like a system language. While I can
> > understand that somewhat, I strongly disagree with that type of
> > language design for a simple reason: it's not practical in the long
> > term - we will see that in the following years as more and more
> > multi-core systems find their way into the mainstream and developers
> > need to make use of them for performance and scalability reasons. In
> > 10 - 15 years from now we will have 100 if not thousands of cores per
> > die - handling multithreading and synchronisation issues by hand then
> > becomes impossible; the same is true for offloading tasks for
> > performance reasons. So there is a need for a clean model in terms of
> > the language - that's at least what we believe at Alaska Software. It
> > goes even further: the current attempts by MS in terms of multicore
> > support with VS2010 or .NET 4.0 are IMO absolutely wrong, as they force
> > the developer to write code depending on the underlying execution
> > infrastructure, alias cores available.
> > In other words, infrastructure
> > related code/algorithms get mixed with the original algorithm the
> > developer writes and of course the developer gets paid for. That's a
> > catastrophic path which for sure does not contribute to increased
> > productivity and reliability of software solutions.
>
> I agree with you only partially. Over some reasonable cost limit
> MT programming stops being usable and it is much more efficient,
> safer and easier to use separate processes. The cost of data
> exchange between them will simply be smaller than the cost of internal
> obligatory MT synchronization. So why use MT mode? For marketing
> reasons?
>
> > Funnily enough, the most critical and most difficult aspect in that
> > area, getting performance gains from multi-core usage, is not even
> > touched by my technical arguments right now. However it adds another
> > dimension of complexity to the previous equation, as it needs to take
> > into account the memory hierarchy, which must be handled by a 4GL
> > runtime totally differently than with the simple approach of
> > Harbour/xHarbour. Their RT core and VM need a more or less complete
> > rewrite and redesign to go that path.
>
> I do not see bigger problems with Harbour core code modifications.
> If we decide that it's worth it then I'll implement it.
> Probably the real problem will be forcing a different API on 3-rd party
> developers. Here we probably should choose something close to the xbase++
> C API to not introduce additional problems for 3-rd party developers
> who have to create code for both projects, to have some basic
> compatibility, e.g. at C preprocessor level.
> Anyhow I'm still not sure I want to pay the cost of full item
> access serialization.
>
> > In other words, Xbase++ has been playing in the multithreading
> > ballpark for a decade. Harbour is still finding its way into the MT
> > ballpark while xHarbour is in that context at a dead end.
> > I would bet that Xbase++ will play in the multicore ballpark while the
> > Harbours are still with their MT stuff.
>
> And it's highly possible that it will happen. But Harbour is a free
> project and if we decide that adding full item protection at the
> cost of speed is a valuable feature then maybe we will implement it.
> It's also possible that we add such functionality as an alternative VM
> library. Just like now we have hbvm and hbvmmt we will have hbvmpmt
> (protected mt).
>
> > In a more theoretical sense, it is important to understand that a
> > programming language and its infrastructure shall not adopt just any
> > technical feature, requirement or hype. Because then the language and
> > infrastructure are getting more and more complicated up to a point of
> > lost control. Also backward compatibility and therefore protection of
> > existing investments becomes more and more of a mess, with Q&A costs
> > going through the roof.
>
> _FULLY_AGREE_. Things should be as simple as possible. Any hacks or
> workarounds for single features create serious problems in the longer
> term and block further development.
> For me it was the main xHarbour problem when I was working on this
> project.
>
> > Nor is it a good idea to provide software developers any freedom - the
> > point here is, a good MT model does smoothly guide the developer
> > through the hurdles and most of the time is not even in the awareness
> > of the developer.
> > The contrary is providing the developer all freedom,
> > but this leads to letting him first build the gunpowder, then the gun
> > to finally shoot a bullet -;)
>
> :-)
>
> > Therefore let me rephrase my initial statement to be more specific:
> >
> > As of today there is still no tool available in the market which
> > provides that clean and easy to use way of multithreading; however
> > there are other tools which support MT - but they support it just as
> > a technical feature without a model, and that's simply wrong as it
> > leads to additional dimensions in code complexity - finally ending in
> > applications with lesser reliability and overall quality.
> > Just my point of view on that subject - enough said
>
> Thank you very much for this very interesting text.
> I hope that now the main internal difference between Harbour and xbase++
> is clearly visible for users.
> To the above we should still add tests/speedtst.prg results to compare
> scalability, so we will know the real cost, which is an important part
> of the above description.
> I'm very interested in real life results and I hope that some xbase++
> users will port tests/speedtst.prg to xbase++ so we can compare the
> results.
>
> best regards,
> Przemek
> _______________________________________________
> Harbour mailing list
> Harbour@harbour-project.org
> http://lists.harbour-project.org/mailman/listinfo/harbour
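To make the "thread static" and "attach custom information to a thread"
points from the top of this message more concrete, here is a minimal
sketch in plain C with POSIX threads. It is not Harbour or hbct code and
says nothing about what the Harbour C API offers; current_window,
select_window, thread_set_cargo and the other names are hypothetical and
exist only for this illustration:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a "current window" handle.
   _Thread_local (C11) gives every thread its own copy of the static. */
static _Thread_local int current_window = 0;

void select_window(int w)  { current_window = w; }
int  selected_window(void) { return current_window; }

/* Generic variant: attach arbitrary data to a thread through a pthread key. */
static pthread_key_t  cargo_key;
static pthread_once_t cargo_once = PTHREAD_ONCE_INIT;

/* free() is the key destructor: it releases a thread's value when that
   thread exits. */
static void cargo_init(void) { pthread_key_create(&cargo_key, free); }

/* Stores a value for the calling thread only. Returns 0 on success. */
int thread_set_cargo(int value)
{
    int *p;

    pthread_once(&cargo_once, cargo_init);
    p = pthread_getspecific(cargo_key);
    if (p == NULL) {
        p = malloc(sizeof *p);
        if (p == NULL || pthread_setspecific(cargo_key, p) != 0) {
            free(p);
            return -1;
        }
    }
    *p = value;
    return 0;
}

/* Returns the calling thread's value, or -1 if nothing was attached. */
int thread_get_cargo(void)
{
    int *p;

    pthread_once(&cargo_once, cargo_init);
    p = pthread_getspecific(cargo_key);
    return p ? *p : -1;
}

/* Second thread: changes its own window and cargo, reports what it sees. */
static void *other_thread(void *arg)
{
    int *seen = arg;

    assert(seen != NULL);
    select_window(7);
    thread_set_cargo(42);
    seen[0] = selected_window();
    seen[1] = thread_get_cargo();
    return NULL;
}

/* Returns 1 if the second thread's changes did not leak into the caller. */
int demo_isolated(void)
{
    pthread_t t;
    int seen[2] = { 0, 0 };

    select_window(1);
    thread_set_cargo(10);
    if (pthread_create(&t, NULL, other_thread, seen) != 0)
        return 0;
    pthread_join(t, NULL);

    return seen[0] == 7 && seen[1] == 42
        && selected_window() == 1 && thread_get_cargo() == 10;
}
```

The first variant is what moving a static to "thread static" amounts to on
the .c level; the second is closer to a general "cargo" slot where 3rd
party code could keep its own per-thread state.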