Re: [Harbour] Re: Intresting corner of info

Viktor Szakáts Mon, 23 Feb 2009 05:04:38 -0800

Hi Przemek,

Thanks for this thoughtful conversation.
Just a few things that sprang to my mind regarding current MT
issues in Harbour while reading both Steffen's and your text:


- hbct windowing keeps the current window as app-global
  data. This makes it difficult (or even impossible?) to
  start a new thread which operates in its own windows,
  as the threads steal focus from each other.
  (Focus changes of the main app thread are difficult to
  protect by mutexes.)
- hbct has other .c parts which are not MT safe.
- Moving static to "thread static" on the .c level may
  also be needed for other 3rd party code.
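
To illustrate the "thread static" point, here is a minimal C sketch. It
assumes a C11 compiler and POSIX threads, and it is not hbct code: the
variable and function names (s_current_window, select_window,
current_window, thread_static_demo) are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a library-wide "current window" handle.
 * With a plain `static`, all threads would share one value; with C11
 * `_Thread_local` each thread gets its own copy, initialized to 0. */
static _Thread_local int s_current_window = 0;

/* Hypothetical accessors, not part of hbct. */
static void select_window(int window) { s_current_window = window; }
static int  current_window(void)      { return s_current_window; }

/* Worker: selects its own window and reports what it sees afterwards. */
static void *window_worker(void *arg)
{
    int *slot = arg;

    assert(slot != NULL);
    select_window(*slot);       /* touches only this thread's copy */
    *slot = current_window();   /* hand the observed value back */
    return NULL;
}

/* Returns 1 if the worker's selection did not leak into the calling
 * thread, 0 if it did, -1 if the thread could not be started. */
int thread_static_demo(void)
{
    pthread_t tid;
    int worker_window = 7;

    select_window(1);           /* the caller's own window */
    if (pthread_create(&tid, NULL, window_worker, &worker_window) != 0)
        return -1;
    pthread_join(tid, NULL);

    return worker_window == 7 && current_window() == 1;
}
```

Calling thread_static_demo() from main() should show that the worker
selecting window 7 leaves the caller's window 1 untouched, which is the
behavior a plain static cannot give.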

This leads to one of the NOTEs I've made in ChangeLog:
Should Harbour have a way to attach custom information
to a thread? This may give the path to solve some of these
problems, so it seems to me at least.

[ Maybe I'm banging on open doors and the support is there,
but I couldn't find it yet. ]
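
As a rough idea of what "custom information attached to a thread" could
look like at the plain C level, here is a sketch using POSIX
thread-specific data. It is only an assumption about one possible
shape, not an existing Harbour API: thread_info, my_thread_info and
thread_info_demo are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread record a library could attach to a thread. */
typedef struct {
    int current_window;
} thread_info;

static pthread_key_t  s_info_key;
static pthread_once_t s_info_once = PTHREAD_ONCE_INIT;

/* Create the key once. free() is registered as destructor, so a
 * thread's record is released when that thread exits.
 * (Error handling for key creation is omitted in this sketch.) */
static void make_info_key(void)
{
    pthread_key_create(&s_info_key, free);
}

/* Return the calling thread's record, allocating it zero-filled on
 * first use. Returns NULL if allocation or attaching fails. */
static thread_info *my_thread_info(void)
{
    thread_info *info;

    pthread_once(&s_info_once, make_info_key);
    info = pthread_getspecific(s_info_key);
    if (info == NULL) {
        info = calloc(1, sizeof *info);
        if (info != NULL && pthread_setspecific(s_info_key, info) != 0) {
            free(info);
            info = NULL;
        }
    }
    return info;
}

/* Worker: writes to its own record and reports the value it stored. */
static void *info_worker(void *arg)
{
    int *seen = arg;
    thread_info *info = my_thread_info();

    assert(seen != NULL);
    if (info != NULL) {
        info->current_window = 42;
        *seen = info->current_window;
    }
    return NULL;
}

/* Returns 1 if both threads kept separate records, 0 if not,
 * -1 on setup failure. */
int thread_info_demo(void)
{
    pthread_t tid;
    int seen_by_worker = -1;
    thread_info *mine = my_thread_info();

    if (mine == NULL)
        return -1;
    mine->current_window = 1;
    if (pthread_create(&tid, NULL, info_worker, &seen_by_worker) != 0)
        return -1;
    pthread_join(tid, NULL);

    return seen_by_worker == 42 && mine->current_window == 1;
}
```

The point of the key-plus-destructor shape is that 3rd party code can
keep its per-thread state without knowing how or when threads are
created and destroyed.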

Brgds,
Viktor

On Mon, Feb 23, 2009 at 1:37 PM, Przemyslaw Czerpak <dru...@acn.waw.pl> wrote:

> On Mon, 23 Feb 2009, Massimo Belgrano wrote:
>
> Hi,
>
> > Follow prev
> > Multithreading as the ability to execute code in different code paths
> > has been a feature of modern OSes for decades. The problem with MT is that
> > it adds another dimension of complexity to the developers task. While
> > with single-threaded apps the developer needs only to think in a more
> > or less sequential way, with MT each execution path adds a new
> > dimension to the equation of program complexity. Development
> > languages supporting MT such as Delphi, .NET (C#,VB) or Harbour and
> > xHarbour support MT, that's correct, but they do not remove the burden
> > of correctness from the programmer. It is in the sole responsibility
> > of the programmer to ensure program correctness in two different
> > areas; data-consistency and algorithm isolation.
>
> I agree,
>
> > The problem of data consistency occurs as soon as more than one thread
> > is accessing the same data - such as a simple string or an array.
> > Besides nuances in terms of single or multiple readers/writers the
> > consistency of the data must be ensured, so developers are forced to
> > use mutex-semaphores or other higher level concepts such as monitors,
> > guards... to ensure data-consistency.
>
> Yes, usually they are, though different languages give some additional
> protection mechanisms here, so it is not always necessary to use
> user-level synchronization.
>
> > Algorithm isolation is somewhat related to data-consistency, it
> > becomes obvious that a linked-list accessed from multiple threads must
> > be protected, otherwise dangling pointers occur. But what about a
> > table/relation of a database? The problem here is that concurrency
> > inside the process can be resolved - but this type of "isolation" does
> > break the semantics of the isolation principles which are already
> > provided by the underlying dbms (sql-isolation-levels, record or file
> > locks, transactions). Therefore algorithm isolation/correctness is a
> > completely different beast as it is located at a very high semantic
> > level of the task.
>
> yes, it is.
>
>
> > Alaska Software has put an enormous amount of research efforts into
> > that area and we have more than a decade of practical experience with
> > that area based on real world customers and real world applications.
> > From that point of view I would like to reiterate my initial statement
> > "As of today there is still no tool available in the market which
> > provides that clean and easy to use way of multithreading".
>
> I did not make such "enormous amount of research efforts" ;-)
> I simply looked for a good balance between performance, basic
> protection and flexibility for programmers.
>
> > Let's start with xHarbour, its MT implementation is not well thought out,
> > as it provides MT features to the programmer without any model, just
> > the features. xHarbour even allows the usage of a workarea from
> > different threads which is a violation of fundamental dbms isolation
> > principles. In fact xHarbour is just a system language in the sense
> > of MT and does not really make life easier compared with other system
> > languages. Therefore there is no value in it besides being able to do MT.
> > Also keep in mind due to the historical burden of the VM and RT core
> > the MT feature is implemented in a way making it impossible to scale
> > in future multi-core scenarios (see later-note).
>
> I agree. Giving unprotected access to workareas is asking for
> trouble. It can create very serious problems (e.g. data corruption
> in tables) and gives nothing to programmers, because they have to use
> their own protection mechanisms to access the tables, so the final
> application is reduced to the same level as using dbRequest()/dbRelease()
> to lock/unlock the table. The only difference is that in such a model
> the programmer has to implement everything himself.
>
> > Harbour is better here because it follows more the principles of
> > Xbase++. While I am not sure if the Harbour people have decided to
> > adopt the Xbase++ model for compatibility reasons or not, I am glad to
> > see that they followed our model's point of view. The issue with
> > Harbour however is that it suffers from the shortcomings of its runtime
> > in general, the VM design and of course the way how datatypes - the
> > blood of a language - are handled. It is still in a 1980 architectural
> > style centered around the original concept how Clipper did it. This is
> > also true for xHarbour, so both suffer from the fact that MT was added
> > I think in 2007, while the VM and RT core is from 1999 - without
> > having MT in mind.
>
> Here I can agree only partially.
> 1-st, Harbour does not follow the xbase++ model. With the exception of the
> xbase++ emulation level (xbase++ sync and thread classes, thread
> functions and sync methods) the whole code is the result of my own
> ideas. The only idea I partially borrowed is the dbRequest()/dbRelease()
> semantics. Personally I wanted to introduce many workarea holders
> (not only a single zero area zone) and dbDetach()/dbAttach() functions.
> Later I heard about the xbase++ implementation and I found the cargo
> codeblock attaching a very nice feature, so I implemented it, but
> internally it operates on workarea sets from my original idea and
> it's still possible to introduce support for multiple WA zones if
> we decide to add a .prg level API for it. In some cases it may be useful.
> Also the internal WA isolation in native RDDs is different. For POSIX
> systems it's necessary to introduce file handle sharing and this
> mechanism is already used, so now we can easily extend it by adding support
> for a pseudo exclusive mode (other threads will be able to access tables
> open in exclusive mode which is exclusive only for external programs)
> or by adding caches common to aliased WAs.
> Of course Harbour also supports other xbase++ extensions but they were
> added rather for compatibility with xbase++, for xbase++ users, and
> internally use the basic Harbour MT API.
>
> 2-nd, this old API from 1980 is a real problem in some places and it
> will probably be good to change it. But I also do not see the xbase++ API
> as the one and only final solution. Harbour gives full protection for read
> access to complex items. The user has to protect only write access,
> and only if he wants to change exactly the same item, not different
> members of one complex item, e.g. this code:
>   aVal[ threadID() ] += aVal[ threadID() ] * 2 + 100
> is MT safe in Harbour even if the same aVal is used by many different
> threads. What matters is that each thread operates on different
> aVal items and aVal is not resized. Otherwise it may cause data corruption.
> But when complex items can be resized then we usually need additional
> protection also in xbase++, because user code makes many operations which
> have to be atomic in some logical sense, so in most cases there is
> only one difference here between Harbour and xbase++: in xbase++, with
> full internal protection and missing user protection, an RT error is
> generated. In Harbour it may cause internal data corruption. I agree that
> it's a very important difference, but in most such cases we are talking
> about wrong user code which needs additional user protection in both
> languages.
> And here we have one fundamental question:
>   What is the cost of internal protection for scalability?
> and if we can or cannot accept it. My personal feeling is that the cost
> will be high, even very high but I haven't made any tests myself though
> some xbase++ users confirmed that it's a problem in xbase++.
> I'm really interested in some scalability tests of xbase++ and Harbour.
> It could give a few very important answers. If some xbase++ user can port
> tests/speedtst.prg to xbase++ then it would be very helpful.
>
> Of course it's possible that I missed something here but I've never used
> xbase++ and I cannot see its source code so I only guess how some things
> are implemented in this language.
>
> > This is in fact one of the biggest differences between Xbase++ and the
> > "Harbours" from a pure architectural point of view, we designed a
> > runtime architecture from the beginning to be MT/MP and Distributed,
> > they designed a runtime based on the DOS Clipper blueprint.
> > In fact, I could argue on and on, specifically when it comes to
> > dedicated implementations of the Harbour runtime core or the Harbour
> > VM but sharing this type of technical details is of course
> > definitely not what I am paid for -;) Anyway allow me to make it
> > clear in general terms.
>
> See above. It's not as clear as you said.
> I think that you will find users who can say that the cost of
> scalability is definitely not what they are paid for. Especially
> when missing user protection is also a problem for xbase++ and
> only the bad results are different. For sure an RT error is much better
> than internal data corruption, but how much can users pay for such
> functionality?
>
> > First, any feature/functionality of Xbase++ is reentrant, there is not
> > a single exception to this rule. Second, any datatype and storage type
> > is thread-safe regardless of its complexity so there is no way to
> > crash an Xbase++ process using multithreading. Third, the runtime
> > guarantees that there is no possibility of a deadlock in terms of its
> > internal state regardless what you are doing in different threads.
> > There is a clean isolation and inheritance relationship of settings
> > between different threads. In practical terms that means, you can
> > output to the console from different threads without any additional
> > code, you can execute methods or access state of GUI (XbasePARTS)
> > objects from different threads, you can create a codeblock which
> > detaches a local variable and pass it to another thread, you are
> > performing file I/O or executing a remote procedure call and in the
> > meanwhile the async. garbage collector cleans up your memory - and
> > the list goes on... But in Xbase++ you can do all that without the
> > need to think about MT or ask a question such as "Is the ASort()
> > function thread safe" or can I change the caption of a GUI control
> > from another thread. That's all a given, no restrictions apply, the
> > runtime does it all automatically for you.
>
> Most of the above is also true in Harbour, with the exception of the
> missing GUI components and obligatory internal item storage protection.
> But that is the subject of efficiency discussed above.
> Let's make some scalability tests and then we can decide if we want to pay
> the same cost as xbase++ users.
>
> > Anyway, I like Harbour more than xHarbour in terms of MT support.
> > However the crux is still there, no real architecture around the
> > product, leading to the fact that MT is supported from a technical
> > point of view but not from a 4GL one, therefore leading to a potential of
> > unnecessary burden for the average programmers, and of course that was
> > and is still not the idea of Clipper as a tool.
>
> The only fundamental difference between Harbour and xbase++ in the
> above is obligatory internal item protection, at least as far as I can
> see now, and as I said the cost of such functionality may not be acceptable
> for users. But let's make some real tests to see how big a problem it
> creates in real life.
>
> > Btw, the same is true for VO or so, they left the idea of the language
> > and moved to something more like a system language, while I can
> > understand that somewhat I strongly disagree with that type of
> > language design for a simple reason; it's not practical in the long
> > term - we will see that in the following years as more and more multi
> > core systems will find their way in the mainstream and developers need
> > to make use of them for performance and scalability reasons. In 10 -
> > 15 years from now we will have 100 if not thousands of cores per die -
> > handling multithreading, synchronisation issues by hand becomes then
> > impossible, the same is true for offloading tasks for performance
> > reasons. So there is a need for a clean model in terms of the language
> > - that's at least what we believe in at Alaska Software. It goes even
> > further, the current attempts by MS in terms of multicore support with
> > VS2010 or NET 4.0 are IMO absolutely wrong, as they force the
> > developer to write code depending on the underlying execution
> > infrastructure alias cores available. In other words, infrastructure
> > related code/algorithms get mixed with the original algorithm the
> > developer writes and of course the developer gets paid for. That's a
> > catastrophic path which for sure does not contribute to increased
> > productivity and reliability of software solutions.
>
> I agree with you only partially. Above some reasonable cost limit
> MT programming stops being useful, and it is much more efficient,
> safer and easier to use separate processes. The cost of exchanging
> data between them will simply be smaller than the cost of obligatory
> internal MT synchronization. So why use MT mode? For marketing
> reasons?
>
> > Funnily enough, the most critical, and most difficult aspect in that
> > area; getting performance gains from multi core usage is not even
> > touched with my technical arguments right now. However it adds another
> > dimension of complexity to the previous equation as it needs to take
> > into account the memory hierarchy which must be handled by a 4GL
> > runtime totally differently than it is with the simple approach of
> > Harbour/xHarbour. Their RT core and VM need a more or less complete
> > rewrite and redesign to go that path.
>
> I do not see bigger problems with Harbour core code modifications.
> If we decide that it's worth it then I'll implement it.
> Probably the real problem will be forcing a different API on 3-rd party
> developers. Here we probably should choose something close to the xbase++
> C API so as not to introduce additional problems for 3-rd party developers
> who have to create code for both projects, to have some basic
> compatibility, e.g. at the C preprocessor level.
> Anyhow I'm still not sure I want to pay the cost of full item
> access serialization.
>
> > In other words, Xbase++ has been playing in the Multithreading ballpark
> > for a decade. Harbour is still finding its way into the MT ballpark
> > while xHarbour is in that context at a dead-end.
> > I would bet that Xbase++ will play in the multicore ballpark while the
> > Harbours are still with their MT stuff.
>
> And it's highly possible that it will happen. But Harbour is a free
> project and if we decide that adding full item protection at the
> cost of speed is a valuable feature then maybe we will implement it.
> It's also possible that we will add such functionality as an alternative
> VM library. Just like now we have hbvm and hbvmmt, we will have hbvmpmt
> (protected mt).
>
> > In a more theoretical sense, it is important to understand that a
> > programming language and its infrastructure shall not adopt every
> > technical feature, requirement or hype. Because then the language and
> > infrastructure are getting more and more complicated up to a point of
> > lost control. Also backward compatibility and therefore protection of
> > existing investments becomes more and more a mess with Q&A costs going
> > through the roof.
>
> _FULLY_AGREE_. Things should be as simple as possible. Any hacks or
> workarounds for single features create serious problems in the longer
> term and block further development.
> For me it was the main problem of xHarbour when I was working on this
> project.
>
> > Nor is it a good idea to provide software developers any freedom - the
> > point here is, a good MT model does smoothly guide the developer
> > through the hurdles and most of the time is not even in the awareness
> > of the developer. The contrary is providing the developer all freedom,
> > but this leads to letting him first build the gun-powder, then the gun
> > to finally shoot a bullet -;)
>
> :-)
>
> > Therefore let me rephrase my initial statement to be more specific;
> >
> > As of today there is still no tool available in the market which
> > provides that clean and easy to use way of multithreading, however
> > there are other tools which support MT - but they support it just as
> > a technical feature without a model and that's simply wrong as it
> > leads to additional dimensions in code complexity - finally ending in
> > applications with lesser reliability and overall quality.
> > Just my point of view on that subject - enough said
>
> Thank you very much for this very interesting text.
> I hope that now the main internal difference between Harbour and xbase++
> is well visible for users.
> To the above we should still add tests/speedtst.prg results to compare
> scalability, so we will know the real cost, which is an important part of
> the above description.
> I'm very interested in real life results and I hope that some xbase++
> users will port tests/speedtst.prg to xbase++ so we can compare the
> results.
>
> best regards,
> Przemek
> _______________________________________________
> Harbour mailing list
> Harbour@harbour-project.org
> http://lists.harbour-project.org/mailman/listinfo/harbour
>

