> On 27.06.2018 at 15:08, Andrei Stebakov <lisper...@gmail.com> wrote:
>
> Thank you guys for your insightful answers. I wish we could have some kind of article summarizing those approaches, so that the next devs wouldn't have to reinvent the wheel but could start with a tried approach and maybe improve on it.
> As I have only scratched the surface learning Pharo, I may have some naive questions.
> Does the fact (fact?) that Pharo uses green threads (not native OS threads) impact performance?
Yes and no. There is nothing wrong with green threads. They are super lightweight and give you concurrency inside the image. If you look at Erlang/OTP, it handles tens of thousands of green threads easily. The performance limitation is that a single image cannot utilize multiple CPU cores. So it is usual to spread several images across separate cores and let the images handle things concurrently.

> With two Pharo images running in parallel on a two-core system, how does it handle multiple requests at a time? There must always be some unblocked thread waiting for connections and delegating requests to request handlers in different green threads (using the fork operation). Is my understanding correct?

Not completely. The process accepting connections is also a green thread. It gets to run because its socket waits on a system resource that is signalled when a connection comes in.

> So even if one of those threads has to wait on a long IO operation (say from DB2), that shouldn't impact the performance of the other handlers?

Exactly. That is the point of this orchestration: maximum throughput.

> I think that in most cases the CPU time for request processing is minimal, as the bottleneck is in lengthy IO operations, DB waits and calls to external RESTful services. So two images on two cores should be enough to handle hundreds of simultaneous requests, since most of the time the threads will be waiting on external operations, not using the local CPU.

Yes, it depends on the use case of course.

> Please let me know if this summary that I got from this thread makes sense.
> Yes, I fully agree that using Docker Pharo containers behind some load balancing is the way to go.

I think your summary is pretty accurate. Docker also has the advantage that it uses a lot of shared memory: if you start 100 Pharo images, most resources, including the VM, are in memory only once.

Hope it helps,

Norbert

>> On Wed, Jun 27, 2018, 04:10 jtuc...@objektfabrik.de <jtuc...@objektfabrik.de> wrote:
>> Norbert,
>>
>> thanks for your insights, explanations and thoughts. It is good to read and learn from people who are a step or two ahead...
>>
>>> On 27.06.18 at 09:31, Norbert Hartl wrote:
>>> Joachim,
>>>
>>>> On 27.06.2018 at 07:42, jtuc...@objektfabrik.de wrote:
>>>>
>>>> Norbert,
>>>>
>>>>> On 26.06.18 at 21:41, Norbert Hartl wrote:
>>>>>
>>>>> On 26.06.2018 at 20:44, Andrei Stebakov <lisper...@gmail.com> wrote:
>>>>>
>>>>>> What would be an example of a load balancer for Pharo images? Can we run multiple images on the same server, or does the balancing configuration mean we can only run one image per server?
>>>>>
>>>>> There are a lot of possibilities. You can start multiple images on different ports and use nginx with an upstream rule to load balance. I would recommend using Docker for spawning multiple images on a host, again with nginx as the frontend load balancer. The point is that you can have at least twice as many images running as you have CPU cores. And of course a lot more.
>>>>
>>>> The last time I checked nginx, the load balancing and sticky session stuff was not available in the free edition. So I guess you either pay for nginx (which I think is good) or you know some free 3rd-party add-ons...
>>>
>>> There is the upstream module, which provides load balancing. But you are right, I think sticky sessions are not part of it. The closest you get, IIRC, is IP-based hashing.
>>
>> I see.
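To make the nginx side a bit more concrete, here is a minimal upstream sketch along those lines. Everything in it is a placeholder: it assumes two Pharo images serving HTTP on ports 8081 and 8082 on the same host, and that the snippet sits inside the http context (e.g. a conf.d include). ip_hash is the IP-based affinity mentioned above; leave it out if you don't need it.

    upstream pharo_images {
        ip_hash;                    # crude session affinity: same client IP -> same image
        server 127.0.0.1:8081;      # Pharo image 1
        server 127.0.0.1:8082;      # Pharo image 2
    }

    server {
        listen 80;
        location / {
            proxy_pass http://pharo_images;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }

Without ip_hash, nginx simply round-robins requests over the listed servers.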
>>
>>>> I wonder what exactly the benefit of Docker is in that game? On our servers we run 10 images on 4 cores with HT (8 virtual cores) and very rarely have real performance problems. We use Glorp, so there is a lot of SQL querying going on for quite basic things already. So my guess would be that your "2 images per core" is conservative and leaves air for even a third one, depending on all the factors already discussed here.
>>>
>>> Docker is pretty nice. You can have the exact same deployment artefact started multiple times. I used tools like daemontools, monit, etc. before, but there you have to start the image, assign ports etc. yourself, which is cumbersome, and I don't like any of those tools anymore. Once you have created your Docker image you can start it multiple times; because networking is virtualized, all the images can serve on the same port, for example.
>>
>> Oh, I see. This is a plus. We're not using any containers and have to provide individual configurations for each image we start up. It works well, not too many moving parts (our resources are very limited), and we try to keep things as simple as possible. As long as we can live with providing a statically sized pool of machines and images and the load doesn't vary too much, this is not too bad. But once you need to dynamically add and remove images to cope with load peaks and lows, our approach will probably become cumbersome and complicated.
>> OTOH, I guess using Docker just means solving the same problems on another level - but I guess there are lots of tools in the container area that can help here (like the Traefik thing mentioned in another thread).
>>
>>> I think talking about performance these days is not easy. Modern machines are so fast that you need a lot of users before you experience any problems.
>>
>> ... depending on your usage of resources. As I said, we're using SQL heavily because of the way Glorp works. So it is easy to introduce bottlenecks even for smaller jobs.
>>
>>> I need to explain the mention of „2 images per core“. A CPU core can execute only one thing at a time, therefore 1 image per core would be enough. The second one is for those time slices where there are gaps in processing, meaning a process is suspended, switched, etc. It is just the rule of thumb that it is good to have one process waiting in the scheduling queue so it can step in as soon as there are free cycles. The „2 images per core“ rule assumes that you can put an arbitrary load on one image. With this assumption a third image won't give you anything, because it cannot do anything the other two images cannot do.
>>> So according to the „hard“ facts it does not help to have more than two images. On the other hand, each image is single-threaded, and using more images lowers the probability that processes block each other because they are executed within one image. On yet another hand, if you use a database, a process spends a lot of its time waiting for the response of the database, so other processes can be executed in the meantime. And so on. So in the end you have to try it.
>>
>> You are correct. The third image can only jump in if both the others are in a wait state. It "feels" as if there was enough air for a third one to operate, but we'd have to try whether that holds true.
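The scheduling behaviour inside one image is easy to see with a toy snippet. This is only an illustration, not production code; the Delay merely stands in for a DB or REST round trip, and the handler names are made up:

    "Each forked block runs as a green thread (a Pharo Process). A process that
    blocks on I/O suspends itself and the scheduler runs the others, so three
    simulated two-second waits overlap instead of adding up to six seconds."
    1 to: 3 do: [ :i |
        [ (Delay forSeconds: 2) wait.    "stands in for a DB2 or REST round trip"
          Transcript show: 'handler ', i printString, ' done'; cr ] fork ].
    Transcript show: 'all handlers forked, the image stays responsive'; cr

As far as I know, Zinc's default server already follows this pattern: one process accepts connections, and each connection is handled in its own forked process, which is what was described earlier in this thread.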
>>
>>>> What's not to be underestimated is all the stuff around monitoring and restarting images when things go wrong, but that's another story...
>>>
>>> Docker has a restart policy, so restarting shouldn't be an issue with it. Monitoring is always hard. I use Prometheus with Grafana, but that is quite a bit to set up. In the end you get graphs and you can define alerts on thresholds for system values.
>>
>> Well, that is also true for monit (which we use); the question always is: what do you make of those numbers? We have situations in which an image responds to HTTP requests as if all were good. But for some reason DB2 sometimes takes forever to answer queries, and will probably answer with a "cannot handle requests at this time" after literally a minute or so. Other DB connections work well in parallel. We're still looking for ways to recognize such situations externally (and are thinking about moving from DB2 to PostgreSQL).
>>
>>> If the topic gets accepted, Marcus and I will talk about these things at ESUG.
>>
>> So if anybody from the program committee is reading this: please accept and schedule Norbert's and Marcus' talk. I'll be hanging on their every word, and I guess I won't be alone ;-)
>>
>> Joachim
>>
>>> Norbert
>>>
>>>> Joachim
>>>>
>>>> --
>>>> -----------------------------------------------------------------------
>>>> Objektfabrik Joachim Tuchel          mailto:jtuc...@objektfabrik.de
>>>> Fliederweg 1                         http://www.objektfabrik.de
>>>> D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
>>>> Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1
>>>>
>>>
>>
>> --
>> -----------------------------------------------------------------------
>> Objektfabrik Joachim Tuchel          mailto:jtuc...@objektfabrik.de
>> Fliederweg 1                         http://www.objektfabrik.de
>> D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
>> Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1
>>