Thank you guys for your insightful answers. I wish we could have some kind of article summarizing those approaches, so that the next devs wouldn't have to reinvent the wheel but could start with a tried approach and maybe improve it.

As I have only scratched the surface learning Pharo, I may have some naive questions. Does the fact (fact?) that Pharo uses green threads (not native OS threads) impact performance? With two Pharo images running in parallel on a two-core system, how does it handle multiple requests at a time? There must always be some unblocked thread waiting for connections and delegating requests to request handlers in different green threads (using a fork operation). Is my understanding correct? So even if one of those threads has to wait on a long IO operation (say, from DB2), that shouldn't impact the performance of the other handlers?

I think that in most cases the CPU time for request processing is minimal, as the bottleneck is in lengthy IO operations, DB waits, and calls to external RESTful services. So two images on two cores should be enough to handle hundreds of simultaneous requests, since most of the time the threads will be waiting on external operations, not using the local CPU. Please let me know if this summary of what I got from this thread makes sense.

Yes, I fully agree that using Docker Pharo containers under some load balancing is the way to go.
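To make my understanding concrete, here is a minimal sketch of the pattern I mean, using Pharo's Socket API from memory (method names and the `handleRequestOn:` helper are illustrative, not a tested implementation). One Process blocks on accept; each connection gets its own forked green thread, so a handler stuck on slow IO never blocks the listener:

```smalltalk
| listener |
listener := Socket newTCP.
listener listenOn: 8080 backlogSize: 32.
[ [ | client |
    "Only this listener Process blocks while waiting for a connection."
    client := listener waitForAcceptFor: 60.
    client ifNotNil: [
        "Each request runs in its own green thread; a slow DB2 query or
         REST call inside the handler suspends only this one Process."
        [ self handleRequestOn: client ] fork ] ] repeat ] fork.
```

In practice a framework like Zinc does this accept-and-fork loop for you, but the scheduling idea should be the same.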
On Wed, Jun 27, 2018, 04:10 jtuc...@objektfabrik.de <jtuc...@objektfabrik.de> wrote:
> Norbert,
>
> thanks for your insights, explanations and thoughts. It is good to read
> and learn from people who are a step or two ahead...
>
> On 27.06.18 at 09:31, Norbert Hartl wrote:
>
> Joachim,
>
> On 27.06.2018 at 07:42, jtuc...@objektfabrik.de wrote:
>
> Norbert,
>
> On 26.06.18 at 21:41, Norbert Hartl wrote:
>
> On 26.06.2018 at 20:44, Andrei Stebakov <lisper...@gmail.com> wrote:
>
> What would be an example of a load balancer for Pharo images? Can we run
> multiple images on the same server, or for the sake of the balancing
> configuration can we only run one image per server?
>
> There are a lot of possibilities. You can start multiple images on
> different ports and use nginx with an upstream rule to load balance. I
> would recommend using docker for spawning multiple images on a host, again
> with nginx as frontend load balancer. The point is that you can have at
> least twice as many images running as you have CPU cores. And of course a
> lot more.
>
> The last time I checked nginx, the load balancing and sticky session stuff
> was not available in the free edition. So I guess you either pay for nginx
> (which I think is good) or you know some free third-party addons...
>
> There is the upstream module which provides load balancing. But you are
> right, I think sticky sessions are not part of it. The closest you get,
> IIRC, is IP-based hashing.
>
> I see.
>
> I wonder what exactly the benefit of Docker is in that game? On our
> servers we run 10 images on 4 cores with HT (8 virtual cores) and very
> rarely have real performance problems. We use Glorp, so there is a lot of
> SQL querying going on for quite basic things already. So my guess would be
> that your "2 images per core" is conservative and leaves air for even a
> third one, depending on all the factors already discussed here.
>
> Docker is pretty nice.
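For readers following along, the nginx setup Norbert describes (upstream load balancing, with IP-based hashing as the closest free substitute for sticky sessions) could look roughly like this; ports and the upstream name are made-up examples:

```nginx
upstream pharo_images {
    ip_hash;                  # IP-based hashing: poor man's sticky sessions
    server 127.0.0.1:8081;    # Pharo image 1
    server 127.0.0.1:8082;    # Pharo image 2
}

server {
    listen 80;
    location / {
        proxy_pass http://pharo_images;
    }
}
```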
> You can have the exact same deployment artefact started multiple times. I
> used tools like daemontools, monit, etc. before, but starting the image,
> assigning ports etc. you have to do yourself, which is cumbersome, and I
> don't like any of those tools anymore. If you created your docker image you
> can start it multiple times; because networking is virtualized, all images
> can serve on the same port, for example.
>
> Oh, I see. This is a plus. We're not using any containers and have to
> provide individual configurations for each image we start up. Works well,
> not too many moving parts (our resources are very limited), and we try to
> keep things as simple as possible. As long as we can live with providing a
> statically sized pool of machines and images and load doesn't vary too
> much, this is not too bad. But once you need to dynamically add and remove
> images to cope with load peaks and lows, our approach will probably
> become cumbersome and complicated.
> OTOH, I guess using Docker just means solving the same problems on another
> level - but I guess there are lots of tools in the container area that can
> help here (like the Traefik thing mentioned in another thread).
>
> I think talking about performance these days is not easy. Modern machines
> are so fast that you need a lot of users before you experience any
> problems.
>
> ... depending on your usage of resources. As I said, we're using SQL
> heavily because of the way Glorp works. So it is easy to introduce
> bottlenecks even for smaller jobs.
>
> The mention of „2 images per core“ I need to explain. A CPU core can
> execute only one thing at a time. Therefore 1 image per core would be
> enough. The second one is for those time slices where there are gaps in
> processing, meaning the process is suspended, switched etc. It is just a
> rule of thumb that it is good to have one process waiting in the scheduling
> queue so it can step in as soon as there are free cycles.
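The "same artefact, same internal port" point can be sketched as a docker-compose file (service and image names are hypothetical). Each container serves on its own virtualized port 8080, mapped to distinct host ports for nginx to balance across, and Docker's restart policy stands in for a separate supervisor:

```yaml
services:
  pharo1:
    image: my-pharo-app:latest   # identical deployment artefact for every instance
    restart: unless-stopped      # Docker restarts a crashed image automatically
    ports:
      - "8081:8080"              # host:container
  pharo2:
    image: my-pharo-app:latest
    restart: unless-stopped
    ports:
      - "8082:8080"              # same internal port, thanks to network virtualization
```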
> The „2 images per core“ has the assumption that you can put an arbitrary
> load on one image. So with this assumption a third image won't give you
> anything, because it cannot do anything the other two images cannot do.
> So according to the „hard“ facts it does not help having more than two
> images. On the other hand, each image is single threaded, and using more
> images lowers the probability that processes get blocked because they are
> executed within one image. On yet another hand, if you use a database, a
> lot of the time a process is waiting for the response of the database, so
> other processes can be executed. And and and… So in the end you have to
> try it.
>
> You are correct. The third image can only jump in if both the others are
> in a wait state. It "feels" as if there was enough air for a third one to
> operate, but we'd have to try whether that holds true.
>
> What's not to underestimate is all the stuff around monitoring and
> restarting images when things go wrong, but that's another story...
>
> Docker has a restart policy, so restarting shouldn't be an issue with it.
> Monitoring is always hard. I use Prometheus with Grafana, but that is
> quite a bit to set up. In the end you get graphs and you can define alerts
> for system value thresholds.
>
> Well, that is also true for monit (which we use); the question always is:
> what do you make of those numbers? We have situations in which an image
> responds to HTTP requests as if all were good. But for some reason, DB2
> sometimes takes forever to answer queries, and will probably answer with a
> "cannot handle requests at this time" after literally a minute or so.
> Other DB connections work well in parallel. We're still looking for ways
> to recognize such situations externally (and are thinking about moving
> from DB2 to PostgreSQL).
>
> If the topic gets accepted, Marcus and I will talk about these things at
> ESUG.
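The IO-wait argument above can be put into a back-of-envelope calculation. The numbers here are invented purely for illustration; the point is only that when wall-clock time is dominated by DB/REST waits, one core can overlap many in-flight requests:

```smalltalk
| total cpu concurrentPerCore |
total := 100.    "ms wall-clock per request (mostly DB2/REST waiting)"
cpu   := 5.      "ms of actual CPU work per request"
concurrentPerCore := total // cpu.    "about 20 overlapped requests per core"
Transcript
    showln: 'Two cores, two images: roughly ',
        (concurrentPerCore * 2) printString,
        ' in-flight requests before the CPU saturates'.
```

With assumptions like these, two images on two cores handling hundreds of mostly-waiting requests is at least plausible, though as Norbert says, in the end you have to measure it.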
> So if anybody from the program committee is reading this: please accept
> and schedule Norbert's and Marcus' talk. I'll be hanging on their every
> word, and I guess I won't be alone ;-)
>
> Joachim
>
> Norbert
>
> --
> -----------------------------------------------------------------------
> Objektfabrik Joachim Tuchel          mailto:jtuc...@objektfabrik.de
> Fliederweg 1                         http://www.objektfabrik.de
> D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
> Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1