Whoa! Thanks for the careful and insightful response, I really appreciate
that! :)

On Thu, Dec 15, 2016 at 12:00 PM, Sven Van Caekenberghe <s...@stfx.eu>
wrote:

> Joachim,
>
> > On 15 Dec 2016, at 11:43, jtuc...@objektfabrik.de wrote:
> >
> > Victor,
> >
> > On 14.12.16 at 19:23, Vitor Medina Cruz wrote:
> >> If I tell you that my current estimate is that a Smalltalk image with
> Seaside will not be able to handle more than 20 concurrent users, in many
> cases even less.
> >>
> >> Seriously? That is kind of a low number; I would expect more for each
> image. Certainly it depends a lot on many things, but it is certainly very
> low for a rough estimate, so why do you say that?
> >
> > seriously, I think 20 is very optimistic for several reasons.
> >
> > One, you want to be fast and responsive for every single user, so there
> is absolutely no point in going too close to any limit. It's easy to lose
> users by providing a bad experience.
> >
> > Second, in a CRUD application, you mostly work a lot with DB queries.
> And you connect to all kinds of stuff and do I/O. Some of these things
> simply block the VM. Even if that is only for 0.3 seconds, you postpone
> processing for each "unaffected" user by these 0.3 seconds, so this adds up
> to significant delays in response time. And if you do some heavy DB
> operations, 0.3 seconds is not a terribly bad estimate. Add to that the
> materialization and stuff within the Smalltalk image.
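> >
> > As a small illustration of that effect (a busy computation stands in here
> > for any call that gives other processes no chance to run; the 300 ms and
> > the Transcript message are just for the example):
> >
> > | t0 |
> > t0 := Time millisecondClockValue.
> > "a second user's request, forked as a green thread at the same priority"
> > [ Transcript
> >     show: 'second request waited about ' ,
> >         (Time millisecondClockValue - t0) printString , ' ms';
> >     cr ] fork.
> > "simulate ~300 ms of work that never yields (a blocking call, heavy computation)"
> > [ Time millisecondClockValue - t0 < 300 ] whileTrue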
> >
> > Seaside adapters usually start off green threads for each request. But
> there are things that need to be serialized (like inside a critical: block).
> So in reality, users block each other way more often than you'd like.
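> >
> > A minimal sketch of such a serialization point (the Mutex, the names and
> > the 300 ms delay are just for illustration): concurrent request processes
> > queue up behind each other at the critical: block.
> >
> > | lock log |
> > lock := Mutex new.
> > log := OrderedCollection new.
> > 1 to: 3 do: [ :i |
> >     [ lock critical: [
> >         "stands in for a slow, serialized operation such as a DB round trip"
> >         (Delay forMilliseconds: 300) wait.
> >         log add: i ] ] forkNamed: 'request-' , i printString ]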
> >
> > So if you asked me to give a more realistic estimate, I'd correct
> myself down to a number between 5 and probably a maximum of 10 users.
> Everything else means you must use all those fancy tricks and tools people
> mention in this thread.
> > So what you absolutely need to do is start with an estimate of 5
> concurrent users per image and look for ways to distribute work among
> servers/images so that these blocking situations are kept to a minimum. If
> you find your software works much better, congratulate yourself and stack
> up new machines more slowly than initially estimated.
> >
> >
> > Before you turn around and say: Smalltalk is unsuitable for the web,
> let's take a brief look at what concurrent users really means. Concurrent
> users are users that request some processing from the server at the very
> same time (maybe within an interval of 200-400 msec). This is not the same
> as 5 people being currently logged on to the server and requesting
> something every now and then. 5 concurrent users can correspond to 20, 50,
> 100 users who are logged in at the same time.
> >
> > Then there is this sad "share all vs. share nothing" argument. In
> Seaside you keep all your objects alive (read from the db and materialized)
> between web requests. In share nothing, you read everything back from
> disc/db whenever a request comes in. This also takes time and resources
> (and possibly blocks the server for the blink of an eye or two). You
> trade RAM for CPU cycles and I/O. It is extremely hard to predict which
> works better, and I guess nobody ever made A/B tests. It's all just
> theoretical bla bla and guesses about what definitely must be better in
> one's world.
> >
> > Why do I come up with this share-everything stuff? Because it usually
> means that each user who is logged on holds onto a load of objects on the
> server side (session storage), like their user account, shopping cart,
> settings, last purchases, account information and whatnot. That's easily a
> list of a few thousand objects (even if only proxies) that take up space
> and want to be inspected by the garbage collector. So each connected user
> not only needs CPU cycles whenever they send a request to the server, but
> also uses RAM. In our case, this can easily be 5-10 MB of objects per user.
> Add to that the shadow copies that your persistence mechanism needs for
> undo and such, and all the data Seaside needs for continuations etc., and
> each logged-on user needs 15, 20 or more MB of object space. Connect ten
> users and you have 150-200 MB. That is not a problem per se, but it also
> means there is some hard limit, especially in a 32-bit world. You don't
> want your server to slow down because it cannot allocate new memory or
> can't find contiguous slots for stuff and GCs all the time.
> >
> > To sum up, I think the number of influencing factors is way too high to
> really give a good estimate. Our experience (based on our mix of
> computation and I/O) says that 5 concurrent users per image is doable
> without negative impact on other users. Some operations take so much time
> that you really need to move them out of the front-facing image and
> distribute work to backend servers. More than 5 is probably possible, but
> chances are that there are operations that will affect all users, and with
> every additional user there is a growing chance that 2 or more of them
> request the very same operation within a very short interval. This will
> make things worse and worse.
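> >
> > A tiny in-image sketch of taking slow work off the request path (a
> > simplified stand-in for handing it to a backend image/server; the
> > SharedQueue and the one-second job are just for illustration): the
> > request-handling process only enqueues a job, a background process does
> > the slow part.
> >
> > | queue worker |
> > queue := SharedQueue new.
> > "background worker: picks up jobs and runs them outside the request path"
> > worker := [ [ queue next value ] repeat ]
> >     forkAt: Processor userBackgroundPriority.
> > "inside a request handler: enqueue the slow job instead of running it inline"
> > queue nextPut: [ (Delay forSeconds: 1) wait "e.g. a heavy report" ]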
> >
> > So I trust in you guys having lots of cool tools around and knowing
> loads of tricks to wring much more power out of a single Smalltalk image,
> but you also need to take a look at your productivity and speed in creating
> new features and fixing bugs. Sometimes throwing hardware at a problem like
> growth, and starting with a clever architecture that scales on multiple
> layers, is just the perfect thing to do. To me, handling 7 instead of 5
> concurrent users is not such a big win as long as we are not in a position
> where we have so many users that this really matters. For sites like
> Amazon, Google, Facebook etc. saving 40% in server cost by optimizing the
> software (investing a few man-years) is significant. I hope we'll soon
> change our mind about this question ;-)
> >
> > So load balancing and services outsourced to backend servers are key to
> scalability. This, btw, is not Smalltalk-specific (some people seem to
> think you won't get these problems in Java or Ruby because they are made
> for the web...).
> >
> > Joachim
>
> Everything you say, all your considerations, especially the last paragraph,
> is/are correct and I agree.
>
> But some people will only remember the very low number you seem to be
> suggesting (which is more of a worst-case scenario, with
> Seaside+blocking/slow connections to back-end systems).
>
> On the other hand, plain HTTP access to a Pharo image can be quite fast.
> Here is a quick & dirty benchmark I just did on one of our modern/big
> machines (inside an LXD container, light load) using a single stock image
> on Linux.
>
>
> $ pharo Pharo.image printVersion
> [version] 4.0 #40626
>
> $ pharo Pharo.image eval 'ZnServer startDefaultOn: 1701. 1 hour wait' &
>
> $ ab -k -c 8 -n 10240 http://127.0.0.1:1701/bytes/32
> This is ApacheBench, Version 2.3 <$Revision: 1638069 $>
> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
> Licensed to The Apache Software Foundation, http://www.apache.org/
>
> Benchmarking 127.0.0.1 (be patient)
> Completed 1024 requests
> Completed 2048 requests
> Completed 3072 requests
> Completed 4096 requests
> Completed 5120 requests
> Completed 6144 requests
> Completed 7168 requests
> Completed 8192 requests
> Completed 9216 requests
> Completed 10240 requests
> Finished 10240 requests
>
>
> Server Software:        Zinc
> Server Hostname:        127.0.0.1
> Server Port:            1701
>
> Document Path:          /bytes/32
> Document Length:        32 bytes
>
> Concurrency Level:      8
> Time taken for tests:   1.945 seconds
> Complete requests:      10240
> Failed requests:        0
> Keep-Alive requests:    10240
> Total transferred:      2109440 bytes
> HTML transferred:       327680 bytes
> Requests per second:    5265.17 [#/sec] (mean)
> Time per request:       1.519 [ms] (mean)
> Time per request:       0.190 [ms] (mean, across all concurrent requests)
> Transfer rate:          1059.20 [Kbytes/sec] received
>
> Connection Times (ms)
>               min  mean[+/-sd] median   max
> Connect:        0    0   0.0      0       2
> Processing:     0    2   8.0      2     309
> Waiting:        0    1   8.0      1     309
> Total:          0    2   8.0      2     309
>
> Percentage of the requests served within a certain time (ms)
>   50%      2
>   66%      2
>   75%      2
>   80%      2
>   90%      2
>   95%      3
>   98%      3
>   99%      3
>  100%    309 (longest request)
>
>
> More than 5K req/s (10K requests, 8 concurrent clients).
>
> Granted, this is only a 32-byte payload over the loopback network
> interface. But this is the other end of the interval, the maximum speed.
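>
> As an aside: /bytes/32 and /dw-bench are test handlers that come with
> Zinc's default server delegate. A minimal sketch of serving your own
> content instead (the 'Hello from Pharo' text is just a placeholder):
>
> (ZnServer startDefaultOn: 1701)
>     onRequestRespond: [ :request |
>         ZnResponse ok: (ZnEntity text: 'Hello from Pharo') ].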
>
> A more realistic payload (7K HTML) gives the following:
>
>
> $ ab -k -c 8 -n 10240 http://127.0.0.1:1701/dw-bench
> This is ApacheBench, Version 2.3 <$Revision: 1638069 $>
> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
> Licensed to The Apache Software Foundation, http://www.apache.org/
>
> Benchmarking 127.0.0.1 (be patient)
> Completed 1024 requests
> Completed 2048 requests
> Completed 3072 requests
> Completed 4096 requests
> Completed 5120 requests
> Completed 6144 requests
> Completed 7168 requests
> Completed 8192 requests
> Completed 9216 requests
> Completed 10240 requests
> Finished 10240 requests
>
>
> Server Software:        Zinc
> Server Hostname:        127.0.0.1
> Server Port:            1701
>
> Document Path:          /dw-bench
> Document Length:        7734 bytes
>
> Concurrency Level:      8
> Time taken for tests:   7.874 seconds
> Complete requests:      10240
> Failed requests:        0
> Keep-Alive requests:    10240
> Total transferred:      80988160 bytes
> HTML transferred:       79196160 bytes
> Requests per second:    1300.46 [#/sec] (mean)
> Time per request:       6.152 [ms] (mean)
> Time per request:       0.769 [ms] (mean, across all concurrent requests)
> Transfer rate:          10044.25 [Kbytes/sec] received
>
> Connection Times (ms)
>               min  mean[+/-sd] median   max
> Connect:        0    0   0.0      0       0
> Processing:     1    6 183.4      1    7874
> Waiting:        1    6 183.4      1    7874
> Total:          1    6 183.4      1    7874
>
> Percentage of the requests served within a certain time (ms)
>   50%      1
>   66%      1
>   75%      1
>   80%      1
>   90%      1
>   95%      1
>   98%      1
>   99%      1
>  100%   7874 (longest request)
>
>
> That is more than 1K req/s.
>
> In both cases we are talking about sub-1 ms req/resp cycles!
>
> I think all commercial users of Pharo today know what is possible and what
> needs to be done to achieve their goals. Pure speed might not be the main
> consideration; the ease/speed/joy of development, and just being capable of
> solving complex problems and offering compelling solutions to end users, is
> probably more important.
>
> Sven
>
>
>
>
