Artur Grabowski wrote:
Geoff Steckel <[EMAIL PROTECTED]> writes:
Any argument from experience must come from similar actual
implementations using "threads" and another model, such as multiple
processes with interprocess communication.
Sure. I'll pick up the challenge.
At work we have a server that uses around 4GB of RAM and runs on a
4-CPU machine. It serves millions of TCP connections per hour. Sharing
the memory without sharing pointer values is too inefficient, since a
large amount of the memory used is a pre-computed cache of the most
common query results. The service needs 4x4GB of RAM on the machine to
be able to reload the data efficiently without hitting disk, since
hitting the disk kills performance in critical moments and leads to
inconsistencies between the four machines that run identical instances
of this service.
Therefore:
- fork would not work because the cache would not be shared, which
would lead to too high a cache miss ratio.
- adding more RAM won't work because it would consume rack real estate
and power and cooling budget, which we can't afford.
- adding more machines will not solve the problem, for the same reasons
as RAM.
- reducing the data set will not work because we kinda like to make
lots of money, not just a little money.
- partitioning the data does not work well because it costs too much
in performance and memory consumption.
What works is threads. We've had one thread-related bug in the past
year.
Art,
It sounds like your application is pretty reasonable. The benefit of
lots of cash, the restrictions on what hardware can be used, and your
willingness to keep the project under control all make a big difference
in the cost-benefit balance.
I can think of one thing that might have made a difference: it's
possible under most unix-style OSs to share memory between processes
at a fixed address, so pointer values stored in the shared region stay
valid in every process.
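Roughly, I mean something like the sketch below. The file name, size,
and address are invented for the example, and whether a given fixed
address is actually free differs from system to system, so take it as
an illustration rather than working production code:

/*
 * Sketch: map one precomputed cache file at the same fixed address in
 * every process, so raw pointers stored inside the region stay valid
 * across processes.  Path, size, and address are made-up examples.
 */
#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define CACHE_PATH "/var/db/query_cache"
#define CACHE_ADDR ((void *)0x70000000UL)	/* agreed-on address */
#define CACHE_SIZE (64UL * 1024 * 1024)		/* 64MB for the example */

int
main(void)
{
	int fd;
	void *p;

	fd = open(CACHE_PATH, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		err(1, "open");
	if (ftruncate(fd, CACHE_SIZE) == -1)
		err(1, "ftruncate");

	/* MAP_FIXED pins the mapping at CACHE_ADDR; every cooperating
	 * process maps the same file at the same address. */
	p = mmap(CACHE_ADDR, CACHE_SIZE, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_FIXED, fd, 0);
	if (p == MAP_FAILED)
		err(1, "mmap");

	printf("cache mapped at %p\n", p);
	close(fd);
	return 0;
}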
I'm not entirely sure how much of your database stays in the cache,
except possibly some of the root, but I hope you've got the tools
to know that.
Still,
- you're pushing the envelope very hard to get as much performance as
possible, and you -need- the performance, so even a percent or two of
performance matters
- your application is SIMD-like in the large
- you've considered the tradeoffs and accept the risk for the benefits
And I infer from what you say:
- It sounds like most queries are read-only, so they do not affect any
shared state, therefore locking issues are relatively few.
- It also sounds like the application itself is relatively static
(or at least the query engine is).
- The programming team is relatively static due to large $$$ rewards.
- I'm assuming that the query engine is well separated in the code
from the code which changes due to changes in the data being served.
All of this taken together puts this into an area where I'm willing to
agree that threads are an acceptable solution, if not a desirable one.
If any of the points above were different (complex state changes,
didn't need 100+%, not read-only, not static code, many hands changing
the engine code) I'd disagree.
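For what it's worth, the read-mostly pattern I'm inferring is the one
place where threads stay simple: every worker answers queries out of
the one shared in-memory cache, taking only a shared read lock, so the
threads rarely step on each other. A minimal sketch, with the cache
contents and function names invented rather than taken from your code:

/*
 * Minimal sketch of a read-mostly threaded server: worker threads
 * answer queries out of one precomputed in-memory cache under a
 * shared read lock.  Not anyone's actual code.
 */
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NWORKERS 4			/* e.g. one per CPU */

struct cache_entry { const char *key; const char *value; };

static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;
static struct cache_entry cache[] = {	/* stands in for the real cache */
	{ "example-query", "example-answer" },
};
static size_t cache_entries = 1;

/* Query path: any number of threads at once, read lock only. */
static const char *
cache_lookup(const char *key)
{
	const char *val = NULL;

	pthread_rwlock_rdlock(&cache_lock);
	for (size_t i = 0; i < cache_entries; i++)
		if (strcmp(cache[i].key, key) == 0) {
			val = cache[i].value;
			break;
		}
	pthread_rwlock_unlock(&cache_lock);
	return val;
}

static void *
worker(void *arg)
{
	(void)arg;
	/* A real worker would accept TCP connections in a loop; here we
	 * just do one lookup to show the shared-cache access pattern. */
	printf("%s\n", cache_lookup("example-query"));
	return NULL;
}

int
main(void)
{
	pthread_t tid[NWORKERS];
	int i;

	for (i = 0; i < NWORKERS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (i = 0; i < NWORKERS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}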
On a very superficial consideration of what you've said,
I suspect I could get a multiprocess solution to come within a few percent
of the threaded one, but you say you need that last few percent. There
are a lot of possible memory architecture issues (4 x 4 GB memory gets
me wondering about its exact physical layout and bus architecture). A form
of pipelined processing might also partition well, but I don't know any
details of what you're doing. Depending very much on the exact
situation, offloading the TCP handshaking onto the processors in GBit
network cards ---might--- work - there are a lot of possible gotchas,
but ---if--- the cards are fast enough and have enough on-card memory,
the payoff could be large... of course, then the network cards would
have all the threads in them!
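To make the multiprocess idea slightly more concrete: build the cache
in a MAP_SHARED mapping and only then fork the workers. fork()
preserves the address-space layout, so the children share the cache
pages and raw pointers into the cache mean the same thing in every
worker. A minimal sketch, with the worker count and sizes invented and
no claim that it matches your code:

/*
 * Sketch of a fork-based alternative: build the cache in a shared
 * anonymous mapping, then fork one worker per CPU.  fork() preserves
 * the address-space layout, so pointers into the cache stay valid in
 * every worker.
 */
#include <sys/mman.h>
#include <sys/wait.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

#define NWORKERS   4			/* e.g. one per CPU */
#define CACHE_SIZE (64UL * 1024 * 1024)	/* example size: 64MB */

int
main(void)
{
	char *cache;
	int i;

	/* anonymous shared mapping: inherited by every child below */
	cache = mmap(NULL, CACHE_SIZE, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_ANON, -1, 0);
	if (cache == MAP_FAILED)
		err(1, "mmap");

	/* ... precompute the query cache into `cache' here ... */

	for (i = 0; i < NWORKERS; i++) {
		switch (fork()) {
		case -1:
			err(1, "fork");
		case 0:
			/* child: accept connections and answer queries
			 * straight out of the inherited cache mapping */
			printf("worker %d sees cache at %p\n", i,
			    (void *)cache);
			_exit(0);
		}
	}
	while (wait(NULL) != -1)
		;			/* reap the workers */
	return 0;
}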
Good luck, and thanks for the useful example!
geoff steckel