Re: [HACKERS] Shared memory

Dave Cramer Tue, 28 Mar 2006 10:26:23 -0800


On 28-Mar-06, at 10:48 AM, Thomas Hallgren wrote:

Hi Simon,
Thanks for your input. All good points. I actually did some work using Java stored procedures on DB2 a while back but I had managed to forget (or repress :-) ) all about the FENCED/NOT FENCED stuff. The current discussion definitely puts it in a different perspective. I think PL/Java has a pretty good 'NOT FENCED' implementation, as do many other PLs, but no PL has yet come up with a FENCED solution.

What exactly is a FENCED solution? If it is simply a remote connection to a single JVM then pl-j already does that.

This FENCED/NOT FENCED terminology would be a good way to differentiate between the two approaches. Any chance of that syntax making it into the PostgreSQL grammar, should the need arise?
Some more comments inline:

Simon Riggs wrote:
Just some thoughts from afar: DB2 supports in-process and out-of-process external function calls (UDFs) that it refers to as UNFENCED and FENCED
procedures. For Java only, IBM have moved to supporting *only* FENCED
procedures for Java functions, i.e. having a single JVM for all
connections.
>
Are you sure about this? As I recall it, a FENCED stored procedure executed in a remote JVM of its own. A parameter could be used that either caused a new JVM to be instantiated for each stored procedure call or to be kept for the duration of the session. The former would yield really horrible performance but keep memory utilization at a minimum. The latter would get a more acceptable performance but waste more memory (on par with PL/Java today).
Each connection's Java function runs as a thread on a
single dedicated JVM-only process.
If that were true, then different threads could share dirty session data. I wanted to do that using DB2 but found it impossible. That was a while back though.
That approach definitely does increase the invocation time, but it
significantly reduces the resources associated with the JVM, as well as
allowing memory management to be more controllable (bliss...). So the
overall picture could be more CPU and memory resources for each
connection in the connection pool.
My very crude measurements indicate that the overhead of using a separate JVM is between 6 and 15MB of real memory per connection. Today, you get about 10MB/$ and servers configured with 4GB RAM or more are not uncommon.
I'm not saying that the overhead doesn't matter. Of course it does. But the time when you needed to be extremely conservative with memory usage has passed. It might be far less expensive to buy some extra memory than to invest in SMP architectures to minimize IPC overhead.
My point is, even fairly large app-servers (using connection pools with up to 200 simultaneous connections) can run using relatively inexpensive boxes such as an AMD64-based server with 4GB RAM and show very good throughput with the current implementation.
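As a rough back-of-the-envelope check of those figures, here is a minimal sketch. The class and method names are hypothetical (not part of PL/Java); the inputs are just the 6 to 15MB per-connection overhead and the 200-connection pool mentioned above.

```java
// Hypothetical helper, not part of PL/Java: estimates the extra real memory
// needed when every pooled connection carries its own JVM.
public class JvmOverheadEstimate {

    /** Total JVM overhead in MB for a pool of the given size. */
    public static long totalMb(int connections, int perConnectionMb) {
        return (long) connections * perConnectionMb;
    }

    public static void main(String[] args) {
        int connections = 200;                  // pool size quoted in the thread
        long low = totalMb(connections, 6);     // lower bound of the crude measurement
        long high = totalMb(connections, 15);   // upper bound of the crude measurement
        System.out.println("Estimated JVM overhead: " + low + " MB to " + high + " MB");
    }
}
```

With those inputs the estimate is 1200 MB to 3000 MB, which is within reach of a 4GB box.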
If you have a few small Java functions centralisation would not be good, but if you have a whole application architecture with many connections
executing reasonable chunks of code then this can be a win.
One thing to remember is that a 'chunk of code' that executes in a remote JVM and uses JDBC will be hit by the IPC overhead on each interaction over the JDBC connection. I.e. the overhead is not just limited to the actual call of the UDF, it's also imposed on all database accesses that the UDF makes in turn.
In that environment we used Java for major database functions, with SQL
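To make that concrete, here is a minimal sketch of the cost model, assuming one IPC round trip for the UDF call itself plus one per JDBC interaction it makes. The class name, method name, and the 50 microsecond round-trip figure are placeholders for illustration, not measurements.

```java
// Hypothetical cost model, not a measurement: a UDF running in a remote JVM
// pays one IPC round trip for the call plus one per JDBC interaction.
public class IpcOverheadModel {

    /** Total IPC overhead in microseconds for one UDF invocation. */
    public static long remoteOverheadMicros(int jdbcInteractions, long ipcRoundTripMicros) {
        return (1L + jdbcInteractions) * ipcRoundTripMicros;
    }

    public static void main(String[] args) {
        long roundTrip = 50; // placeholder value, assumed for illustration only
        for (int n : new int[] {0, 10, 100}) {
            System.out.println(n + " JDBC interactions: "
                    + remoteOverheadMicros(n, roundTrip) + " us of IPC overhead");
        }
    }
}
```

Under this model the overhead grows linearly with the number of JDBC interactions, so a function that does a lot of JDBC pays far more than the single call overhead.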
functions for small extensions.
My guess is that those major database functions did a fair amount of JDBC. Am I right?
Also the Java invocation time we should be celebrating is that by having Java in the database the Java<->DB time is much less than it would be if
we had a Java stack sitting on another server.
I think the cases when you have a Tomcat or JBoss sitting on the same physical server as the actual database are very common. One major reason being that you don't want network overhead between the middle tier and the backend. Moving logic into the database instead of keeping it in the middle tier is often done to get rid of the last hurdle, the overhead of IPC.
Regards,
Thomas Hallgren
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq