On 17-Aug-05, at 12:40 PM, Thomas Hallgren wrote:
Andrew Dunstan wrote:
Dave Cramer wrote:
As there are two Java procedural languages available
for PostgreSQL, Josh asked for an explanation of their
differences.
They are quite similar in that both of them run the function in
a Java VM, and both expect pre-compiled code. Neither attempts to
compile the code.
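To make "pre-compiled" concrete, here is a minimal sketch of what the body of such a function might look like: an ordinary static method, compiled with javac ahead of time and then mapped to an SQL function by class and method name. The class name Greeter and the method greet are made up for illustration and do not come from either project.

```java
// Hypothetical example class; neither PL/Java nor PL-J ships anything
// called Greeter. The point is only that the function body is plain,
// already-compiled Java that the language handler looks up and invokes.
final class Greeter {

    private Greeter() {
        // no instances; the function is exposed as a static method
    }

    // A NULL SQL argument is assumed to arrive as a Java null here.
    public static String greet(String name) {
        return "Hello, " + (name == null ? "world" : name);
    }
}
```

Whether the method is then invoked through JNI or over a network protocol is invisible to the method itself, which is why the two projects can look so similar from the function author's side.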
The biggest difference is how they connect to the Java VM.
PL/Java uses the Java Native Interface (JNI) and calls directly
into the Java VM from the language handler.
PL-J uses a network protocol to connect to a Java VM.
There are advantages and disadvantages to both approaches.
+ JNI is simpler; it doesn't require a protocol or an application
container to manage the user-defined functions.
- JNI requires that the VM runs on the server machine, and that a
separate VM be instantiated for every connection that calls a
function.
This is mitigated somewhat in Java 1.5 by sharing data;
however, this may or may not be a Sun-only feature (does anyone
know?).
Either way, a separate VM is required for each connection.
- Startup time for the VM on the first call for the connection.
- It is possible (though not as likely any more) for the Java VM to
take the server down.
Using a network protocol, as PL-J does, has the following
(basically the opposite of the JNI (dis)advantages):
+ The Java VM does not have to run on the server.
+ Only one VM per server.
- More complex; requires a micro-kernel application server to
manage the UDFs, currently Loom: http://loom.codehaus.org/
I think Dave misses a couple of important points.
1. Speed. One major reason for moving code from the middle tier
down to the database is that you want to execute the code close to
the actual persistence mechanisms in order to minimize network
traffic and maximize throughput.
I think until there are actual benchmarks, there are too many
variables here to suggest one is faster than the other. The overhead
of having multiple Java VMs is not easily estimated. Even with a
connection pool, consider the memory footprint of even 10 Java VMs.
2. A growing percentage of db-clients utilize some kind of
connection pool (an overwhelming number of the Java clients certainly
do), which minimizes the problem with startup times.
3. Transaction visibility. A function that in turn issues new SQL
calls must do that within the scope of the caller's transaction. A
remote process must hence call back into its caller. PL/Java has
its own JDBC driver that interacts directly with SPI.
PL-J maintains transaction visibility; it has its own JDBC driver as
well. The protocol between the language handler and the Java portion
is based upon the FE/BE protocol, which made it easy to use pg's JDBC
driver with some modification.
4. Isolation. Using separate VMs, instabilities in the VM can only
affect one single connection. One VM can be debugged or monitored
without affecting the others. No data can be inadvertently moved
between connections, etc.
Loom deals with data integrity; debugging would have to be done through
a remote debug connection, which can attach to any thread.
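On point 3 above, the following is a rough sketch of a function that issues SQL inside the caller's transaction. It assumes PL/Java's documented default connection URL, jdbc:default:connection; PL-J's equivalent may differ. The class OrderStats and the table orders are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example; the "orders" table is an assumption.
final class OrderStats {

    private OrderStats() {
    }

    // Runs the query on whatever connection it is handed, so the logic
    // does not care how the connection reaches the backend.
    static long countOrders(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT count(*) FROM orders")) {
            return rs.next() ? rs.getLong(1) : 0L;
        }
    }

    // Entry point as it might be mapped to an SQL function under PL/Java.
    // Assumption: "jdbc:default:connection" resolves to the driver that
    // talks to the backend session that invoked the function. The
    // connection is deliberately not closed here, since it is not ours.
    public static long countOrdersInCallerTx() throws SQLException {
        Connection conn = DriverManager.getConnection("jdbc:default:connection");
        return countOrders(conn);
    }
}
```

Rows inserted earlier in the calling transaction, but not yet committed, should be counted by this query under either design; what differs is the path the query takes to get back to the backend.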
I try to shed more light on the pros and cons here:
http://gborg.postgresql.org/project/pljava/genpage.php?jni_rationale
That's a pretty good explanation and ought to be published more
widely. It's almost a pity that we couldn't have one project with
a server setting saying how we want it to run.
There are a couple of reasons that make me a bit reluctant to join
the projects:
PL/Java has no dependencies at all besides a Java Runtime
Environment (or GCJ). PL/J requires a fair number of other modules
just to compile.
PL-J requires one other module, which the build environment will
fetch automatically to compile.
PL/Java is at release 1.1 and has a community of users. To my
knowledge, PL/J has not reached its first release yet.
PL/Java and PL/J use completely different approaches and share
almost no code. The code that we do share (public interfaces, mainly
for trigger management) is published at the Maven repository at
ibiblio.org.
I think it's better to keep the two projects separate. But I also
think that it is extremely important that we ensure that the user
experience is similar for both projects so that there's nothing to
prevent a server setting that decides which one to use provided
both are present.
Kind regards,
Thomas Hallgren