Hi Frits,
On Mar 3, 2008, at 1:37 AM, Frits Jalvingh wrote:
The latter. This is not configurable in Cayenne. I strongly suspect
that PreparedStatement caching is never a bottleneck in the case of
update (and in any event, if we issue a set of different UPDATE
queries for each entity, each variation can be cached just as well).
If somebody can prove me wrong with some profiling data, we can
reconsider the update approach.
Ok. It will be a bottleneck, though, for tasks doing lots of repetitive
work: preparing a statement in Oracle easily takes 100x (!!) as much
time as executing it. And sending lots of different statements to the
DataSource's cache does not help either: it caches only some of them to
avoid open-cursor trouble, so if you send it too many, the caching stops
being effective. Finally, sending lots of distinct statements to Oracle
means the database's SQL cache (the SGA), which holds the cached
execution plans, will lose efficiency. This adds up to quite large costs
when the application is used by many people.
I may actually try it myself one day. The JDBC part of such a test
should be fairly easy to recreate. I think this also depends on the
database. As for very large updates within a single transaction, I
don't think the situation is that bad. Cayenne uses JDBC batching
(which happens to make the most difference on Oracle), and there is a
fixed number of query permutations. Say you have 1000 Artist objects,
with 500 artists having their name updated and 500 their dateOfBirth;
then we have only two PreparedStatement batches to execute. Also I
suspect Oracle will behave differently from other DBs.
Anyways, this requires research, and I am glad you brought it up. In
fact I wish that your strategy of a full-row update would end up being
faster across the board, as this would simplify the update code
significantly. Although with large data sets other factors will come
into play, such as network latency, etc.
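As a rough illustration (not Cayenne's actual code), the JDBC side of
those two batches would look something like this, assuming an open
Connection 'con' and made-up Artist accessors:

    // One PreparedStatement per column permutation, each reused for its batch.
    PreparedStatement nameBatch = con.prepareStatement(
            "UPDATE ARTIST SET ARTIST_NAME = ? WHERE ARTIST_ID = ?");
    for (Artist a : artistsWithNameChange) {      // ~500 objects, accessors are made up
        nameBatch.setString(1, a.getArtistName());
        nameBatch.setLong(2, a.getArtistId());
        nameBatch.addBatch();
    }
    nameBatch.executeBatch();                     // executes the whole batch

    PreparedStatement dobBatch = con.prepareStatement(
            "UPDATE ARTIST SET DATE_OF_BIRTH = ? WHERE ARTIST_ID = ?");
    for (Artist a : artistsWithDobChange) {       // the other ~500 objects
        dobBatch.setDate(1, a.getDateOfBirth());  // assumed to return java.sql.Date
        dobBatch.setLong(2, a.getArtistId());
        dobBatch.addBatch();
    }
    dobBatch.executeBatch();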
This means I would not use Cayenne for any kind of work that handles
large numbers of objects; something that is not advisable anyway for the
current generation of ORMs.
As Ari and Malcolm pointed out, this is too general of a statement.
And as you pointed out, many scenarios can be optimized within the ORM
paradigm ;-)
Crystal clear ;-) This means you have no "previous" state in memory to
restore the objects to a pristine state?
Not true. The previously committed state is available. The algorithm
just follows a cheaper path during rollback.
Ok. Just to be clear: would commitChanges() commit all of the changes
including the changes of all parents into the database?
Yes.
You can achieve that with JPA-like lifecycle callbacks (that are
available in Cayenne outside JPA) - POST_UPDATE, POST_PERSIST.
By "refreshing" the objects in such a callback? That might be doable
then; I would like a declarative way better though because I would
need
to code every separate record's generated fields into such a
handler...
The Cayenne mapping *API* allows tagging arbitrary DB-generated columns
as such, but since the backend only supports PK retrieval via
'getGeneratedKeys', the Modeler gives the impression that this is only
supported for a PK. So a single generic catch-all callback should be
possible (in fact I'd think that we should do it on the Cayenne end,
even if this is solved as UPDATE/SELECT).
Hmm... does it get passed over JDBC back to the client, and if so, do
you have a JDBC example? (I know that even the standard callback via
'Statement.getGeneratedKeys()' still doesn't work with many drivers).
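(For reference, the standard pattern in question is roughly the
following; whether it actually works depends entirely on the driver, and
the table here is made up:

    // Standard, but not universally supported, generated-keys retrieval;
    // 'con' is assumed to be an open java.sql.Connection.
    PreparedStatement ps = con.prepareStatement(
            "INSERT INTO ARTIST (ARTIST_NAME) VALUES (?)",
            Statement.RETURN_GENERATED_KEYS);
    ps.setString(1, "Dali");
    ps.executeUpdate();

    ResultSet keys = ps.getGeneratedKeys();
    if (keys.next()) {
        long generatedId = keys.getLong(1);   // PK assigned by the database
    }
)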
Sure, I used it in a proof-of-concept ORM that I wrote to check whether
it was possible to build something more reasonable than Hibernate or
JPA. It does depend on the database, but the ones I use (Oracle,
PostgreSQL) both support it, albeit in slightly different ways.
For Oracle you need to use a CallableStatement instead of a
PreparedStatement and create a statement like:

  begin
    update xxx set ..... where ....
      returning [columnname], [columnname]... into ?, ?...;
  end;

Before calling CallableStatement.execute() you must call
CallableStatement.registerOutParameter() with the SQL type of each
returned column.
After executing you can retrieve the values using
CallableStatement.getXXX(). This is very fast for Oracle; way faster
than Hibernate's "update-then-reselect". I also used it to support
native Oracle primary keys using sequences; you can generate inserts the
same way, using the sequence in the values list, like:

  begin
    insert into xxx(id, a, b...) values(my_sequence.nextval, ?, ?, ...)
      returning id into ?;
  end;

which is again faster than selecting the sequence in a separate
statement.
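To make the Oracle update case concrete, a minimal sketch (the table,
columns and the LAST_MODIFIED column are invented for illustration, and
'con' is an open java.sql.Connection to Oracle):

    // Update one row and read a DB-maintained column back in the same round trip.
    CallableStatement cs = con.prepareCall(
            "begin update ARTIST set ARTIST_NAME = ? where ARTIST_ID = ? "
            + "returning LAST_MODIFIED into ?; end;");
    cs.setString(1, "Picasso");
    cs.setLong(2, 33L);
    cs.registerOutParameter(3, java.sql.Types.TIMESTAMP);  // type of the returned column
    cs.execute();
    java.sql.Timestamp lastModified = cs.getTimestamp(3);  // DB-generated value, no reselect
    cs.close();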
PostgreSQL supports something very similar, but it uses a plain
PreparedStatement, no begin/end, and to retrieve the data you call
getResultSet() on the PreparedStatement; the ResultSet contains the data
from the returning clause. I did not check the performance of this,
although the PostgreSQL documentation leads me to believe it is a single
roundtrip.
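The same made-up example, roughly, on PostgreSQL:

    // Sketch only: same invented table as above, 'con' is an open Connection to PostgreSQL.
    PreparedStatement ps = con.prepareStatement(
            "update ARTIST set ARTIST_NAME = ? where ARTIST_ID = ? returning LAST_MODIFIED");
    ps.setString(1, "Picasso");
    ps.setLong(2, 33L);
    ps.execute();                             // the RETURNING clause produces a result set
    ResultSet rs = ps.getResultSet();
    if (rs.next()) {
        java.sql.Timestamp lastModified = rs.getTimestamp(1);  // value from the RETURNING clause
    }
    ps.close();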
The sad thing is that this makes the database adapter classes rather
complex, because lots of things need to be delegated there to handle
these quite different ways of supporting native sequences and quick
returns of updated data. The advantage is that performance is way
better.
All of this, plus using bytecode instead of reflection, makes my
proof-of-concept code easily outperform Hibernate. If only completing
(and maintaining!!) it weren't such a lot of work ;-)....
If you need a concrete example I can cook one up.
I got the idea. May actually be worth exploring.
Andrus