Hi Frits,
On Mar 3, 2008, at 1:37 AM, Frits Jalvingh wrote:
The latter. This is not configurable in Cayenne. I strongly suspect
that PreparedStatement caching is never a bottleneck in the case of
update (and in any event, if we issue a set of different UPDATE
queries for each entity, each variation can be cached just as well).
If somebody can prove me wrong with some profiling data, we can
reconsider the update approach.
Ok. It will be a bottleneck, though, for tasks doing lots of repetitive
work: preparing a statement in Oracle easily takes 100x (!!) as much
time as executing it. And sending lots of different statements to the
DataSource's cache does not help either: it caches only some of them to
avoid open-cursor trouble, so if you send it too many, the caching stops
being effective. Finally, sending lots of distinct statements to Oracle
means the database's SQL cache (the SGA), which holds the cached
execution plans, will lose efficiency. This adds up to quite large costs
when the application is used by many people.
I may actually try it myself one day. The JDBC part of such a test
should be fairly easy to recreate. I think this also depends on the
database. As for very large updates within a single transaction, I
don't think the situation is that bad. Cayenne uses JDBC batching
(which happens to make the most difference on Oracle), and there is a
fixed number of query permutations. Say you have 1000 Artist objects,
with 500 artists having their name updated and 500 their dateOfBirth;
then we have only two PreparedStatement batches to execute. Also I
suspect Oracle will behave differently from other DBs.
Anyways, this requires research, and I am glad you brought it up. In
fact I wish that your strategy of a full-row update would end up being
faster across the board, as this would simplify the update code
significantly. Although with large data sets other factors will come
into play, such as network latency, etc.
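As a rough illustration (not Cayenne's actual code), the JDBC side of
those two batches would look something like this, assuming an open
Connection 'con' and made-up Artist accessors:

    // One PreparedStatement per column permutation, each reused for its batch.
    PreparedStatement nameBatch = con.prepareStatement(
            "UPDATE ARTIST SET ARTIST_NAME = ? WHERE ARTIST_ID = ?");
    for (Artist a : artistsWithNameChange) {      // ~500 objects, accessors are made up
        nameBatch.setString(1, a.getArtistName());
        nameBatch.setLong(2, a.getArtistId());
        nameBatch.addBatch();
    }
    nameBatch.executeBatch();                     // executes the whole batch

    PreparedStatement dobBatch = con.prepareStatement(
            "UPDATE ARTIST SET DATE_OF_BIRTH = ? WHERE ARTIST_ID = ?");
    for (Artist a : artistsWithDobChange) {       // the other ~500 objects
        dobBatch.setDate(1, a.getDateOfBirth());  // assumed to return java.sql.Date
        dobBatch.setLong(2, a.getArtistId());
        dobBatch.addBatch();
    }
    dobBatch.executeBatch();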
This means I would not use Cayenne for any kind of work that handles
large numbers of objects; something that is not advisable anyway for the
current generation of ORMs.
As Ari and Malcolm pointed out, this is too general of a statement.
And as you pointed out, many scenarios can be optimized within the ORM
paradigm ;-)
Crystal clear ;-) This means you have no "previous" state in memory to
restore the objects to a pristine state?
Not true. The previously committed state is available. The algorithm
just follows a cheaper path during rollback.
Ok. Just to be clear: would commitChanges() commit all of the changes
including the changes of all parents into the database?
Yes.
You can achieve that with JPA-like lifecycle callbacks (that are
available in Cayenne outside JPA) - POST_UPDATE, POST_PERSIST.
By "refreshing" the objects in such a callback? That might be doable
then; I would like a declarative way better though because I would
need
to code every separate record's generated fields into such a
handler...
The Cayenne mapping *API* allows tagging arbitrary DB-generated columns
as such, but since the backend only supports PK retrieval via
'getGeneratedKeys', the Modeler gives the impression that this is only
supported for a PK. So a single generic catch-all callback should be
possible (in fact I'd think that we should do it on the Cayenne end,
even if this is solved as UPDATE/SELECT).
Hmm... does it get passed over JDBC back to the client, and if so, do
you have a JDBC example? (I know that even the standard callback via
'Statement.getGeneratedKeys()' still doesn't work with many drivers).
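(For reference, the standard pattern in question is roughly the
following; whether it actually works depends entirely on the driver, and
the table here is made up:

    // Standard, but not universally supported, generated-keys retrieval;
    // 'con' is assumed to be an open java.sql.Connection.
    PreparedStatement ps = con.prepareStatement(
            "INSERT INTO ARTIST (ARTIST_NAME) VALUES (?)",
            Statement.RETURN_GENERATED_KEYS);
    ps.setString(1, "Dali");
    ps.executeUpdate();

    ResultSet keys = ps.getGeneratedKeys();
    if (keys.next()) {
        long generatedId = keys.getLong(1);   // PK assigned by the database
    }
)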
Sure, I used it in a proof-of-concept ORM that I wrote to check whether
it was possible to build something more reasonable than Hibernate or
JPA. It does depend on the database, but the ones I use (Oracle,
PostgreSQL) both support it, albeit in slightly different ways.
For Oracle you need to use a CallableStatement instead of a
PreparedStatement and create a statement like:

  begin
    update xxx set ..... where ....
      returning [columnname], [columnname]... into ?, ?...;
  end;

Before calling CallableStatement.execute() you must call
CallableStatement.registerOutParameter() with the SQL type of each
returned column.
After executing you can retrieve the values using
CallableStatement.getXXX(). This is very fast for Oracle; way faster
than Hibernate's "update-then-reselect". I also used it to support
native Oracle primary keys using sequences; you can generate inserts the
same way, using the sequence in the values list, like:

  begin
    insert into xxx(id, a, b...) values(my_sequence.nextval, ?, ?, ...)
      returning id into ?;
  end;

which is again faster than selecting the sequence in a separate
statement.
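To make the Oracle update case concrete, a minimal sketch (the table,
columns and the LAST_MODIFIED column are invented for illustration, and
'con' is an open java.sql.Connection to Oracle):

    // Update one row and read a DB-maintained column back in the same round trip.
    CallableStatement cs = con.prepareCall(
            "begin update ARTIST set ARTIST_NAME = ? where ARTIST_ID = ? "
            + "returning LAST_MODIFIED into ?; end;");
    cs.setString(1, "Picasso");
    cs.setLong(2, 33L);
    cs.registerOutParameter(3, java.sql.Types.TIMESTAMP);  // type of the returned column
    cs.execute();
    java.sql.Timestamp lastModified = cs.getTimestamp(3);  // DB-generated value, no reselect
    cs.close();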
PostgreSQL supports something very similar, but it uses a plain
PreparedStatement, no begin/end, and to retrieve the data you call
getResultSet() on the PreparedStatement; the ResultSet contains the data
from the returning clause. I did not check the performance of this,
although the PostgreSQL documentation leads me to believe it is a single
roundtrip.
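The same made-up example, roughly, on PostgreSQL:

    // Sketch only: same invented table as above, 'con' is an open Connection to PostgreSQL.
    PreparedStatement ps = con.prepareStatement(
            "update ARTIST set ARTIST_NAME = ? where ARTIST_ID = ? returning LAST_MODIFIED");
    ps.setString(1, "Picasso");
    ps.setLong(2, 33L);
    ps.execute();                             // the RETURNING clause produces a result set
    ResultSet rs = ps.getResultSet();
    if (rs.next()) {
        java.sql.Timestamp lastModified = rs.getTimestamp(1);  // value from the RETURNING clause
    }
    ps.close();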
The sad thing is that this makes the database adapter classes rather
complex, because lots of things need to be delegated there to handle
these quite different ways of supporting native sequences and quick
returns of updated data. The advantage is that performance is way
better.
All of this, plus using bytecode instead of reflection, makes my
proof-of-concept code easily outperform Hibernate. If only completing
(and maintaining!!) it weren't such a lot of work ;-)....
If you need a concrete example I can cook one up.
I got the idea. May actually be worth exploring.
Andrus