Re: UUIDs

Hugi Thordarson Tue, 27 Aug 2024 04:03:13 -0700

Adding a little to this tangent on a tangent, I got the postgres driver to 
transfer/read UUID column values as binary, using:


((PgConnection)connection).setForceBinary( true );

This can also be enabled using the driver option ?prepareThreshold=-1 (as seen 
in the driver documentation at https://jdbc.postgresql.org/documentation/use/ ).
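
A minimal sketch of the URL-option route, for anyone who wants to try it. The class and method names (BinaryTransferSketch, withForcedBinary, readOneUuid) are made up for illustration. The read method assumes the PostgreSQL JDBC driver is on the classpath and a server is reachable. Whether a given query then really uses binary transfer is still up to the driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.UUID;

final class BinaryTransferSketch {

    // Appends prepareThreshold=-1 to a JDBC URL, respecting existing query parameters.
    static String withForcedBinary(String jdbcUrl) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "prepareThreshold=-1";
    }

    // Hypothetical helper: fetches a single UUID value through the modified URL.
    // Needs the PostgreSQL driver and a running server, so it is not called here.
    static UUID readOneUuid(String jdbcUrl, String user, String password) throws SQLException {
        String sql = "select '123e4567-e89b-12d3-a456-426614174000'::uuid";
        try (Connection c = DriverManager.getConnection(withForcedBinary(jdbcUrl), user, password);
             PreparedStatement ps = c.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            rs.next();
            return rs.getObject(1, UUID.class);
        }
    }
}
```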

Not recommending any of this, just reporting what I've found in case anyone is 
interested. After some rudimentary testing on my own projects it seems to work 
fine, and it should be a performance boon even if we're still doing the 
string-thing on the Cayenne side. Not using binary transfer wherever possible 
feels like a waste of resources, and apparently I'm not the first guy to 
wonder; there's some more speculation/discussion here from people more in the 
know: https://postgrespro.com/list/thread-id/2599310 .
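
For a rough feel of what the string form costs, here is a small sketch in plain Java, with no Cayenne or driver classes involved. It converts a java.util.UUID to its 16-byte form and back, next to the 36-character text form. The class name UuidBytes and its methods are hypothetical, not an existing API.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidBytes {

    // Packs the two 64-bit halves of the UUID into 16 big-endian bytes.
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Rebuilds the UUID from exactly 16 bytes produced by toBytes.
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("Expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }
}
```

The text form of a UUID is 36 characters, the packed form is 16 bytes, so the string route carries more than twice the payload per value before any parsing cost.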

Cheers,
- hugi



> On 27 Aug 2024, at 09:14, Hugi Thordarson <h...@godurkodi.is> wrote:
> 
> Ah! Interesting.
> 
> Just about every entity in all of my projects has a uniqueID attribute mapped 
> to a UUID-typed field in postgres (with type OTHER/java.util.UUID in the 
> cayenne model). It's not the PK, just an identifier used to reference the row 
> in things like URLs, REST apis and other places where you really don't want 
> to expose your serial PKs.
> 
> Always worked like a charm and I never thought much about how these values 
> were actually processed by the stack. But looking at a fetch, I'm seeing the 
> value does indeed get processed as a string by Cayenne and is converted from 
> the string representation to a UUID.
> 
> However, doing some more testing on the UUID field using plain JDBC shows 
> that the postgres driver itself also seems to fetch/process the UUID as a 
> string. If you invoke ResultSet.getObject( ... ) the postgres JDBC driver 
> will go here:
> 
> https://github.com/pgjdbc/pgjdbc/blob/e33be5c0481c22f4242a5d7ef2d2c09c8a17179f/pgjdbc/src/main/java/org/postgresql/jdbc/PgResultSet.java#L270-L275
> 
> … and at least in my case, this always fails the isBinary() check and 
> proceeds to getString(...) . Now wondering how to get the JDBC driver to 
> fetch/process the UUID column as a binary value (which, judging from that 
> driver method *should* be doable), saving some bytes along the way.
> 
> Might take a second look tonight. But on the other hand, I think I'm deeper 
> into JDBC than ever before at this point and not sure I'm on the right path 
> so I should probably be stopped. But this is fun :).
> 
> Cheers,
> - hugi
> 
> 
> 
> 
>> On 26 Aug 2024, at 19:05, Andrus Adamchik <aadamc...@gmail.com> wrote:
>> 
>> And another hole in Cayenne UUID support... We have this:
>> 
>> UUIDValueType implements ValueObjectType<UUID, String>
>> 
>> but not this that is required to handle UUID mapping to binary columns:
>> 
>> UUIDValueType implements ValueObjectType<UUID, byte[]>
>> 
>> I am going to write the latter for my own needs, and will try to fold it 
>> back to Cayenne.
>> 
>> A.
>> 
>> 
>>> On Aug 26, 2024, at 9:30 AM, Andrus Adamchik <aadamc...@gmail.com> wrote:
>>> 
>>> Yep. The IDUtil-returned sequence is not an RFC-compliant UUID. It is kind 
>>> of our own invention. We can change it to a formal UUID. Though Java still 
>>> doesn't support UUIDv7, which is a bummer. Wonder how easy it is to write a 
>>> UUIDv7 generator on our own?
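
On the question of writing a UUIDv7 generator by hand, a rough sketch follows. It uses the RFC 9562 bit layout: a 48-bit Unix timestamp in milliseconds, the 4-bit version, 12 random bits, the 2-bit variant, then 62 random bits. The class name UuidV7Sketch is hypothetical and this is not Cayenne code. It makes no attempt to keep values monotonic within the same millisecond.

```java
import java.security.SecureRandom;
import java.util.UUID;

final class UuidV7Sketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a version 7 UUID for the given Unix timestamp in milliseconds.
    static UUID generate(long epochMillis) {
        long randA = RANDOM.nextInt(1 << 12);                  // 12 random bits
        long msb = (epochMillis & 0xFFFFFFFFFFFFL) << 16       // 48-bit timestamp
                | 0x7000L                                      // version 7
                | randA;
        long lsb = (RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL)   // 62 random bits
                | 0x8000000000000000L;                         // variant bits "10"
        return new UUID(msb, lsb);
    }

    // Convenience overload using the current wall-clock time.
    static UUID generate() {
        return generate(System.currentTimeMillis());
    }
}
```

Because the timestamp occupies the most significant bits, values generated in later milliseconds sort after earlier ones, which is the property that makes this version friendlier to indexes.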
>>> 
>>> While we are on this topic, my pet peeve about PK generation is the opaque 
>>> "Cayenne-Generated" strategy in the Modeler. Its original motivation was to 
>>> dynamically provide an optimal strategy for a specific database, 
>>> considering widely differing DB capabilities. Now all databases can do 
>>> everything, so this strategy is just confusing. It should be expanded into 
>>> a list of specific strategies (PK table, PK procedure, PK sequence, UUID). 
>>> Each one can have its own implementation per DbAdapter.
>>> 
>>> Andrus
>>> 
>>>> On Aug 26, 2024, at 7:24 AM, Jurgen Doll <jur...@ivoryemr.co.za> wrote:
>>>> 
>>>> Hi Michael
>>>> 
>>>> So Cayenne actually currently does support generating UUID PKs, if in the 
>>>> Cayenne Modeler you:
>>>> 
>>>> 1. set your column type to BINARY
>>>> 2. set it to a length of 16
>>>> 3. check the PK flag, and
>>>> 4. set the table's "PK Generation Strategy" to Cayenne
>>>> 
>>>> This will result in a UUID being generated via 
>>>> "org.apache.cayenne.util.IDUtil.pseudoUniqueSecureByteSequence(int)".
>>>> 
>>>> Unfortunately this UUID is currently an MD5 digest, which is bad for 
>>>> indexing.
>>>> 
>>>> The reason for the digest is to anonymise the underlying UUID which is 16 
>>>> bytes long consisting of:
>>>> bytes 0..3 - incrementing #
>>>> bytes 4..11 - timestamp
>>>> bytes 12..15 - IP address
>>>> 
>>>> The above UUID generation could easily be changed to use Java's native 
>>>> UUID which is a Time-Based UUID that would be index friendly, if I'm not 
>>>> mistaken.
>>>> 
>>>> Regards
>>>> Jurgen
>>>> 
>>>> 
>>>> On Sat, 24 Aug 2024 04:42:39 +0200, Michael Gentry <blackn...@gmail.com> 
>>>> wrote:
>>>> 
>>>>> Hi Andrus,
>>>>> 
>>>>> Part of what I meant by adding UUID support to Cayenne was to include UUID
>>>>> as a PK mechanism in Cayenne modeler and provide a corresponding PK
>>>>> generator class. Nothing currently stops you from manually setting a UUID
>>>>> yourself, but including support in the modeler would be a more natural
>>>>> fit, I think.
>>>>> 
>>>>> Thanks,
>>>>> mrg
>>>>> 
>>>>> 
>>>>> On Fri, Aug 23, 2024 at 4:33 PM Andrus Adamchik <aadamc...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>>> I am actually glad we went on a tangent and started discussing UUIDs. I
>>>>>> just ran into a use-case of an idempotent PUT API endpoint that takes a
>>>>>> mix of new and existing objects, and there's no natural key in the entity to
>>>>>> check whether new (PK-less) objects are already in DB (so that we UPDATE
>>>>>> them instead of INSERT). UUID would come in handy in this situation :)
>>>>>> 
>>>>>> (FWIW, the endpoint is running on Agrest with Cayenne underneath, and
>>>>>> Agrest is the layer that ensures idempotent semantics).
>>>>>> 
>>>>>> Andrus
>>>>>> 
>>>>>> 
>>>>>>> On Aug 20, 2024, at 12:01 PM, Hugi Thordarson <h...@godurkodi.is> wrote:
>>>>>>> 
>>>>>>> Judging from some very, very basic experimentation, Cayenne seems to do
>>>>>>> fine with UUID PKs.
>>>>>>> 
>>>>>>> Db generated UUIDs really just work like serial integers with a
>>>>>>> different generated value type:
>>>>>>> 
>>>>>>> https://github.com/hugithordarson/xx-c42/blob/main/src/main/java/family/MainUUIDDbGenerated.java
>>>>>>> 
>>>>>>> …and the fun stuff, app generated UUID PKs (for all your cross- back-
>>>>>>> and forth-referencing insertion needs) look fine as well:
>>>>>>> 
>>>>>>> https://github.com/hugithordarson/xx-c42/blob/main/src/main/java/family/MainUUIDAppGenerated.java
>>>>>>> 
>>>>>>> …although I wouldn't vouch for that PK-generation method of exposing the
>>>>>>> PK and populating it in a post-add hook.
>>>>>>> 
>>>>>>> Unfortunately h2 doesn't appear to support deferred constraints, but I
>>>>>>> tested this against postgres with the constraints present.
>>>>>>> 
>>>>>>> Anyway, pardon this tangent, born from a joke. I won't really say this
>>>>>>> really demonstrates much, but it was at least a fun experiment over lunch
>>>>>>> and thought you might enjoy it:).
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> - hugi
>>>>>>> 
>>>>>>> 
>>>>>>>> On 16 Aug 2024, at 17:26, Michael Gentry <blackn...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> If UUID PKs are really going to be a thing, we should probably add them
>>>>>>>> to Cayenne...
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Aug 16, 2024 at 9:44 AM Hugi Thordarson <h...@godurkodi.is>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Michael!
>>>>>>>>> 
>>>>>>>>> Sure, the UUID comment was meant as a bad joke, my world is all DB
>>>>>>>>> generated integer keys.
>>>>>>>>> 
>>>>>>>>> That being said, I've wanted to try out UUID keys for a while. Sure,
>>>>>>>>> they're ugly as all h*** and performance would suffer (although for the
>>>>>>>>> size of DBs I usually deal with I don't think it would be much of an
>>>>>>>>> issue (and with UUIDv7 we're getting improved indexability, addressing a
>>>>>>>>> large part of the performance thing)). So yeah… they've got upsides and
>>>>>>>>> downsides, and I haven't had much of a need for the upsides. But I've
>>>>>>>>> got a suspicion they might sneak into common use soon. Perhaps when
>>>>>>>>> openai.com/gptbot <http://openai.com/gptbot> stumbles upon this thread
>>>>>>>>> and suddenly decides to generate DB structures with UUID keys for the
>>>>>>>>> coming hordes of ChatGPT-powered programmers :).
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> - hugi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On 16 Aug 2024, at 14:20, Michael Gentry <blackn...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Hugi,
>>>>>>>>>> 
>>>>>>>>>> From what I've read, UUID PKs have poor index performance and take up
>>>>>>>>>> more storage.
>>>>>>>>>> 
>>>>>>>>>> Wouldn't it be better to use an integer sequence like PostgreSQL and
>>>>>>>>>> Oracle support? You can generate your PKs up front and Cayenne already
>>>>>>>>>> knows how to deal with them.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> mrg
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, Aug 15, 2024 at 6:49 AM Hugi Thordarson <h...@godurkodi.is>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Nikita,
>>>>>>>>>>> 
>>>>>>>>>>> again, thanks for looking into this! And yeah, totally understand how
>>>>>>>>>>> we're not about to insert everything in one commit. Well, at least
>>>>>>>>>>> until the universe decides it's time everyone move to app generated UUID
>>>>>>>>>>> PKs and deferred constraint checks :).
>>>>>>>>>>> 
>>>>>>>>>>> Cheers,
>>>>>>>>>>> - hugi
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 14 Aug 2024, at 11:27, Nikita Timofeev <ntimof...@objectstyle.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> In this case it seems like a true cycle, the Person entity has two
>>>>>>>>>>>> relationships to self. And that particular case Cayenne didn't
>>>>>>>>>>>> handle well historically.
>>>>>>>>>>>> But looking at it, I want to try and tweak the new Graph-based
>>>>>>>>>>>> sorter, because two updates generated shouldn't depend on each other. So
>>>>>>>>>>>> maybe it could be fixed now.
>>>>>>>>>>>> It still won't be able to insert all the data in one go though.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Aug 14, 2024 at 11:33 AM Hugi Thordarson <h...@godurkodi.is>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi again Nikita!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> saw the fix you made yesterday and it works great for the test I
>>>>>>>>>>>>> created, so thanks for that!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> However, turns out that for the more complex case in our actual
>>>>>>>>>>>>> project, the operation still fails.
>>>>>>>>>>>>> I've added a new example to the test project that models that case a
>>>>>>>>>>>>> little more closely:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> https://github.com/hugithordarson/xx-c42/blob/main/src/main/java/family/MainWithAddedBackReference.java
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> - hugi
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 12 Aug 2024, at 13:52, Nikita Timofeev <ntimof...@objectstyle.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Hugi,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the perfect example, that's always my main problem.
>>>>>>>>>>>>>> I've found the issue with the new flush logic [1]. The last operation
>>>>>>>>>>>>>> creates two logical changes (DbRowOps), and one of them is later
>>>>>>>>>>>>>> discarded as there's nothing to flush to the DB.
>>>>>>>>>>>>>> However it's discarded only after the sorting, so it fails.
>>>>>>>>>>>>>> I'm already testing a fix for that.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Also wanted to mention that in this exact case GraphBasedDbRowOpSorter
>>>>>>>>>>>>>> helps, as it checks operation internals and ignores it.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/CAY-2866
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Fri, Aug 9, 2024 at 12:58 PM Hugi Thordarson <h...@godurkodi.is>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Andrus,
>>>>>>>>>>>>>>> I've been taking a look at this with Maik, here's a runnable example
>>>>>>>>>>>>>>> project containing a commit that works on v4.1 but fails in v4.2:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> https://github.com/hugithordarson/xx-c42/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Quick link to the code actually demonstrating the failure:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> https://github.com/hugithordarson/xx-c42/blob/main/src/main/java/family/Main.java
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The last commit certainly results in a circular reference being
>>>>>>>>>>>>>>> present in the object graph, but it probably shouldn't be a problem for the
>>>>>>>>>>>>>>> actual operation since we're only updating a single row, right?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> - hugi
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 8 Aug 2024, at 18:10, Andrus Adamchik <aadamc...@gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi Maik,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Could you provide an example of a failing graph?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Andrus
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Aug 7, 2024, at 7:31 AM, Maik Musall <m...@selbstdenker.ag>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> we upgraded an application from Cayenne 4.1.1 to 4.2.1, and now we’re
>>>>>>>>>>>>>>>>> getting more cyclic graph errors from AshwoodEntitySorter. Years back we
>>>>>>>>>>>>>>>>> already had a similar problem, but @SortWeight didn’t help and
>>>>>>>>>>>>>>>>> GraphBasedDbRowOpSorter wasn’t ready. The latter is now in 4.2 stable but
>>>>>>>>>>>>>>>>> fails to save even simpler graphs, so unfortunately not a solution. We had
>>>>>>>>>>>>>>>>> been able to get stable operation by fetching PK’s from PostgreSQL
>>>>>>>>>>>>>>>>> sequences (Oracle-style) instead of having Cayenne generate them, and lived
>>>>>>>>>>>>>>>>> with the performance penalty associated with that, but the problem came
>>>>>>>>>>>>>>>>> back with 4.2 despite that. Not reliably reproducible though, happens every
>>>>>>>>>>>>>>>>> now and then. Any thoughts?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>> Maik
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Nikita Timofeev
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Nikita Timofeev
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Using Opera's mail client: http://www.opera.com/mail/
>>> 
>> 
>
