Re: [hibernate-dev] [OGM] storing the column names in the entity keys for K/V stores

Emmanuel Bernard Wed, 26 Nov 2014 08:30:17 -0800

> On 26 Nov 2014, at 15:21, Gunnar Morling <gun...@hibernate.org> wrote:
> 
> 2014-11-26 12:42 GMT+01:00 Sanne Grinovero <sa...@hibernate.org>:
> It looks like you're aiming at a "pure" mapping into primitives for
> the datagrid.
> 
> So it looks very beautiful and tempting to go for a model such as
>   cache.put( "identifier name", ...)
> but it seems quite dangerous to me for the same reason that you store
> (conceptually):
>   {"firstname", "lastname" }, { "Emmanuel", "Bernard" }
> rather than storing:
>   { "Emmanuel", "Bernard" }
> 
> Obviously the second one looks more natural in the storage, but you're
> not really sure what these tokens were supposed to represent in case
> someone decides to refactor the model.
> I understand that it's now quite safe to remove the "tablename" in the
> per-cache-table model, as entries would still be isolated: that was
> the goal, and it also exactly matches the model proven by RDBMSs.
> But there are implications in terms of flexibility and schema
> evolution if we remove the "column names" and generally speaking it's
> our only way of validating what an entry was supposed to model.
> 
> Yes, evolution is a very strong argument indeed for sticking to the current 
> approach. Without the column names (or some other form of descriptor as 
> suggested below) we will not be able to recognize the version of a given key 
> so we cannot apply any "migrations" to it, either upon loading or via some 
> sort of batch run.


Let me challenge that a bit, even though I understand that there is a potential 
problem. Type and id are the invariant part of the data you put in a datastore.
So data migration / morphing happens on the *value* much more than on 
the key itself.
Since the column names would still be stored with the value, you would still 
be able to apply migrations in that case.
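
To make that concrete, here is a rough sketch, not OGM code: the key carries only the entity type and the id value, and a column rename touches the values alone. EntityKey, renameColumn and the plain HashMap standing in for the cache are all hypothetical names made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ValueMigrationSketch {

    // Hypothetical key: only the invariant parts (entity type and id value),
    // no column names.
    static final class EntityKey {
        final String type;
        final Object id;

        EntityKey(String type, Object id) {
            this.type = type;
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof EntityKey)) {
                return false;
            }
            EntityKey other = (EntityKey) o;
            return type.equals(other.type) && id.equals(other.id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, id);
        }
    }

    // Hypothetical migration: rename a column in every stored value.
    // The keys are never rewritten, so lookups keep working afterwards.
    static void renameColumn(Map<EntityKey, Map<String, Object>> cache,
                             String from, String to) {
        for (Map<String, Object> row : cache.values()) {
            if (row.containsKey(from)) {
                row.put(to, row.remove(from));
            }
        }
    }

    public static void main(String[] args) {
        // A plain HashMap stands in for the per-table cache (assumption).
        Map<EntityKey, Map<String, Object>> persons = new HashMap<>();
        Map<String, Object> row = new HashMap<>();
        row.put("firstname", "Emmanuel");
        row.put("lastname", "Bernard");
        persons.put(new EntityKey("Person", 1L), row);

        renameColumn(persons, "lastname", "surname");
        System.out.println(persons.get(new EntityKey("Person", 1L)));
    }
}
```

This only covers the case where the id column itself is not renamed; if it were, the key would carry no trace of the old name, which is the risk raised above.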

>  
> 
> Speaking of which, just as we don't normally store the "tablename" in a
> column of a table in an RDBMS, we don't really store its column names either.
> So an alternative solution which more closely matches the proven RDBMS
> model would be to store the schema representation of the table in the
> Cache:
> 
> personsCache.put( SchemaGenerationId{1}, { ORDERED_ARRAY_STRATEGY,
> "firstname", "lastname" } );
> 
> then you would need to store entries linking them to a specific
> Schema, such as { "Emmanuel", "Bernard", SchemaGenerationId{1} }.
> 
> such a SchemaGenerationId would be a cheap singleton (one per
> "table"), and could be stored as efficiently as two integers (one for
> the Marshaller id and one int for the schema generation id).
> 
> ORDERED_ARRAY_STRATEGY could be an Enum, and give you some flexibility
> among your proposals.  With the current model I'd stick to Maps, as
> they are the only option that is safe enough, but with a schema definition
> like the one described above I'd definitely want to use the ordered sequence
> (array?) as it's far more efficient at all levels.
> A benefit is that I suspect that you could then transactionally evolve
> the schema, and it wouldn't be too hard for us to provide a tool to
> perform an "online schema migration".
> 
> That's an interesting idea. Or having a separate KeyDescriptor cache which 
> holds an entry for each key type? Mixing the key definition and records using 
> it within one cache seems a bit odd to me.

It is interesting. But are we in the database business?
If we are interested in this approach, maybe we should create a side project 
that offers a schema layer atop the most common k/v stores?
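
For the sake of discussion, the schema-descriptor idea quoted above could look roughly like the sketch below, using the separate descriptor cache variant. Everything here is an assumption for illustration: schemaCache, Row and decode are hypothetical names, the generation id is reduced to a plain int, and HashMap stands in for the real cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaDescriptorSketch {

    // Hypothetical descriptor cache: schema generation id -> ordered column names.
    static final Map<Integer, List<String>> schemaCache = new HashMap<>();

    // Hypothetical stored entry: positional values plus the schema generation
    // under which they were written.
    static final class Row {
        final int schemaGeneration;
        final Object[] values;

        Row(int schemaGeneration, Object... values) {
            this.schemaGeneration = schemaGeneration;
            this.values = values;
        }
    }

    // Resolve positional values back to named columns through the descriptor.
    static Map<String, Object> decode(Row row) {
        List<String> columns = schemaCache.get(row.schemaGeneration);
        if (columns == null || columns.size() != row.values.length) {
            throw new IllegalStateException(
                    "No matching schema for generation " + row.schemaGeneration);
        }
        Map<String, Object> named = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            named.put(columns.get(i), row.values[i]);
        }
        return named;
    }

    public static void main(String[] args) {
        schemaCache.put(1, Arrays.asList("firstname", "lastname"));
        Row row = new Row(1, "Emmanuel", "Bernard");
        System.out.println(decode(row));
    }
}
```

An entry written under an older generation can still be decoded as long as its descriptor is kept, which is what would make a migration tool possible.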
_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
