2014-11-26 17:28 GMT+01:00 Emmanuel Bernard <emman...@hibernate.org>:
> On 26 Nov 2014, at 15:21, Gunnar Morling <gun...@hibernate.org> wrote:
>
> > -11-26 12:42 GMT+01:00 Sanne Grinovero <sa...@hibernate.org>:
> >
> > > It looks like you're aiming at a "pure" mapping into primitives for
> > > the datagrid.
> > >
> > > So it looks very beautiful and tempting to go for a model such as
> > >
> > > cache.put( "identifier name", ...)
> > >
> > > but it seems quite dangerous to me for the same reason that you store
> > > (conceptually):
> > >
> > > {"firstname", "lastname" }, { "Emmanuel", "Bernard" }
> > >
> > > rather than storing:
> > >
> > > { "Emmanuel", "Bernard" }
> > >
> > > Obviously the second one looks more natural in the storage, but you're
> > > not really sure what these tokens were supposed to represent in case
> > > someone decides to refactor the model.
> > > I understand that it's now quite safe to remove the "tablename" in the
> > > per-cache-table model, as entries would still be isolated: that was
> > > the goal, but also it matches exactly the model proven by the RDBMs
> > > model.
> > > But there are implications in terms of flexibility and schema
> > > evolution if we remove the "column names" and generally speaking it's
> > > our only way of validating what an entry was supposed to model.
> >
> > Yes, evolution is a very strong argument indeed for sticking to the
> > current approach. Without the column names (or some other form of
> > descriptor as suggested below) we will not be able to recognize the
> > version of a given key so we cannot apply any "migrations" to it,
> > either upon loading or via some sort of batch run.
>
> Let me challenge that a bit even if I understand that there is a
> potential problem. type and id are the invariable part of the data you
> put in a datastore.
> So the data migration / morphing does happen on the *value* much more
> than on the key itself.
> You would be able to apply migrations in that case.
>

True, the need for evolution will be higher for the values, but can we
really completely rule it out for keys in stores without a fixed schema?
It seems to be a restriction we'd apply, whereas a user otherwise would
be free to e.g. add a column to the key.

> > > Speaking of, like we don't normally store the "tablename" in a column
> > > of a table in an RDBMs, we don't really store its column names either.
> > > So an alternative solution which more closely matches the proven RDBMs
> > > model would be to store the schema representation of the table in the
> > > Cache:
> > >
> > > personsCache.put( SchemaGenerationId{1}, { ORDERED_ARRAY_STRATEGY,
> > > "firstname", "lastname" } );
> > >
> > > then you would need to store entries linking them to a specific
> > > Schema, such as { "Emmanuel", "Bernard", SchemaGenerationId{1} }.
> > >
> > > such a SchemaGenerationId would be a cheap singleton (one per
> > > "table"), and could be stored as efficiently as two integers (one for
> > > the Marshaller id and one int for the schema generation id).
> > >
> > > ORDERED_ARRAY_STRATEGY could be an Enum, and give you some flexibility
> > > among your proposals. With the current model I'd stick to the Map as
> > > they are the only one safe enough, but with a schema definition like
> > > the above description I'd definitely want to use the ordered sequence
> > > (array?) as it's far more efficient at all levels.
> > > A benefit is that I suspect that you could then transactionally evolve
> > > the schema, and it wouldn't be too hard for us to provide a tool to
> > > perform an "online schema migration".
> >
> > That's an interesting idea. Or having a separate KeyDescriptor cache
> > which holds an entry for each key type? Mixing the key definition and
> > records using it within one cache seems a bit odd to me.
>
> It is interesting. But are we in the database business?
> If we are interested in this approach, maybe we should create a side
> project that offers schema atop the most common k/v?
> It's a grey area.
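
To make the SchemaGenerationId idea quoted above a bit more concrete,
here is a rough Java sketch. It is only an illustration under some
assumptions: plain Maps stand in for the caches, the schema descriptors
live in their own map (closer to the separate KeyDescriptor cache
variant than to mixing them with the records), and all names
(SchemaGenerationSketch, StoredRow, registerSchema, read) are invented
for this mail. None of this is existing OGM or Infinispan API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaGenerationSketch {

    // Hypothetical: column names per schema generation, in storage order.
    private final Map<Integer, List<String>> schemas = new HashMap<>();

    // Hypothetical: record id -> stored row. A plain Map stands in for the cache.
    private final Map<String, StoredRow> persons = new HashMap<>();

    /** Bare values plus the schema generation they were written with. */
    public static final class StoredRow {
        final int schemaGenerationId;
        final Object[] values;

        public StoredRow(int schemaGenerationId, Object... values) {
            this.schemaGenerationId = schemaGenerationId;
            this.values = values;
        }
    }

    public void registerSchema(int generationId, String... columns) {
        schemas.put(generationId, Arrays.asList(columns));
    }

    public void put(String id, StoredRow row) {
        persons.put(id, row);
    }

    /** Rebuilds column -> value using the generation the row refers to. */
    public Map<String, Object> read(String id) {
        StoredRow row = persons.get(id);
        if (row == null) {
            return null;
        }
        List<String> columns = schemas.get(row.schemaGenerationId);
        if (columns == null || columns.size() != row.values.length) {
            throw new IllegalStateException(
                    "No matching schema for generation " + row.schemaGenerationId);
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            result.put(columns.get(i), row.values[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        SchemaGenerationSketch store = new SchemaGenerationSketch();
        store.registerSchema(1, "firstname", "lastname");
        store.registerSchema(2, "lastname", "firstname"); // after a refactoring
        store.put("p1", new StoredRow(1, "Emmanuel", "Bernard"));
        store.put("p2", new StoredRow(2, "Doe", "Jane"));
        System.out.println(store.read("p1")); // {firstname=Emmanuel, lastname=Bernard}
        System.out.println(store.read("p2")); // {lastname=Doe, firstname=Jane}
    }
}
```

Rows written under different generations stay readable side by side,
which is what an "online schema migration" tool would presumably rely
on: it could rewrite generation 1 rows to generation 2 one at a time.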
It'd basically be a way to describe the "schema" for each single record
in a more efficient manner. It'd not be a schema description per
table/cache. I guess that's one of the general issues of K/V stores
which don't know much about the data; a document store at least knows
the syntactical structure and could store field names via references to
a shared constant pool rather than persisting them within each document.

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev