On Jul 15, 2005, at 12:06 AM, Andy Jefferson wrote:
Is it possible to default the value of the <field> serialized
attribute to "true" for embedded non-PC types? That would immediately
solve our current Collection, Set, and Map issues, without having
extra stuff in the metadata file.
Hi Craig,
Having gone back to the original problem report, I'll give my interpretation
of what the various embedded attributes mean (and what I've implemented),
and you can tell me where I've gone wrong :-)
I'm assuming here that you are putting both the jdo and orm metadata into one file. Splitting the information into multiple files is possible, where the orm metadata overrides the jdo metadata.
Example 1 : Collection of BigDecimal
1. Basic collection
<field name="myfield">
<collection element-type="java.math.BigDecimal"/>
<join/>
</field>
This creates 2 tables - 1 for the class owning "myfield", and 1 join table to
contain the elements. If <join> is omitted then an error should be thrown
(though I'm not sure if JPOX currently flags this up).
The join element has no defaults, so this is not sufficient to describe the mapping. You need at least a column attribute naming the join column. And you need to name the column in the join table to map the BigDecimal values to. So,
<field name="myfield" column="VALUES" table="MYFIELD_TABLE">
<collection element-type="java.math.BigDecimal"/>
<join column="JOIN_COLUMN"/>
</field>
2. Serialised collection
<field name="myfield" serialized="true">
<collection element-type="java.math.BigDecimal"/>
<join/>
</field>
This creates 1 table, with a BLOB column for "myfield" containing all
elements.
With just one table, you don't need the join element at all. But you do need the column,
<field name="myfield" serialized="true" column="MYFIELD_BLOB">
<collection element-type="java.math.BigDecimal"/>
</field>
or, equivalently,
<field name="myfield" column="MYFIELD_BLOB">
<collection element-type="java.math.BigDecimal" embedded-element="true" />
</field>
embedded-element has no effect with this example because the element
(BigDecimal) is already embedded (in the join table), and has no way of not
being embedded.
I'd say that serialized implies embedded-element (and vice-versa, which is why I'm now questioning the value of serialized as an attribute).
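To make the serialised case concrete, here is a minimal Java sketch of what storing the whole collection in one BLOB column amounts to. It assumes plain java.io serialisation; JPOX may do this differently, and BlobSketch, toBlob, and fromBlob are hypothetical names, not JPOX or JDO API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper, for illustration only (not JPOX code).
final class BlobSketch {
    private BlobSketch() {}

    /** Serialises the whole collection into one byte array, the kind of value a BLOB column would hold. */
    static byte[] toBlob(Collection<BigDecimal> values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Copy into an ArrayList so the payload is Serializable whatever Collection was passed in.
            out.writeObject(new ArrayList<>(values));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the collection back from the bytes produced by toBlob. */
    @SuppressWarnings("unchecked")
    static ArrayList<BigDecimal> fromBlob(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return (ArrayList<BigDecimal>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = toBlob(Arrays.asList(new BigDecimal("1.50"), new BigDecimal("2.25")));
        System.out.println(blob.length + " bytes -> " + fromBlob(blob));
    }
}
```

The datastore sees only the opaque bytes, so individual elements cannot be addressed by column in this layout.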
================================
Example 2 : Collection of PC
1. Basic collection with join
<field name="myfield">
<collection element-type="MyElement"/>
<join/>
</field>
This creates 3 tables - 1 for the class owning "myfield", 1 join table and 1
for the element class.
Again, you need to further define the table and columns used for the join.
<field name="myfield" table="JOIN_TABLE" column="FK_TO_OTHER">
<collection element-type="MyElement"/>
<join column="JOIN_COLUMN" />
</field>
2. Basic collection with FK
<field name="myfield">
<collection element-type="MyElement"/>
</field>
This creates 3 tables - 1 for the class owning "myfield", 1 for the element
class (with FK back to the owner table).
Did you mean 2 tables? (I think so.) The two tables are related by a FK on the MyElement side.
<field name="myfield" table="MY_ELEMENT_TABLE" column="MY_FIELD_FK">
<collection element-type="MyElement"/>
</field>
3. Embedded element
<field name="myfield">
<collection element-type="MyElement" embedded-element="true"/>
<join/>
</field>
This creates 2 tables - 1 for the class owning "myfield", and 1 join table
containing the elements (columns aligned with the fields of the PC element).
Since it's embedded, I think there is only one table that contains all the fields in the class, including the Collection of MyElement.
You can't map the columns of an embedded Collection of PC elements because you would need one column for each field in each PC, which is a variable number of columns. And tables have a fixed number of columns. So the mapping has to either serialize the Collection and store it into a BLOB column or use another table. For embedded collection,
<field name="myfield" column="MYFIELD_BLOB_COLUMN">
<collection element-type="MyElement" embedded-element="true"/>
</field>
Here, I'd even say that if there is a column given for a collection, we might default to embedded-element="true" because that's the only way to embed a collection.
4. Embedded element (fully specified)
<field name="myfield">
<collection element-type="MyElement" embedded-element="true"/>
<join/>
<element>
<embedded>
... (full spec of field mappings)
</embedded>
</element>
</field>
This creates 2 tables - 1 for the class owning "myfield", and 1 join table
containing the (embedded) elements (columns aligned with the mappings
specified in the <embedded> section above).
This example doesn't make sense to me because it's no longer embedded if the values are in a different table. And as above, you can't map a variable number of columns (required for a collection) to a fixed number of columns (in the table).
5. Embedded collection
<field name="myfield" embedded="true">
<collection element-type="MyElement"/>
</field>
No idea what is expected here .. maybe the same as "serialized" below ?
Embedded refers to the collection itself, not to the elements. So there is no explicit mapping of the collection, just the elements of the collection. This is the same as number 2, and I believe the mapping is incomplete.
6. Serialised collection
<field name="myfield" serialized="true">
<collection element-type="MyElement"/>
</field>
This creates 1 table, with 1 BLOB column for "myfield" to include the
serialised collection.
I think that serialized="true" has the same semantics as embedded-element="true". And the mapping is incomplete.
================================
Example 3 : PC with PC field
1. Basic 1-1
<field name="myfield" persistence-modifier="persistent"/>
This creates 2 tables, one for each class, with a FK
Right, and the FK field has to be named for a complete mapping.
<field name="myfield" persistence-modifier="persistent" column="OTHER_SIDE_FK"/>
2. Embedded PC
<field name="myfield" persistence-modifier="persistent" embedded="true">
</field>
This creates 1 table, with all fields of the PC having their own columns in
the table of the owner PC. Not actually implemented in JPOX (though it will
be fast to do)
I don't see the difference between this and number 3. You can default the values in a non-standard way if you like.
3. Embedded PC (fully specified)
<field name="myfield" persistence-modifier="persistent">
<embedded>
.... (mappings for other PC class)
</embedded>
</field>
This creates 1 table, with the fields of the other PC having their own columns
in the table of the owner PC
Right. The embedded element contained in the field element implies that the field is embedded in the same table.
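As an illustration of the embedded 1-1 case, the following Java sketch flattens an owner and its embedded PC field into a single row. All names here are assumptions made for the example: the Owner and Address classes, their fields, and the MYFIELD_ column prefix are hypothetical and do not come from JPOX or the spec.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an embedded PC field sharing the owner's table (not JPOX code).
final class EmbeddedRowSketch {
    private EmbeddedRowSketch() {}

    /** Hypothetical embedded PC class with two persistent fields. */
    static final class Address {
        final String street;
        final String city;

        Address(String street, String city) {
            this.street = street;
            this.city = city;
        }
    }

    /** Hypothetical owner PC class whose "myfield" is embedded. */
    static final class Owner {
        final long id;
        final Address myfield;

        Owner(long id, Address myfield) {
            this.id = id;
            this.myfield = myfield;
        }
    }

    /** Builds one owner-table row: the owner's own column plus one column per embedded field. */
    static Map<String, Object> toRow(Owner owner) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", owner.id);
        // Column names for the embedded fields are assumed; a real mapping names them in <embedded>.
        row.put("MYFIELD_STREET", owner.myfield.street);
        row.put("MYFIELD_CITY", owner.myfield.city);
        return row;
    }

    public static void main(String[] args) {
        System.out.println(toRow(new Owner(1L, new Address("Main St", "Springfield"))));
    }
}
```

This works for a single embedded PC because the number of columns is fixed by the class; it is the collection case that has no fixed column count.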
4. Serialised PC
<field name="myfield" persistence-modifier="persistent" serialized="true"/>
Not yet implemented. I'd assume we just dump the whole of the PC into a BLOB
column, but don't see much use for this :-)
Yes, I don't expect an issue. Not having individual fields mapped to columns might be a performance advantage but it's otherwise not too useful IMHO.
================================
There are complete details of all of these in the JPOX online docs, showing
the classes, the MetaData, and the resultant DB schema.
If there's some misunderstanding above of what the spec wants, I'd rather know
it now :-)
<spec>
The embedded attribute specifies whether the field should be stored
as part of the containing instance instead of as its own instance in
the datastore. It must be specified or default to "true" for fields
of primitive types, wrappers, java.lang, java.math, java.util,
collection, map, and array types specified above; and "false"
otherwise.
</spec>
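Read literally, the quoted default rule can be sketched in Java as below. This is a rough approximation only: the spec text refers to the specific supported types "specified above", which are not reproduced in this thread, so the sketch assumes a simple package check instead. EmbeddedDefault and defaultEmbedded are hypothetical names.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;

// Hypothetical approximation of the quoted default rule (not spec or JPOX code).
final class EmbeddedDefault {
    private EmbeddedDefault() {}

    /** Returns the assumed default of the embedded attribute for a field of the given type. */
    static boolean defaultEmbedded(Class<?> type) {
        // Primitives and arrays default to embedded.
        if (type.isPrimitive() || type.isArray()) {
            return true;
        }
        // Collection and map types default to embedded.
        if (Collection.class.isAssignableFrom(type) || Map.class.isAssignableFrom(type)) {
            return true;
        }
        // Assumption: treat every class directly in java.lang, java.math, or java.util as covered,
        // which is broader than the spec's list of supported types.
        String name = type.getName();
        int dot = name.lastIndexOf('.');
        String pkg = dot < 0 ? "" : name.substring(0, dot);
        return pkg.equals("java.lang") || pkg.equals("java.math") || pkg.equals("java.util");
    }

    public static void main(String[] args) {
        System.out.println("BigDecimal: " + defaultEmbedded(BigDecimal.class));
        System.out.println("ArrayList: " + defaultEmbedded(ArrayList.class));
        System.out.println("user class: " + defaultEmbedded(EmbeddedDefault.class));
    }
}
```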
I'm ok with this statement for all types *except* Collections/Maps.
Collections/Maps have methods therefore I can't see why they should be
"embedded" by default - they have methods (that we can mimic in JDOQL)
therefore we want to query them. Arrays don't have methods (that we can mimic
in JDOQL) therefore they can be embedded (in a BLOB column). Really depends
on what "embedded" means at <field> level ... is it serialised or something
else ?
This is a continuing source of misunderstanding that I'd like to get clarified, but it's hard.
In object databases, object-relational databases, and relational databases with SQL99 extensions, Collections, Maps, and Arrays are complex types that can have their own identity in the database. For example a Collection is stored as a Collection type with an object identity and references to other objects, whether wrapper types or user-defined types.
In pure relational databases, Collections, Maps, and Arrays are not stored directly but are only in-memory representations of the structures in the database. In JDO we consider these in-memory structures to be "embedded" because there's just no place to put them in the database. But the elements, keys, and values can be stored in the database. Things like Integer, Float, etc. have columns that store the values directly. Things like CartItem have multiple columns to represent them.
The only way to store collections of Integer or CartItem in the same table as the other fields is to serialize them. If they are stored in a join table, they don't need to be serialized. Each of the elements can be stored as a row in a join table, because you can have multiple rows in the join table that have a foreign key back to the row in the primary table where the primitive fields are stored.
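The join-table alternative described above can be sketched in Java as follows: each element becomes its own row carrying a foreign key back to the owner row. The JoinRow class and toJoinRows method are hypothetical names for illustration; the column names in the comments echo the JOIN_COLUMN and VALUES names used in Example 1.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of element-per-row storage in a join table (not JPOX code).
final class JoinTableSketch {
    private JoinTableSketch() {}

    /** One row of the join table: a foreign key to the owner row plus one element value. */
    static final class JoinRow {
        final long ownerId;        // would map to JOIN_COLUMN
        final BigDecimal element;  // would map to the VALUES column

        JoinRow(long ownerId, BigDecimal element) {
            this.ownerId = ownerId;
            this.element = element;
        }

        @Override
        public String toString() {
            return "(" + ownerId + ", " + element + ")";
        }
    }

    /** Produces one join-table row per element, so the row count varies, not the column count. */
    static List<JoinRow> toJoinRows(long ownerId, Collection<BigDecimal> elements) {
        List<JoinRow> rows = new ArrayList<>();
        for (BigDecimal element : elements) {
            rows.add(new JoinRow(ownerId, element));
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(toJoinRows(7L, Arrays.asList(
                new BigDecimal("1.50"), new BigDecimal("2.25"), new BigDecimal("3.00"))));
    }
}
```

Because the collection size shows up as a number of rows rather than a number of columns, no serialisation is needed and each element stays individually addressable.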
It's been on my to-do list for the specification for a while to add mapping for arrays, lists, sets, and maps to Chapter 15. This might be the time to actually do it.