Re: rdf, xmp

Andy Dingley Mon, 04 Dec 2006 08:06:04 -0800

Imbaud Pierre wrote:

> I have to add access to some XMP data to an existing python
> application.
> XMP is built on RDF,


I'm just looking at the XMP Spec from the Adobe SDK. First impressions
only, as I don't have time to read the whole thing in detail.

This spec doesn't inspire me with confidence as to its accuracy and
consistency. I think I've already seen some obscure conditions where
developers will be unable to unambiguously interpret the spec. Compared
to MPEG-7 however, at least it's not 700 pages long!

The spec does state that property values can be structured, which is
one of the best reasons to start using RDF for storing metadata.
However I think actual use of these would be minimal in "typical" XML
applications. At worst it's a simple data typing exercise of a
two-valued tuple for "dimensions", rather than separate height and
width properties. These are no problem to process.
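
As a rough illustration of how little work such a structured value is,
here is a minimal Python sketch using the standard library's
xml.etree.ElementTree. The "ex" namespace, the property names and the
sample fragment are all made up for the example; they are not taken
from the XMP spec.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/ns/"  # hypothetical namespace, not from the XMP spec

# Made-up fragment: one structured "dimensions" property holding two fields.
SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:ex="{EX}">
  <rdf:Description>
    <ex:dimensions rdf:parseType="Resource">
      <ex:width>640</ex:width>
      <ex:height>480</ex:height>
    </ex:dimensions>
  </rdf:Description>
</rdf:RDF>
"""

def read_dimensions(xml_text):
    """Return (width, height) from the structured property, or None."""
    root = ET.fromstring(xml_text)
    dims = root.find(f".//{{{EX}}}dimensions")
    if dims is None:
        return None
    width = int(dims.findtext(f"{{{EX}}}width"))
    height = int(dims.findtext(f"{{{EX}}}height"))
    return (width, height)

print(read_dimensions(SAMPLE))
```

The structured value comes out as an ordinary two-valued tuple, which
is about as hard as this case gets.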

In particular, the XMP data model is a single-rooted tree, i.e. there
is an external model of "a resource" (i.e. one image file) and an XMP
document only addresses a single "resource" at a time.

A major restriction in XMP is that it has no concept of shared
resources between properties (and it can't, as there's no rdf:ID or
rdf:about allowed). Shared resources are always hard to process, but
they're also very
valuable for doing metadata. Imagine a series of wildlife images that
all refer to a particular safari, national park and species. We might
be able to share a species reference between images easily enough by
referring to a well-known public vocabulary, but it would also be
useful (and concise) to be able to define one "expedition" in a subject
property on one image, then share that same resource to others. As it
is, we'd have to duplicate the full definition. Even in XMP's "separate
document for each image resource" model we still might wish to do
something similar, such as both photographer and director being the
same person.  When you start having 20MB+ of metadata per video
resource (been there, done that!) then this sort of duplication is a
huge problem. Not just because of the data volume, but because we need
to identify that referenced resources are identical, not merely having
the same property values (i.e. I'm the same John Smith, not
just two people with the same name).
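
The difference between "same resource" and "same values" can be shown
with a small Python sketch. The urn:example identifiers and the
dictionary layout are assumptions for illustration only; nothing here
is an XMP or RDF API.

```python
# Each person is defined once and keyed by a (hypothetical) identifier.
people = {
    "urn:example:person:1": {"name": "John Smith"},
    "urn:example:person:2": {"name": "John Smith"},
}

# Properties refer to a person by identifier instead of repeating
# the full definition.
image_a = {
    "photographer": "urn:example:person:1",
    "director": "urn:example:person:1",
}
image_b = {"photographer": "urn:example:person:2"}

def same_resource(ref_a, ref_b):
    """True only if both references name the very same resource."""
    return ref_a == ref_b

def same_values(ref_a, ref_b):
    """True if the two resources merely carry equal property values."""
    return people[ref_a] == people[ref_b]

# One person in two roles: the same resource.
print(same_resource(image_a["photographer"], image_a["director"]))
# Two different people who happen to share a name.
print(same_values(image_a["photographer"], image_b["photographer"]))
print(same_resource(image_a["photographer"], image_b["photographer"]))
```

Without some identifier to compare, only the same_values() test is
available, and that one can't tell the two John Smiths apart.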

There is no visible documentation of vocabularies, internal or
external. Some pre-defined schemas are given that define property sets,
but there's nothing on the values of these, or how to describe that
values are being taken from a particular external vocabulary (you can
do this with RDF, but they don't describe it).  This isn't widely seen
as important, except by people who've already been through large media
annotation projects.

It's RDF-like, not just XML. However it's also a subset of RDF - in
particular rdf:about isn't supported, which removes many of the graph
structure constructs that make RDF such a pain to process with the
basic XML tools. Read their explicit note on which RDF features aren't
supported -- they're enough to make XMP easily processable with XSLT.
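
The same goes for plain tree-walking code, which is what matters for
the Python application in question. Below is a minimal sketch that uses
xml.etree.ElementTree rather than XSLT. The sample fragment is
hand-written for the example, not a real XMP packet, and the function
name is hypothetical. It only handles simple text values and rdf:li
lists, on the assumption that the tree-shaped subset is all we meet.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Hand-written sample fragment, not a real XMP packet.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description>
    <dc:format>image/jpeg</dc:format>
    <dc:subject>
      <rdf:Bag>
        <rdf:li>safari</rdf:li>
        <rdf:li>lion</rdf:li>
      </rdf:Bag>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>"""

def simple_properties(xml_text):
    """Map each property tag to its text, or to a list of rdf:li texts."""
    root = ET.fromstring(xml_text)
    props = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        for prop in desc:
            items = prop.findall(f".//{{{RDF}}}li")
            if items:
                # Container value (Bag, Seq or Alt): collect the items.
                props[prop.tag] = [li.text for li in items]
            else:
                # Simple value: the element's own text.
                props[prop.tag] = (prop.text or "").strip()
    return props

for tag, value in simple_properties(SAMPLE).items():
    print(tag, value)
```

Because nothing refers to anything else in the document, one pass over
the tree is enough; there are no references to resolve afterwards.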

The notes on embedding of XMP in XML and XML in XMP are both simplistic
and ugly.

I still don't see much _point_ in XMP.  I could achieve all this much
with two cups of coffee, RDF and Dublin Core and a whiteboard pen.
Publishing metadata is good, publishing new _ways_ of publishing
metadata is very bad!


Overall, it could be far better, it could be better without being more
complicated, and it's at least 5 years behind industry best practice
for fields like museums and libraries. It's also a field that's still
so alien to media and creative industries that the poor description and
support of XMP will cause them to invent many bad architectures and
data models for a few years to come.

-- 
http://mail.python.org/mailman/listinfo/python-list
