[GENERAL] DB design: How to store object properties?

Maxim Khitrov Sun, 17 Feb 2008 13:52:39 -0800

Greetings everyone,

I'm fairly new to PostgreSQL, but I'm currently doing some research on
tools to be used for an upcoming project and could really use your
help with a possible database design.


So the scenario is this. We have two projects starting that will deal
heavily with mapping spatial regions. One of the reasons I'm looking
at PostgreSQL is the PostGIS extension that may help us in dealing
with all the geometry. The first project will be mapping the outside
environment. Think along the lines of Google Maps, but intended for
use by pedestrians rather than cars. The other project, one that I'll
be managing, will be doing the same thing but with interiors of
buildings. For now, the two projects will be separate, but eventually
the plan is to merge them and actually allow the system to plan paths
that incorporate outside and interior maps.

That's the project description in a nutshell. What I'm doing now is
trying to figure out the best way to store all the spatial
information. I want to have a foundational layer that is as simple as
possible, and that can later be extended for any special needs.

The current plan is to do all the mapping via a graph structure using
nodes/vertices and edges. A node will be defined simply in terms of a
GUID (UUID in PostgreSQL). Each node will have a unique id, but no
other "mandatory" information. An edge is defined by the ids of its
two end points. This setup will be common to both projects, and the
use of UUIDs will ensure that any database merges in the future will
not result in conflicts.

Now here's the tricky bit. We have to be able to associate an
arbitrary number of properties with each node and edge. So an example
of a node property would be that node's position defined by a
latitude/longitude pair. We will define all the possible properties
between our two projects, but just because a property is valid,
doesn't mean that it will be present in or be relevant to every node
in the system. For instance, nodes located inside of a building will
likely have their position defined by something other than latitude
and longitude, since capturing that data with a GPS will not be
possible.

The simplest design would be to create two tables, one for nodes
another for edges, and create a column for every possible property.
This, however, is huge waste of space, since there will not be a
single node or edge that will make use of all the defined properties.
There may be hundreds of properties, but each node may use ten on
average. That's the question - how do you represent this information
in the database in a space-efficient manner. We still have to be able
to search for specific nodes given their properties. For example –
find all the nodes located within radius z of coordinate x,y. PostGIS
will handle the special bit, but there has to be an efficient way of
accessing this information/property for each node that has it.

One other possibility I thought of was defining a master node table
that would contain just the node ids. Then for each property I would
create a separate table with the id column referencing the master.
Here we're still wasting space for the extra storage of UUIDs, and if
I want to run a query that selects nodes with several different
properties, it would require a long list of JOINs.

I looked at table inheritance that PostgreSQL offers, but it turned
out to be something different from what I was hoping for. What I need
is some sort of data inheritance based on a common unique key, but I
don't think that's possible. Same thing with partitioning. If there
was a way to partition the columns (with data) of a single large
table, then that would be an ideal solution. But in the absence of
that feature, what design would you recommend?

Thanks,
Max

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

[GENERAL] DB design: How to store object properties?

Reply via email to