On 4/29/13 2:20 PM, Florian Pflug wrote:
On Apr 29, 2013, at 21:00, Atri Sharma <atri.j...@gmail.com> wrote:
I think we find workarounds or makeshift solutions at the moment if we
need to use graphs in our database in postgres. If we had a datatype
itself, with support for commonly used operations built into the
type, that would greatly simplify users' tasks, and open up a
whole new avenue of applications for us, such as recommender systems,
social network analysis, or anything that can be done with graphs.
Usually though, you'd be interested in large graphs which include
information for lots of records (e.g., nodes are individual users,
or products, or whatever). A graph datatype is not well suited for
that, because it'd store each graph as a single value, and updating
the graph would mean rewriting that whole value. If you're e.g. doing
social network analysis, and each new edge between two users requires
you to pull the whole graph from disk, update it, and write it back,
you'll probably hit problems once you reach a few hundred users or
so… Which really isn't a lot for that kind of application.
I'd love to see more support for those kinds of queries in postgres
(although WITH RECURSIVE already was a *huge* improvement in this
area!). But storing each graph as a single value, as a graph type
would, isn't the way forward, IMHO.
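A minimal sketch of the alternative implied above: edges stored as ordinary rows and traversed with WITH RECURSIVE, so that adding an edge is a single-row insert rather than a rewrite of one large value. The table name `edges`, its columns, and the helper `reachable_from` are hypothetical, and SQLite (via Python's standard library) stands in for postgres only so the example is self-contained; in postgres the same query shape applies, with the driver's own parameter placeholder.

```python
import sqlite3

# In-memory database; one row per directed edge (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges ("
    " src INTEGER NOT NULL,"
    " dst INTEGER NOT NULL,"
    " PRIMARY KEY (src, dst))"
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 1), (3, 4), (5, 6)],
)

# UNION (not UNION ALL) discards rows already produced, so the
# recursion terminates even though 1 -> 2 -> 3 -> 1 is a cycle.
REACHABLE_SQL = """
WITH RECURSIVE reachable(node) AS (
    SELECT ?
    UNION
    SELECT e.dst
    FROM edges AS e
    JOIN reachable AS r ON e.src = r.node
)
SELECT node FROM reachable ORDER BY node
"""


def reachable_from(connection, start):
    """Return every node reachable from `start`, including `start` itself."""
    return [row[0] for row in connection.execute(REACHABLE_SQL, (start,))]


before = reachable_from(conn, 1)

# Adding an edge touches one row; nothing else is read or rewritten.
conn.execute("INSERT INTO edges VALUES (?, ?)", (4, 5))
after = reachable_from(conn, 1)

print(before)
print(after)
```

The cost of this layout is the per-row overhead for every edge, which is the concern raised in the reply that follows.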
My $0.02:
I believe it would be best to largely separate the questions of storage and
access. Partly because of Florian's concern that you'd frequently want only one
representation of the whole graph, but also because the actual storage
interface does NOT have to be user friendly if we have a good access layer. In
particular, if rows had a low overhead, we'd probably just store graphs that
way. That's obviously not the case in PG, so is there some kind of hybrid
approach we could use? Perhaps sections of a graph could be stored with one
piece of MVCC overhead per section?
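To make the hybrid idea a bit more concrete, here is a rough sketch, not a proposal for an on-disk format: adjacency lists are grouped into sections, and each section is serialized as one value, standing in for one row carrying one piece of per-row overhead. The class `SectionedGraph`, the modulo assignment of nodes to sections, and JSON as the encoding are all assumptions made for illustration. With `num_sections=1` it degenerates into the single-value graph type discussed earlier.

```python
import json


class SectionedGraph:
    """Directed graph whose adjacency lists are stored in serialized sections."""

    def __init__(self, num_sections=16):
        self.num_sections = num_sections
        self.sections = {}      # section id -> serialized JSON blob
        self.bytes_written = 0  # total size of all blobs ever rewritten

    def _section_of(self, node):
        # Hypothetical placement rule; a real design would likely try to
        # keep closely connected nodes in the same section.
        return node % self.num_sections

    def _load(self, section_id):
        blob = self.sections.get(section_id)
        return json.loads(blob) if blob is not None else {}

    def _store(self, section_id, adjacency):
        blob = json.dumps(adjacency, sort_keys=True)
        self.sections[section_id] = blob
        self.bytes_written += len(blob)

    def add_edge(self, src, dst):
        section_id = self._section_of(src)
        adjacency = self._load(section_id)
        targets = adjacency.setdefault(str(src), [])  # JSON keys are strings
        if dst in targets:
            return  # edge already present, nothing to rewrite
        targets.append(dst)
        # Only the section owning `src` is rewritten, not the whole graph.
        self._store(section_id, adjacency)

    def neighbors(self, node):
        return self._load(self._section_of(node)).get(str(node), [])


def build_ring(graph, n):
    """Insert the edges i -> (i + 1) mod n one at a time."""
    for i in range(n):
        graph.add_edge(i, (i + 1) % n)
    return graph


whole = build_ring(SectionedGraph(num_sections=1), 200)
sectioned = build_ring(SectionedGraph(num_sections=16), 200)
print(whole.bytes_written, sectioned.bytes_written)
```

Comparing the two `bytes_written` counters shows the trade-off: fewer sections mean less per-section overhead but more data rewritten per edge, and more sections mean the opposite.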
That's why I think separating access from storage is going to be very
important; if we do that up-front, we can change the storage later as we get
real experience with this.
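One way to picture that separation, as a sketch only: the access layer is written against a tiny storage interface, so the storage behind it can be swapped without touching queries. The names `GraphStore`, `EdgeRowStore`, `AdjacencyStore` and `within_hops` are hypothetical and exist only for this illustration; nothing here reflects an actual postgres API.

```python
from collections import deque
from typing import Iterable, Protocol, Set


class GraphStore(Protocol):
    """Storage interface: deliberately minimal, not user friendly."""

    def add_edge(self, src: int, dst: int) -> None: ...

    def neighbors(self, node: int) -> Iterable[int]: ...


class EdgeRowStore:
    """One (src, dst) pair per edge, like one row per edge."""

    def __init__(self) -> None:
        self.rows = set()

    def add_edge(self, src: int, dst: int) -> None:
        self.rows.add((src, dst))

    def neighbors(self, node: int) -> Iterable[int]:
        return [dst for src, dst in self.rows if src == node]


class AdjacencyStore:
    """One adjacency set per node, like one value per node."""

    def __init__(self) -> None:
        self.adjacency = {}

    def add_edge(self, src: int, dst: int) -> None:
        self.adjacency.setdefault(src, set()).add(dst)

    def neighbors(self, node: int) -> Iterable[int]:
        return self.adjacency.get(node, ())


def within_hops(store: GraphStore, start: int, max_hops: int) -> Set[int]:
    """Access layer: breadth-first search that only calls neighbors()."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for nxt in store.neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


stores = [EdgeRowStore(), AdjacencyStore()]
for store in stores:
    for src, dst in [(1, 2), (2, 3), (3, 4)]:
        store.add_edge(src, dst)

results = [within_hops(store, 1, 2) for store in stores]
print(results)
```

Both stores give the same answer to the same query, which is the property that would let the storage change later without breaking users of the access layer.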
Second, we should consider how much the access layer should build on WITH
RECURSIVE and the like. Being able to detect specific use patterns of CTE/WITH
RECURSIVE seems like it could add a lot of value; but I also worry that it's
way too magical to be practical.
--
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)