Hi all,

The last major bit of the wfs 2 work is joins. I wanted to start some
discussion here and post some questions with regard to the work.

So with the wfs protocol you can now do queries that look like this:

<wfs:Query typeNames="myns:Person myns:Person" aliases="a b">
 <fes:Filter>
  <fes:And>
    <fes:PropertyIsEqualTo>
      <fes:ValueReference>a/Identifier</fes:ValueReference>
      <fes:Literal>12345</fes:Literal>
    </fes:PropertyIsEqualTo>
    <fes:PropertyIsEqualTo>
      <fes:ValueReference>a/spouse</fes:ValueReference>
      <fes:ValueReference>b/Identifier</fes:ValueReference>
    </fes:PropertyIsEqualTo>
  </fes:And>
 </fes:Filter>
</wfs:Query>

The result is a feature collection that contains not plain feature members,
but tuples of feature members. Something like:

<wfs:member>
  <wfs:Tuple>
    <wfs:member>
      <ns1:FeatureTypeOne>...</ns1:FeatureTypeOne>
    </wfs:member>
    <wfs:member>
      <ns1:FeatureTypeTwo>...</ns1:FeatureTypeTwo>
    </wfs:member>
  </wfs:Tuple>
</wfs:member>

With that providing a bit of context I would like to bring up some points of
discussion.

* app-schema vs simple features

Knowing zero about app-schema, I believe it currently has the ability to do
joins via feature chaining. However, my impression is that these
relationships are configured beforehand and not really created on the fly.
Correct me if I am wrong.

So perhaps we could just say that we support joins with app-schema and call
it a day. However that said I do think there is a case for supporting joins
with simple features as well. And to be honest working with app-schema,
because of the learning curve, would be out of scope for this project.

* cross datastore joins

When talking about doing joins there are varying levels of complexity. For
instance, supporting joins of feature types within a jdbc datastore is one
thing. Supporting joining, say, a shapefile feature type to a jdbc feature
type is a totally different ball of wax. Doing cross datastore joins is
something I think would be neat... but it is far from trivial to do in a way
that scales. A much simpler problem would be joining two feature types within
the same datastore. Even then, unless the datastore is one that can do joins
natively (jdbc is really the only one here), it is still a hard problem. For
instance, consider attempting to join two Shapefile feature types from the
same datastore... doable, but again difficult to do in a non-naive way.
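
To make the naive vs non-naive distinction concrete, here is a rough sketch of
an in-memory equality join. Nothing in it is GeoTools API: features are
modeled as plain attribute maps, and InMemoryJoinSketch, Pair, nestedLoopJoin,
hashJoin and person are hypothetical names. The nested loop compares every
pair of features; the hash variant indexes one side first, which is faster but
has to hold that whole side in memory, and that is where scaling gets hard.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: features are plain attribute maps, not GeoTools features.
class InMemoryJoinSketch {

    /** One joined row: a feature from each side of the join. */
    static class Pair {
        final Map<String, Object> left;
        final Map<String, Object> right;

        Pair(Map<String, Object> left, Map<String, Object> right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Naive join: compares every left feature with every right feature, O(n * m). */
    static List<Pair> nestedLoopJoin(List<Map<String, Object>> left, String leftAttr,
                                     List<Map<String, Object>> right, String rightAttr) {
        List<Pair> result = new ArrayList<>();
        for (Map<String, Object> l : left) {
            Object key = l.get(leftAttr);
            if (key == null) {
                continue; // a null value never matches, as in SQL
            }
            for (Map<String, Object> r : right) {
                if (key.equals(r.get(rightAttr))) {
                    result.add(new Pair(l, r));
                }
            }
        }
        return result;
    }

    /** Hash join: indexes the right side once, then probes it for each left feature. */
    static List<Pair> hashJoin(List<Map<String, Object>> left, String leftAttr,
                               List<Map<String, Object>> right, String rightAttr) {
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> r : right) {
            Object key = r.get(rightAttr);
            if (key != null) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(r);
            }
        }
        List<Pair> result = new ArrayList<>();
        for (Map<String, Object> l : left) {
            Object key = l.get(leftAttr);
            List<Map<String, Object>> matches = (key == null) ? null : index.get(key);
            if (matches != null) {
                for (Map<String, Object> r : matches) {
                    result.add(new Pair(l, r));
                }
            }
        }
        return result;
    }

    /** Builds a stand-in Person feature with the two attributes used in the example query. */
    static Map<String, Object> person(Object id, Object spouse) {
        Map<String, Object> f = new HashMap<>();
        f.put("Identifier", id);
        f.put("spouse", spouse);
        return f;
    }
}
```

Both methods return the same pairs for the same input; the difference is only
in how much work and memory they need to get there.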

* query interface

Given that only some datastores can do joins efficiently, this is a good
candidate for QueryCapabilities, with the addition of a method
"isJoiningSupported". That interface change is relatively straightforward.
The one that is not is how to modify Query (if that is the way to go) to
support joins. I can think of a few different strategies:

1. Not modify it at all and come up with a new interface called
"JoinSupportingDataStore" or something that adds some new methods for joins.

2. Subclass Query and add some new join methods. Looking around, I noticed
that there is some code in app-schema that does just this, called
JoiningQuery.

3. Modify Query directly to add support for joins.

Thoughts? When I thought about the alternatives, (3) made the most sense,
especially given that this is how we already support other concepts that not
all datastores implement, like sorting.
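
As a rough illustration of the capability check, assuming the proposed
"isJoiningSupported" method: the classes below are hypothetical stand-ins
(SketchQueryCapabilities, SketchJdbcCapabilities, JoinCapabilityCheck), not
the real QueryCapabilities, and planFor is an invented helper showing how a
caller might branch on the flag.

```java
// Hypothetical stand-in for QueryCapabilities with the proposed method added.
class SketchQueryCapabilities {
    /** Default: the datastore cannot evaluate joins itself. */
    boolean isJoiningSupported() {
        return false;
    }
}

// Hypothetical capabilities of a store that can join natively (e.g. a jdbc one).
class SketchJdbcCapabilities extends SketchQueryCapabilities {
    @Override
    boolean isJoiningSupported() {
        return true;
    }
}

class JoinCapabilityCheck {
    /** Decides how a query carrying the given number of joins would be handled. */
    static String planFor(SketchQueryCapabilities caps, int joinCount) {
        if (joinCount == 0) {
            return "plain query";
        }
        if (caps.isJoiningSupported()) {
            return "native join";
        }
        // Rejecting is the conservative choice; an in-memory fallback is another option.
        throw new UnsupportedOperationException("datastore does not support joins");
    }
}
```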

So I decided to go further with (3), and added a class called "Join", that
looks something like the following:

class Join {

  /** the feature type being joined to */

  String getTypeName();

  /** the attributes from the joined feature type to select */

  List<PropertyName> getProperties();

  /** the join filter */

  Filter getJoinFilter();

  /** additional filter to apply to the feature type being joined to */

  Filter getFilter();

}

Then it was a matter of modifying Query to add a new property:

class Query {

  List<Join> getJoins();

}

So with this api the above query would look something like this:

Query q = new Query("Persons");
q.setFilter(PropertyIsEqualTo(PropertyName("Identifier"), Literal(12345)));

Join j = new Join("Persons");
j.setJoinFilter(PropertyIsEqualTo(PropertyName("spouse"),
    PropertyName("Identifier")));
q.getJoins().add(j);

That is obviously simplified quite a bit... there are still a few things to
iron out, like handling name clashes, etc... but that would be the general
idea. Thoughts?
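
One way to picture the name clash problem in a self join is to give each side
an alias, much like the aliases="a b" attribute in the wfs query. The sketch
below is only an illustration under that assumption: SketchJoin, SketchQuery,
the alias fields and toSql are hypothetical names, filters are kept as
pre-encoded SQL text to keep it short, and the SQL shown is not necessarily
what the jdbc module generates.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed Join class, with an alias added.
class SketchJoin {
    final String typeName;
    final String alias;
    final String joinFilter; // kept as SQL text to keep the sketch small

    SketchJoin(String typeName, String alias, String joinFilter) {
        this.typeName = typeName;
        this.alias = alias;
        this.joinFilter = joinFilter;
    }
}

// Hypothetical stand-in for Query carrying a list of joins.
class SketchQuery {
    final String typeName;
    final String alias;
    final List<SketchJoin> joins = new ArrayList<>();
    String filter; // optional filter on the primary type, as SQL text

    SketchQuery(String typeName, String alias) {
        this.typeName = typeName;
        this.alias = alias;
    }

    /** Encodes the query as an inner join, qualifying each type with its alias. */
    String toSql() {
        StringBuilder sql = new StringBuilder("SELECT * FROM ");
        sql.append(typeName).append(' ').append(alias);
        for (SketchJoin j : joins) {
            sql.append(" INNER JOIN ").append(j.typeName).append(' ').append(j.alias)
               .append(" ON ").append(j.joinFilter);
        }
        if (filter != null) {
            sql.append(" WHERE ").append(filter);
        }
        return sql.toString();
    }
}
```

With "Persons" aliased as a and b, the spouse query above would come out as a
single statement joining the table to itself, with a.Identifier and
b.Identifier no longer ambiguous.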

* joined features

Another major question is what the result of a join should look like. Given
that the current return from a query is features, I thought it best to stick
with that and not come up with some new class to represent a tuple
(although maybe that is something worth considering). I thought of a few
different alternatives. To illustrate, consider two feature types:

f1 (name, geometry)
f2 (name, foo, geom)

1. Return a single feature with attributes from joined feature types "rolled
into it". So the resulting joined feature would look like:

  f'(name, geometry, name, foo, geom)

2. Return a single feature that contains attributes for joined features:

  f'(name, geometry, f2)

3. Return a single feature that contains attributes for all features in the
join

  f'(f1,f2)

All methods have their various issues. (1) for instance requires that we
break simple feature rules since we have two attributes with the same local
name.

(2) requires us to have attributes whose type is a SimpleFeatureType. I
don't think that technically violates simple feature rules, although
admittedly it is not something that happens often.

(3) is more or less the same as (2) but better represents the notion of the
"tuple". The question is what id to give to the feature, if any.

Pretty open to suggestions on this one... I imagine there is probably a
better solution than any of those three. In the end, with the prototype I
decided to go with (2). It seemed the least invasive.
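
For illustration, option (2) can be pictured with plain maps standing in for
features. This is only a sketch: NestedJoinFeatureSketch and joined are
hypothetical names, and using the joined type name as the attribute name is an
assumption, not necessarily what the prototype does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of option (2): the joined feature becomes one attribute
// of the primary feature, so its attribute names never clash with the primary ones.
class NestedJoinFeatureSketch {

    /** Copies the primary feature's attributes and appends the joined feature as a whole. */
    static Map<String, Object> joined(Map<String, Object> primary,
                                      String joinedTypeName,
                                      Map<String, Object> joinedFeature) {
        Map<String, Object> result = new LinkedHashMap<>(primary);
        // Assumption: the primary type has no attribute named like the joined type.
        result.put(joinedTypeName, joinedFeature);
        return result;
    }
}
```

With f1 (name, geometry) and f2 (name, foo, geom), the result has the
attributes name, geometry and f2, and the second "name" is only reachable
through the nested f2 value.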

* join types

Joins come in many flavors... inner vs outer, etc... The wfs spec specifies
that the semantics are that of an inner join. But I guess we could add some
notion of join type to the join class so that a user could specify which
type of join they want? Or maybe just stick with inner join since that is
the requirement and the most common case?
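
To show what a join type would change in practice, here is a small sketch
that joins two lists of ids on equality. JoinTypeSketch, Type and join are
hypothetical names, and LEFT_OUTER is just one example of a non-inner type;
none of this is part of the proposed Join class yet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the only difference between the two types is what
// happens to a left-hand value that finds no match on the right.
class JoinTypeSketch {

    enum Type { INNER, LEFT_OUTER }

    /** Returns "left,right" rows; an unmatched left id is kept as "left,null" for LEFT_OUTER. */
    static List<String> join(List<Integer> left, List<Integer> right, Type type) {
        List<String> rows = new ArrayList<>();
        for (Integer l : left) {
            boolean matched = false;
            for (Integer r : right) {
                if (l.equals(r)) {
                    rows.add(l + "," + r);
                    matched = true;
                }
            }
            if (!matched && type == Type.LEFT_OUTER) {
                rows.add(l + ",null");
            }
        }
        return rows;
    }
}
```

With an inner join the unmatched ids simply drop out of the result, which is
the behavior the wfs spec asks for.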

That is about it for now... sorry, it's a lot of random thoughts, I know. I
currently have a basic implementation working in the jdbc module. It needs
testing and has to handle some more special cases, but with it I have been
able to do a variety of joins, both "standard" and spatial.

Thoughts and feedback welcome. Thanks folks.

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel