Just have time to add to your random thoughts ...

With respect to cross datastore joins: a current project gives me scope to 
join you on this work (or to help clean up some of the mess, depending on your 
deadlines). My focus is pretty narrow: enabling a shapefile to be joined with 
a CSV file by ID.
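
To make that narrow case concrete, here is a minimal sketch of an ID join done 
in memory. It is only an illustration under assumptions: rows are plain maps 
rather than GeoTools features, and the names IdJoinSketch and joinById are 
made up for this mail. The CSV side is indexed by ID so each shapefile record 
needs a single lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rows are plain maps, not GeoTools features.
final class IdJoinSketch {

    // Inner join: one output row per (feature, csv row) pair with equal key values.
    static List<Map<String, Object>> joinById(
            List<Map<String, Object>> features,
            List<Map<String, Object>> csvRows,
            String key) {
        // Index the CSV side by ID so each feature needs only one lookup.
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> row : csvRows) {
            Object id = row.get(key);
            if (id != null) {
                index.computeIfAbsent(id, k -> new ArrayList<>()).add(row);
            }
        }
        List<Map<String, Object>> joined = new ArrayList<>();
        for (Map<String, Object> feature : features) {
            Object id = feature.get(key);
            if (id == null) {
                continue; // a missing ID never matches
            }
            List<Map<String, Object>> matches =
                    index.getOrDefault(id, Collections.emptyList());
            for (Map<String, Object> row : matches) {
                Map<String, Object> out = new LinkedHashMap<>(feature);
                // On a name clash the feature attribute wins; CSV columns only fill gaps.
                row.forEach(out::putIfAbsent);
                joined.add(out);
            }
        }
        return joined;
    }
}
```

Keys are compared with equals(), so in this sketch a CSV ID read as the string 
"1" would not match a shapefile ID stored as the integer 1; some type 
conversion would be needed first.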

-------
There is an old proposal with code on this topic from David Zwiers (from the 
last time they tried to add it to WFS).

From what I remember:
- Query gained more methods to support "As" functionality; this resulted in a 
normal feature collection in which the attributes had been renamed. This was 
generally useful and not specific to "Join".
- A class Join was created that combined two queries, with a similar use of 
PropertyEquals to define the join.
- DataStore.getFeatures(Join) was created to allow people to issue the request.
- Repository.getFeatures(Join) was created for cross datastore joins. Today I 
would handle that as featureSource.join( FeatureSource, Join ).
- I like this a bit better than your proposal, as Query stays pretty focused and 
is reused during the join.
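
To illustrate the "As" renaming idea from the list above, here is a minimal 
sketch on plain maps. It is not the code from the old proposal and not the 
GeoTools Query API; AliasSketch and renameAttributes are hypothetical names. 
It only shows why renaming before a join is useful: two inputs that both 
carry a "name" attribute no longer collide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of "As" renaming on plain maps; not the GeoTools Query API.
final class AliasSketch {

    // Returns a copy of the row in which each aliased attribute carries its new name.
    static Map<String, Object> renameAttributes(
            Map<String, Object> row, Map<String, String> aliases) {
        Map<String, Object> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            // Attributes without an alias keep their original name.
            renamed.put(aliases.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return renamed;
    }
}
```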
--------
There was also the design of adding methods to feature source, allowing it to 
be used rather than feature collection. Personally I think it may be a smoother 
way to handle Join.

// query includes the "as" expressions used to rename attributes
FeatureSource left = featureSource.getFeatureSource( typeName1 ).query( query1 );
FeatureSource right = featureSource.getFeatureSource( typeName2 ).query( query2 );

// join is a method on feature source
FeatureSource join1 = left.join( right, List<PropertyEquals> );
FeatureSource join2 = left.join( right, Join );

--------
For your discussion on the different kinds of features that could result, I 
would assume that the resulting structure would match the xpath expressions 
used during the join? Or does the specification give us any feedback here?

-- 
Jody Garnett


On Tuesday, 14 June 2011 at 5:16 AM, Justin Deoliveira wrote:

> Hi all,
> 
> The last major bit of the wfs 2 work is joins. I wanted to start some 
> discussion here and post some questions with regard to the work.
> 
> So with the wfs protocol you can do queries now that look like this:
> 
> <wfs:Query typeNames="myns:Person myns:Person" aliases="a b">
> <fes:Filter>
> <fes:And> 
> <fes:PropertyIsEqualTo>
> <fes:ValueReference>a/Identifier</fes:ValueReference>
> <fes:Literal>12345</fes:Literal>
> </fes:PropertyIsEqualTo> 
> <fes:PropertyIsEqualTo>
> <fes:ValueReference>a/spouse</fes:ValueReference>
> <fes:ValueReference>b/Identifier</fes:ValueReference> 
> </fes:PropertyIsEqualTo>
> </fes:And> 
> </fes:Filter>
> </wfs:Query>
> 
> The result is a feature collection that does not contain only feature 
> members, but tuples of feature members. Something like:
> 
> <wfs:member>
> <wfs:Tuple>
> <wfs:member>
> <ns1:FeatureTypeOne>...</ns1:FeatureTypeOne> 
> </wfs:member>
> <wfs:member>
> <ns1:FeatureTypeTwo>...</ns1:FeatureTypeTwo> 
> </wfs:member> 
> </wfs:Tuple> 
> </wfs:member>
> 
> With that providing a bit of context I would like to bring up some points of 
> discussion.
> 
> * app-schema vs simple features
> 
> Knowing zero about app-schema currently, I believe there is the ability to 
> do joins via feature chaining. However, my impression is that these 
> relationships are configured beforehand and not really created on the fly? 
> Correct me if I am wrong.
> 
> So perhaps we could just say that we support joins with app-schema and call 
> it a day. However that said I do think there is a case for supporting joins 
> with simple features as well. And to be honest working with app-schema, 
> because of the learning curve, would be out of scope for this project.
> 
> * cross datastore joins
> 
> When talking about doing joins there are varying levels of complexity. For 
> instance, supporting joins of feature types within a jdbc datastore is one 
> thing. Supporting joining, say, a shapefile feature type to a jdbc feature 
> type is a totally different ball of wax. Doing cross datastore joins is 
> something I think would be neat... but far from trivial to do in a way that 
> scales. A much simpler problem would be joining two feature types within the 
> same datastore. However, unless the datastore is one that can do joins 
> natively (jdbc is really the only one here), it is still a hard problem. For 
> instance, consider attempting to join two Shapefile feature types from the 
> same datastore... doable, but again difficult to do in a non-naive way.
> 
> * query interface
> 
> That only some datastores can do joins efficiently makes this a good 
> candidate for QueryCapabilities, with the addition of a method 
> "isJoiningSupported". That interface change is relatively straightforward. 
> What is less straightforward is how to modify Query (if that is the way to 
> go) to support joins. I can think of a few different strategies:
> 
> 1. Not modify it at all and come up with a new interface called 
> "JoinSupportingDataStore" or something that adds some new methods for joins.
> 
> 2. Subclass Query and add some new join methods. Looking around I actually 
> notice that there is some code in app-schema that does just this called 
> JoiningQuery
> 
> 3. Modify Query directly to add support for joins
> 
> Thoughts? When I thought about the alternatives I thought (3) made the most 
> sense. Especially given how we support other concepts that are not supported 
> in all datastores like sorting.
> 
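Inline note: the capability check discussed above could gate a fallback 
path. The sketch below is an assumption-heavy illustration, not existing 
GeoTools code. Only the method name isJoiningSupported comes from the 
proposal; JoinCapabilities, JoinDispatchSketch and runJoin are hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical types; only the method name isJoiningSupported comes from the proposal.
final class JoinDispatchSketch {

    interface JoinCapabilities {
        boolean isJoiningSupported();
    }

    // Runs the native join when the store advertises support, else the fallback.
    static <T> T runJoin(
            JoinCapabilities caps, Supplier<T> nativeJoin, Supplier<T> fallbackJoin) {
        return caps.isJoiningSupported() ? nativeJoin.get() : fallbackJoin.get();
    }
}
```

This mirrors how a caller might treat sorting: ask the store first, and only 
do the work client side when the store cannot.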
> So I decided to go further with (3), and added a class called "Join", that 
> looks something like the following:
> 
> class Join {
> 
>  /** the feature type being joined to */
> 
>  String getTypeName();
> 
>  /** the attributes from the joined feature type to select */
> 
>  List<PropertyName> getProperties();
> 
>  /** the join filter */
> 
>  Filter getJoinFilter();
> 
>  /** additional filter to apply to the feature type being joined to */
> 
>  Filter getFilter();
> 
> }
> 
> And then it was a matter of modifying Query adding a new property.
> 
> class Query {
> 
>  List<Join> getJoins();
> 
> }
> 
> So with this api the above query would look something like this: 
> 
> Query q = new Query("Persons");
> q.setFilter(PropertyIsEqualTo(PropertyName("Identifier"), Literal(12345)));
> 
> Join j = new Join("Persons"); 
> j.setJoinFilter(PropertyIsEqualTo(PropertyName("spouse"), 
> PropertyName("Identifier")));
> q.getJoins().add(j);
> 
> That is obviously simplified quite a bit... there are still a few things to iron 
> out like handling name clashes, etc... but that would be the general idea. 
> Thoughts? 
> 
> * joined features
> 
> Another major question is what should the result of a join look like? Given 
> that the current return from a query is features, I thought it best to stick 
> with that and not come up with some new class to represent a tuple 
> (although maybe that is something worth considering). I thought of a few 
> different alternatives. To illustrate, consider two feature types: 
> 
> f1 (name, geometry)
> f2 (name, foo, geom)
> 
> 1. Return a single feature with attributes from joined feature types "rolled 
> into it". So the resulting joined feature would look like: 
> 
>  f'(name, geometry, name, foo, geom)
> 
> 2. Return a single feature that contains attributes for joined features:
> 
>  f'(name, geometry, f2) 
> 
> 3. Return a single feature that contains attributes for all features in the 
> join
> 
>  f'(f1,f2)
> 
> All methods have their various issues. (1) for instance requires that we 
> break simple feature rules since we have two attributes with the same local 
> name. 
> 
> (2) requires us to have attribute types that are SimpleFeatureType. Which I 
> don't think technically violates simple feature rules although admittedly not 
> something that happens often.
> 
> (3) is more or less the same as (2) but better represents the notion of the 
> "tuple". The question is what id to give to the feature, if any?
> 
> Pretty open to suggestions on this one... I imagine there is probably a 
> better solution than any of those three. In the end, with the prototype I 
> decided to go with (2). It seemed the least invasive. 
> 
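Inline note: the three result shapes above can be compared side by side with 
a small sketch. Features are modelled as plain maps here, which is an 
assumption for illustration only; JoinedShapeSketch and its methods are 
hypothetical names, not part of the prototype.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: features are plain maps so the three shapes can be compared.
final class JoinedShapeSketch {

    // (1) Roll every attribute into one flat feature; equal names collide.
    static Map<String, Object> rolledUp(Map<String, Object> f1, Map<String, Object> f2) {
        Map<String, Object> out = new LinkedHashMap<>(f1);
        out.putAll(f2); // f2's "name" silently overwrites f1's "name"
        return out;
    }

    // (2) Keep the primary feature's attributes and nest the joined feature.
    static Map<String, Object> nested(
            Map<String, Object> f1, String joinedName, Map<String, Object> f2) {
        Map<String, Object> out = new LinkedHashMap<>(f1);
        out.put(joinedName, f2);
        return out;
    }

    // (3) A tuple-like feature whose only attributes are the joined features.
    static Map<String, Object> tuple(Map<String, Object> f1, Map<String, Object> f2) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("f1", f1);
        out.put("f2", f2);
        return out;
    }
}
```

With f1 (name, geometry) and f2 (name, foo, geom), shape (1) ends up with four 
attributes in this map-based model instead of five, which is the name clash 
problem in miniature; shapes (2) and (3) keep both "name" values.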
> * join types
> 
> Joins come in many flavors... inner vs outer, etc... The wfs spec specifies 
> that the semantics are that of an inner join. But I guess we could add some 
> notion of join type to the join class so that a user could specify which type 
> of join they want? Or maybe just stick with inner join since that is the 
> requirement and the most common case? 
> 
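Inline note: if a join type were added, the difference between inner and 
left outer comes down to what happens to unmatched rows. The sketch below is 
a simplified illustration under assumptions: JoinType and JoinTypeSketch are 
hypothetical, the left side is a list of keys, and the right side is a 
key-to-value map with at most one match per key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a join-type switch; JoinType is not an existing GeoTools type.
final class JoinTypeSketch {

    enum JoinType { INNER, LEFT_OUTER }

    // Pairs each left key with its matching right value, formatted as "key=value".
    static List<String> join(List<String> leftKeys, Map<String, String> right, JoinType type) {
        List<String> out = new ArrayList<>();
        for (String key : leftKeys) {
            String match = right.get(key);
            if (match != null) {
                out.add(key + "=" + match);
            } else if (type == JoinType.LEFT_OUTER) {
                out.add(key + "=null"); // unmatched left rows survive an outer join
            }
            // INNER: unmatched left rows are dropped
        }
        return out;
    }
}
```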
> That is about it for now... sorry, it's a lot of random thoughts, I know. I 
> currently have a basic implementation working in the jdbc module. It needs 
> testing and to handle some more special cases but with it I have been able to 
> do a variety of joins, both "standard" and spatial. 
> 
> Thoughts and feedback welcome. Thanks folks.
> 
> -Justin 
> 
> --
> Justin Deoliveira
> OpenGeo - http://opengeo.org
>  Enterprise support for open source geospatial.
> ------------------------------------------------------------------------------
> EditLive Enterprise is the world's most technically advanced content
> authoring tool. Experience the power of Track Changes, Inline Image
> Editing and ensure content is compliant with Accessibility Checking.
> http://p.sf.net/sfu/ephox-dev2dev
> _______________________________________________
> Geotools-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/geotools-devel
