On 05/02/2012 09:56 PM, Jason Roberts wrote:
Hi Ari,

I have been following this proposal with interest. I have a couple of
questions; sorry to bring them up late in the RFC process.

It looks like the inspiration for this was the Overlay toolset in the ArcGIS
geoprocessing toolbox. That is nice for people like me who regularly work
with ArcGIS; it increases the possibility of being able to replace
automation currently based on ArcGIS with something based on GDAL. Assuming
you are aiming for basic parity with ArcGIS:

ArcGIS is an example which is assumably pretty good, and I'm not aware of any standards or better ones so that's the reason. Personally, I don't use ArcGIS, so I can't say much.


1. Would you consider implementing a Symmetrical Difference method? It is
the only one from the ArcGIS Overlay toolset that you did not implement
(except Spatial Join, which does not really apply here).

I'd be happy to. Can you point me to a description of the method?


2. In ArcGIS, the Intersect and Union tools support multiple inputs. Is that
something you would consider implementing? I don't have a strong opinion on
this. It is occasionally necessary to use all of these tools with multiple
inputs (e.g. erase several layers from a starting layer). I'm not
immediately sure why ESRI elected to support multiple inputs for two of
their tools but not more of them.

We've chosen to implement these tools as C = A.method(B), i.e., they always produce a result layer and not change A nor B (the C has to be created before the actual method). For Update, Clip and Erase it might make sense (as the schema of the result layer is only that of input layer) to have the method be an in-place method: A.method(B), i.e., change A. Then it would be trivial to write

for all B in (a list of method layers)
    A.method(B)

I would in fact support this syntax as it is more versatile and saves memory in some cases. The downside is that it does not work with read-only data sources and thus is probably not suitable for GDAL.

In Intersection, Union and Identity the schema of the result may contain attributes from the method layer, thus IMO they do not fit so well into the in-place syntax.

In the current syntax, to run a method on several method layers:

for all B in (a list of method layers)
    C = new Layer like A
    A.method(B, C)
    delete A
    A = C

It really seems to me that this would be practical only for memory layers. Maybe the support for multiple layers should be left for the command line tool or scripting language layer.

For example I'll probably build a support into the Perl bindings for the in-place syntax.


3. Will these functions be exposed from the Python bindings?

They are already, if you apply the patch. The test code is written in Python.

Thanks for the interest.

Ari


Best regards,

Jason

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Ari Jolma
Sent: Wednesday, April 25, 2012 9:33 AM
To: [email protected]
Subject: [gdal-dev] RFC 39

Folks,

I'll start a new thread for this RFC.

I've uploaded a new patch, which hopefully corrects the issues Even brought
up. Below are comments.

I've not added the Append or Buffer methods, which basically work on just
one layer. I believe such functionality is simple enough to create using
feature iteration.

Ari

the patch:
http://trac.osgeo.org/gdal/attachment/wiki/rfc39_ogr_layer_algebra/rfc39.pat
ch


On 04/19/2012 02:04 PM, Even Rouault wrote:
http://trac.osgeo.org/gdal/wiki/rfc39_ogr_layer_algebra

Ari,

I have reviewed your proposal and it looks interesting. Here are my
observations
:

1) Argument order. A.Operation(B, C) where C is the target layer
doesn't seem very intuitive, but I'm not sure I have something
fundamentally better to propose. An alternate solution would be to
make the operations static methods, and have
OGRLayer::Operation(output_layer, input_layer, method_layer) ? I'd
suggest at least to rename B as method_layer and C as result_layer to be
more explicit on the role of each argument.


I've changed the names to Input, Method and Output. For example the layer
object pointers are pLayerInput etc.



2) It would perhaps be prudent to add a char** papszOptions argument as a
provision for future needs.
I've added it.


3) I'm wondering if it would not be more appropriate to separate the
creation of
fields of the output layer in a separate method that might be called, or
not,
before the real operation. For example IdentifyPrepare() for Identity().
Of
course this would complicate a bit the implementation of the operation
because
it will need to look for which fields exist in the output layer and how
the
input and method layer fields map to those fields. But the establishment
of the
mapping and the setting of the output feature from the fields of the input
and
method features could be made in common methods

     This way the XXXXXPrepare() would not need to check that CreateField()
succeeds because XXXX() could work with an output layer with no fields at
all.
Checking that there are not fields with same name would not be necessary
(in
that case the value from B would override the value from A).

I've created a new method into Feature class: SetFieldsFrom, which uses
a map. It is similar to SetFrom, which uses a map. (thus the field copy
could be optimized into one function)

The layer methods now first check the existence of fields in the result
layer. If there are fields, it is assumed that that is what the user
wants. If there are no fields, a complete set of fields are created for
fields in input and method layers (in some cases only input layer). This
behavior could later be overridden with options.


4) It would be nice to able to provide a progress callback to get progress
report (and also being able to interrupt) for such potentially long
operations.


I agree. I've added these to the API, but I've not yet added the calls.
This requires for example the inclusion of gdal.h into ogr_api.h.


5) A few remarks on the implementation itself (only code review, no actual
testing). You can likely wait for comments from others before heading to
fix
them...

a) CreateField() and CreateFeature() return code should be tested and
acted
upon. Likely interrupt the whole process ? (not strictly necessary for
CreateField() is you adopt the idea developed in point 3))

I've added tests. I've used the "goto done;" style, where done is a
label just before the release of used resources and exit from the
method. I've tried to be careful to leave no memory leaks.


b) I think there's something wrong with OGRGeometry *Bfilter =
B->GetSpatialFilter(). When you later do
B->SetSpatialFilter(x_geom->Intersection(Bfilter)), I believe it will
destroy
the previous installed filter, and thus Bfilter will point to a garbage
object.
You likely need to clone the initial spatial filter (and delete it after
restoring it at the end of the method)

Yes. I checked all calls to GetSpatialFilter and now clone the initial
one and destroy it at the end.


c) Each time you do a new OGRFeature(), you need do delete it (after the
call to
CreateFeature()).
They are now deleted.

d) GetGeometryRef() can return NULL (even for spatial layers). Currently
it is
never checked, and there are a few places where it is dereferenced.
The validity of the geometry is checked. If there is no geometry, the
next feature is fetched.

e) Lines like x_geom_diff = x_geom_diff->Difference(y_geom) (or geom =
geom->Union(y_geom)) will leak the initial x_geom_diff object. You likely
need
to do :
        OGRGeometry* x_geom_diff_new = x_geom_diff->Difference(y_geom);
        delete x_geom_diff;
        x_geom_diff = x_geom_diff_new;
Fixed.

f) The object returned by GetNextFeature() should be deleted after use.
It is deleted, even if the iteration is continued.

g) You should be careful with SetGeometryDirectly(). You cannot pass
directly
the geometry belonging to another feature because this will segfault when
you
free the said feature.
Fixed

h) Nothing wrong, but calls to UnsetField() are unnecessary for a feature
that
was just created and never set before (all the lines z->UnsetField(...) )
The unnecessary code is removed.

_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev

_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev

_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Reply via email to