Thanks! Looks like a great start.

Some questions/considerations:
# In case of multi-level grids, I'd assume that the option is
specified only on the mapping, and doesn't need to be specified at
query time?
# You're not using anything fancy from the Lucene side, correct? Just
exact term matches with boolean operators?
# Who are those geosuckers?
# I would consider it very useful to provide scoring according to
distance. It shouldn't be hard to write a custom Similarity that does
that; writing one that is actually fast enough might be much harder,
but I believe there are several examples in the Lucene community.
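
To make the scoring idea a bit more concrete, here is a rough sketch of
just the math, not a Lucene Similarity. It assumes a spherical Earth
(mean radius 6371 km), the haversine formula for the distance, and a
linear falloff from 1 at the center to 0 at the radius. The class and
method names (ProximityScore, distanceKm, score) are made up for the
example and are not part of the branch.

```java
// Hypothetical helper: distance-based score, independent of Lucene.
final class ProximityScore {
    // Mean Earth radius in kilometres (spherical approximation, an assumption).
    static final double EARTH_RADIUS_KM = 6371.0;

    private ProximityScore() {}

    /** Great-circle distance in km between two points given in degrees (haversine). */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding pushing sqrt(a) above 1.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Linear score: 1 at the center, 0 at or beyond the radius. */
    static double score(double distanceKm, double radiusKm) {
        if (radiusKm <= 0 || distanceKm >= radiusKm) {
            return 0.0;
        }
        return 1.0 - distanceKm / radiusKm;
    }
}
```

Computing this per matching document is the part that could get slow,
which is why the fast version is the hard one.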

To answer your question on "centeredOn", I'd avoid a strong dependency
on JTS. The "Object" option you propose could be an interface, with a
factory class that returns instances of it; users should then be able
to use either the JTS-enabled factory or a dumb one.
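
Roughly what I have in mind, as a sketch only. Coordinates is
re-declared here with the two getters from your proposal so the example
stands alone; CoordinatesFactory and SimpleCoordinatesFactory are
hypothetical names, and accepting a double[] of {latitude, longitude}
is just an assumption for the "dumb" case. A JTS-enabled factory would
implement the same interface from an optional module, so core would
never reference JTS types.

```java
// Same shape as the Coordinates interface in the proposal.
interface Coordinates {
    double getLatitude();
    double getLongitude();
}

// Hypothetical: converts whatever was passed to centeredOn(Object) into Coordinates.
interface CoordinatesFactory {
    Coordinates from(Object source);
}

// Hypothetical "dumb" factory with no JTS dependency.
// Accepts an existing Coordinates or a double[] of {latitude, longitude}.
final class SimpleCoordinatesFactory implements CoordinatesFactory {
    @Override
    public Coordinates from(Object source) {
        if (source instanceof Coordinates) {
            return (Coordinates) source;
        }
        if (source instanceof double[] && ((double[]) source).length == 2) {
            final double[] latLon = (double[]) source;
            return new Coordinates() {
                @Override
                public double getLatitude() { return latLon[0]; }
                @Override
                public double getLongitude() { return latLon[1]; }
            };
        }
        throw new IllegalArgumentException("Unsupported center type: "
                + (source == null ? "null" : source.getClass().getName()));
    }
}
```

The DSL would then only ever see Coordinates, whichever factory the
user plugged in.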

Sanne

On 5 December 2011 15:21, Emmanuel Bernard <emman...@hibernate.org> wrote:
> Nicolas and I have made good progress on Geospatial queries for Hibernate 
> Search.
>
> # Geospatial indexing and queries
>
> Our goal is to give a reasonable but pragmatic answer to geoloc queries. We 
> do not try and implement the most obscure geo-projection favored by ancient 
> Greeks, we do not try and find matching elements within a triangular-shaped 
> donut on Mars' surface etc. We have purposely limited the current 
> implementation to:
>
> - find matching elements in a circle (we plan to extend this to matching 
> elements in a rectangle if there is popular demand, but in our opinion this 
> would not be useful, or would rather be misleading)
> - use the internationally accepted geo projection as it is i18n neutral and 
> not centered on one particular country. We can plan on opening to other 
> projections if the need arises (esp. if data points are provided in different 
> projections).
>
> We made sure to expose as few gory details as possible.
>
> That being said, here is more information, along with some questions.
>
> The JIRA is https://hibernate.onjira.com/browse/HSEARCH-923
> The branch is 
> https://github.com/emmanuelbernard/hibernate-search/tree/HSEARCH-923
>
> ## How is geoloc data exposed to the domain model?
>
> We plan on supporting three approaches:
>
> ### Special interface and embeddable object
>
> Using a specific interface as the property return type: 
> `o.h.s.spatial.Coordinates`
>
>    @Indexed
>    public class Address {
>        @Field String city;
>        @Spatial Coordinates location = new Coordinates() {
>            public double getLatitude() { ... }
>            public double getLongitude() { ... }
>        }
>    }
>
> ### Special interface implemented by the entity
>
> Using a specific interface implemented by the entity: 
> `o.h.s.spatial.Coordinates`
>
>    @Indexed @Spatial
>    public class Address {
>        @Field String city;
>
>        public double getLatitude() { ... }
>        public double getLongitude() { ... }
>    }
>
> ### Use JTS's Point type
>
> Use `Point` as the spatial property type.
>
> ### (maybe) `double` hosted by two unrelated properties
>
> The problem is to find a nice way to bind these properties to the spatial 
> data.
>
> ## How is geoloc data indexed
>
> There will be two strategies
>
> - index latitude and longitude as is and do an AND query matching both. This 
> typically works nicely for small datasets.
> - index latitude and longitude as matching a 15-level grid (from most generic 
> to most specific). This typically works nicely for big datasets.
>
> ## Query DSL
>
> We have worked on adding a fluent spatial API to the current query DSL. Before 
> we go on implementing it, we would like your feedback. Some points remain 
> open.
>
> ### General overview
>
>    builder.spatial()
>        .scoreByProximity() //not implemented yet
>        .onField("coord")
>            .boostedTo(2)
>        .within(2).km()
>            .of( coordinates )
>        .createQuery();
>
> ### onField
>
> onField is not a good name. It has a slightly different meaning than when 
> it's used in range().onField(). We need to find a better name:
>
> - onField
> - onGrid
> - onCoordinates
> - onLocation
>
> This really represents the metadata set where the location will be stored. In 
> the boolean approach, we store latitude and longitude. In the grid approach, 
> we store latitude, longitude and the set of grids the coordinates belong to.
>
> .onField() does accept a field name which can be the `Coordinates` property 
> or the virtual field used by the class-level bridge (if lat and long are top 
> level properties).
>
> When latitude and longitude are independent properties, we would use
>
>    builder.spatial()
>        .onLatitudeField("lat")
>        .andLongitudeField("long")
>
> ### Surface checked
>
> #### Option 1: centeredOn
>
>    .centeredOn(double, double)
>    //or
>    .centeredOn()
>      .latitude(double)
>      .longitude(double)
>    //or
>    .centeredOn(SpatialIndexable)
>    .centeredOn(JTS.Point) // hard dependency on JTS even for non spatial users :(
>    .centeredOn(Object) //? to avoid JTS dep
>
> - Should we have a version accepting Object?
> - What is best, centeredOn(double, double) or 
> centeredOn().latitude(double).longitude(double)?
>
> #### Option 2: in / within
>
>
>    //query within circle
>    b.spatial()
>        .onField("coord")
>        .within(2).km()
>        .of(SpatialIndexable)
>
>        .within(2).km()
>        .of()
>            .latitude()
>            .longitude()
>         .createQuery()
>
>   //or with a different unit handling
>
>    //query within circle
>    b.spatial()
>        .onField("coord")
>        .within(2, Unit.km)
>        .of(SpatialIndexable)
>
>        .within(2, Unit.km)
>        .of()
>            .latitude()
>            .longitude()
>         .createQuery()
>
> My reason to support units is that a. it's explicit and b. when those 
> geosuckers improve, we could support time units like mins or hours. Note, 
> that's a very hard problem to crack and solutions are resource intensive and 
> not very accurate. None really do it correctly, not Google for sure.
>
>
> We could support rectangles / boxes if really needed
>
>    //query in box
>    b.spatial()
>        .onField("coord")
>        .inBox()
>            .from()
>            .to()
>         .createQuery();
>
>    //more formal but more correct wrt projection
>    b.spatial()
>        .onField("coord")
>        .inBox()
>            .withUpperLeft()
>            .withLowerRight()
>         .createQuery();
>
>
> Please give us your feedback.
>
> ## TODOs
>
> - Implement fluent DSL
> - Implement "Special interface implemented by the entity"
> - Implement "Use JTS's Point type"
> - Implement bridge that supports indexing for boolean queries based on lat 
> and long instead of the grid.
> - Implement @Spatial as a marker annotation for the spatial bridge
> - Implement variable score based on proximity
>  Today we use a constant score, i.e. in = 1, out = 0. We can think about a 
> score that goes from 1 to 0 as the distance from the center grows to the 
> circle radius.
>  We can imagine queries that should return close elements above far elements.
>  Note we might need a score going from 1 to .5 or some other value. Need to 
> think about that.
> - Write how-to focused doc
> - Write doc on perf comparing grid vs boolean queries
> - Convert to JBoss logging
> - Add unit test using faceting + spatial queries
>
> Emmanuel
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
