Thanks! Looks like a great start. Some questions/considerations:

# In case of multi-level grids, I'd assume that the option is specified only on the mapping, but doesn't need to be specified at Query time?
# Not using anything fancy from the Lucene side, correct? Just exact term matches with boolean operators?
# Who are those geosuckers?
# I would consider it very useful to provide scoring according to distance. It shouldn't be hard to make a custom Similarity which does that; it might be much harder to create one which is actually fast enough, but I believe there are several examples in the Lucene community.
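To make the distance-scoring point a bit more concrete, here is a minimal sketch of the arithmetic only, under stated assumptions: a linear falloff from 1 at the center to 0 at the edge of the search circle (the shape mentioned in the TODO list quoted below), with the distance computed by the haversine formula on a spherical Earth. The class and method names (`ProximityScore`, `haversineKm`, `score`) are hypothetical and are not part of Hibernate Search or Lucene; wiring this into a custom Similarity, and making it fast enough, is deliberately left out.

```java
// Hypothetical helper, for illustration only; not an existing Hibernate Search or Lucene class.
final class ProximityScore {

    // Mean Earth radius in kilometres (spherical approximation, an assumption of this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    private ProximityScore() {
    }

    // Great-circle distance between two points given in decimal degrees (haversine formula).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * sinLon * sinLon;
        // Clamp to 1.0 to guard against rounding errors before asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Linear falloff: 1.0 at the center, 0.0 at or beyond the circle radius.
    static double score(double distanceKm, double radiusKm) {
        if (radiusKm <= 0) {
            throw new IllegalArgumentException("radiusKm must be positive");
        }
        return Math.max(0.0, 1.0 - distanceKm / radiusKm);
    }
}
```

A score going from 1 to .5 instead of 1 to 0 would only change the last line of `score`; the expensive part in practice is likely computing the distance per matching document, not the falloff itself.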
To answer your question on "centeredOn", I'd avoid a strong dependency on JTS. The "Object" option you propose could be an interface, and we create a factory class which returns instances of it; users should then be able to use the JTS-enabled factory or a dumb one.

Sanne

On 5 December 2011 15:21, Emmanuel Bernard <emman...@hibernate.org> wrote:
> Nicolas and I have made good progress on Geospatial queries for Hibernate
> Search.
>
> # Geospatial indexing and queries
>
> Our goal is to give a reasonable but pragmatic answer to geoloc queries. We
> do not try and implement the most obscure geo-projection favored by ancient
> Greeks, we do not try and find matching elements within a triangular-shaped
> donut on Mars' surface etc. We have purposely limited the current
> implementation to:
>
> - find matching elements in a circle (we have plans to extend to matching
>   elements in a rectangle if popular demand arises, but in our opinion this
>   will not be useful, or rather will be misleading)
> - use the internationally accepted geo projection as it is i18n neutral and
>   not centered on one particular country. We can plan on opening to other
>   projections if the need arises (esp. if data points are provided in
>   different projections).
>
> We made sure to expose as few gory details as possible.
>
> That being said, here is more information and some questions.
>
> The JIRA is https://hibernate.onjira.com/browse/HSEARCH-923
> The branch is
> https://github.com/emmanuelbernard/hibernate-search/tree/HSEARCH-923
>
> ## How is geoloc data exposed to the domain model?
>
> We plan on supporting three approaches:
>
> ### Special interface and embeddable object
>
> Using a specific interface as the property return type:
> `o.h.s.spatial.Coordinates`
>
>     @Indexed
>     public class Address {
>         @Field String city;
>         @Spatial Coordinates location = new Coordinates() {
>             public double getLatitude() { ... }
>             public double getLongitude() { ... }
>         };
>     }
>
> ### Special interface implemented by the entity
>
> Using a specific interface implemented by the entity:
> `o.h.s.spatial.Coordinates`
>
>     @Indexed @Spatial
>     public class Address implements Coordinates {
>         @Field String city;
>
>         public double getLatitude() { ... }
>         public double getLongitude() { ... }
>     }
>
> ### Use JTS's Point type
>
> Use `Point` as the spatial property type.
>
> ### (maybe) `double` hosted by two unrelated properties
>
> The problem is to find a nice way to bind these properties to the spatial
> data.
>
> ## How is geoloc data indexed
>
> There will be two strategies:
>
> - index latitude and longitude as is and do an AND query matching both. This
>   typically works nicely for small datasets.
> - index latitude and longitude as matching a 15 level grid (from most generic
>   to most specific). This typically works nicely for big datasets.
>
> ## Query DSL
>
> We have worked on adding a fluent spatial API to the current query DSL.
> Before we go on implementing it, we would like your feedback. Some points
> remain open.
>
> ### General overview
>
>     builder.spatial()
>         .scoreByProximity() //not implemented yet
>         .onField("coord")
>         .boostedTo(2)
>         .within(2).km()
>         .of( coordinates )
>         .createQuery();
>
> ### onField
>
> onField is not a good name. It has a slightly different meaning than when
> it's used in range().onField(). We need to find a better name:
>
> - onField
> - onGrid
> - onCoordinates
> - onLocation
>
> This really represents the metadata set where the location will be stored. In
> the boolean approach, we store latitude and longitude. In the grid approach,
> we store latitude, longitude and the set of grids the coordinates belong to.
>
> .onField() accepts a field name which can be the `Coordinates` property
> or the virtual field used by the class-level bridge (if lat and long are top
> level properties).
>
> When latitude and longitude are independent properties, we would use
>
>     builder.
>         .onLatitudeField("lat")
>         .andLongitudeField("long")
>
> ### Surface checked
>
> #### Option 1: centeredOn
>
>     .centeredOn(double, double)
>     //or
>     .centeredOn()
>         .latitude(double)
>         .longitude(double)
>     //or
>     .centeredOn(SpatialIndexable)
>     .centeredOn(JTS.Point) // hard dependency on JTS even for non spatial users :(
>     .centeredOn(Object) //? to avoid JTS dep
>
> - Should we have a version accepting Object?
> - What is best, centeredOn(double, double) or
>   centeredOn().latitude(double).longitude(double)?
>
> #### Option 2: in / within
>
>     //query within circle
>     b.spatial()
>         .onField("coord")
>         .within(2).km()
>             .of(SpatialIndexable)
>
>         .within(2).km()
>             .of()
>                 .latitude()
>                 .longitude()
>         .createQuery()
>
>     //or with a different unit handling
>
>     //query within circle
>     b.spatial()
>         .onField("coord")
>         .within(2, Unit.km)
>             .of(SpatialIndexable)
>
>         .within(2, Unit.km)
>             .of()
>                 .latitude()
>                 .longitude()
>         .createQuery()
>
> My reasons to support units are that a. it's explicit and b. when those
> geosuckers improve, we could support time units like mins or hours. Note
> that this is a very hard problem to crack and solutions are resource
> intensive and not very accurate. None really do it correctly, not Google for
> sure.
>
> We could support rectangles / boxes if really needed
>
>     //query in box
>     b.spatial()
>         .onField("coord")
>         .inBox()
>             .from()
>             .to()
>         .createQuery();
>
>     //more formal but more correct wrt projection
>     b.spatial()
>         .onField("coord")
>         .inBox()
>             .withUpperLeft()
>             .withLowerRight()
>         .createQuery();
>
> Please give us your feedback.
>
> ## TODOs
>
> - Implement fluent DSL
> - Implement Special interface implemented by the entity
> - Implement Use JTS's Point type
> - Implement bridge that supports indexing for boolean queries based on lat
>   and long instead of the grid.
> - Implement @Spatial as a marker annotation for the spatial bridge
> - Implement variable score based on proximity
>   Today we use a constant score, i.e. in = 1, out = 0. We can think about a
>   score that goes from 1 to 0 based on the distance from the center relative
>   to the circle size. We can imagine queries that should return close
>   elements above far elements. Note we might need a score going from 1 to .5
>   or some other value. Need to think about that.
> - Write how-to focused doc
> - Write doc on perf comparing grid vs boolean queries
> - Convert to JBoss logging
> - Add unit test using faceting + spatial queries
>
> Emmanuel
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev