Hello Everyone,

I recently merged a longstanding branch and wanted to share the underlying idea.

Some time ago, I made a few changes to Apache Calcite to improve support for 
spatial data. Apache Calcite comes with a robust SQL parser, a query optimizer, 
and clean abstractions for plugging in custom data sources.

My initial idea was to provide a faster solution than PostGIS for preparing map 
data. A memory-mapped file adapter plugged into Calcite sounded like a great 
solution, but this goal was probably a bit too ambitious. As an intermediate 
step, I created adapters for all the file data sources currently implemented in 
Baremaps (Shapefile, FlatGeobuf, GeoPackage, PostGIS, etc.), and it is now 
possible to use SQL as an abstraction to move data around (e.g., from files to 
PostGIS).

For instance, a Shapefile can now be imported into PostGIS as follows. In this 
case, the PostgresDdlExecutor uses PostgreSQL’s COPY API to ensure good 
performance, and most ST_ functions (reprojection, buffering, etc.) should 
work, as they are available in Calcite [1]. Pretty cool, right? ;-)

    // Set up the Calcite connection properties
    Properties info = new Properties();
    info.setProperty("lex", "MYSQL");
    info.setProperty("caseSensitive", "false");
    info.setProperty("unquotedCasing", "TO_LOWER");
    info.setProperty("quotedCasing", "TO_LOWER");
    info.setProperty("parserFactory",
        PostgresDdlExecutor.class.getName() + "#PARSER_FACTORY");

    try (Connection connection =
        DriverManager.getConnection("jdbc:calcite:", info)) {
      CalciteConnection calciteConnection =
          connection.unwrap(CalciteConnection.class);
      SchemaPlus rootSchema = calciteConnection.getRootSchema();

      // Create a ShapefileTable instance
      ShapefileTable shapefileTable = new ShapefileTable(SAMPLE_SHAPEFILE);

      // Register the shapefile table in the Calcite schema
      rootSchema.add(SHAPEFILE_TABLE_NAME, shapefileTable);

      // Create a table in PostgreSQL by selecting from the shapefile table
      String createTableSql = "CREATE TABLE " + IMPORTED_TABLE_NAME + " AS " +
          "SELECT * FROM " + SHAPEFILE_TABLE_NAME;

      // Execute the DDL statement to create the table
      try (Statement statement = connection.createStatement()) {
        statement.execute(createTableSql);
      }
    }
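
Since the ST_ functions are evaluated by Calcite, transformations can be 
applied on the fly during an import. Here is a minimal sketch of such a 
statement; the table and column names (countries, geom) are hypothetical, and 
the exact signatures of the ST_ functions are the ones documented in the 
Calcite reference [1]:

    -- Hypothetical table/column names, for illustration only.
    -- Reprojects the geometries while importing; see the Calcite
    -- reference [1] for the exact ST_Transform signature.
    CREATE TABLE countries_3857 AS
    SELECT name, ST_Transform(geom, 3857) AS geom
    FROM countries;

Such a statement could be passed to the connection above in place of the 
plain CREATE TABLE ... AS SELECT *.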

All the import tasks in the workflows have been adapted to this new approach, 
and a few integration tests have been added to the codebase. That said, these 
changes may introduce bugs or performance regressions in the main branch; 
they will be addressed before the next release.

Feel free to share your feedback or let me know if you have any cool use cases 
for this in mind. As mentioned, one of my upcoming goals is to leverage the 
memory-mapped data structures [2] implemented in the baremaps-data module to 
speed up workflow execution on large spatial datasets.

Wish you all a good weekend,

Bertil

[1] - 
https://calcite.apache.org/docs/reference.html#geometry-conversion-functions-2d
[2] - 
https://github.com/apache/incubator-baremaps/tree/main/baremaps-data/src/main/java/org/apache/baremaps/data/collection
