Hi dev team,

I am working with Riyafa on adding support for coordinate reference systems (CRS) to AsterixDB geometries. A CRS describes how data is projected from physical space to geometry coordinates, and it is essential for many data science and GIS analytics projects.

We plan to implement this with a central CRS dataset. Each CRS will be identified by an integer Spatial Reference Identifier (SRID) that we will add to each geometry. This reduces the storage overhead and speeds up retrieval of a geometry's CRS.

My question is about the best way to store the CRS information in that central table. Here are our constraints and requirements.

1. This information should be highly available. It will be accessed frequently by worker nodes during data processing.
2. The CRS table should be consistent across all machines so that the SRID->CRS mapping is also consistent.
3. We might need to update the table occasionally, e.g., while loading data from an external source, so that we can parse the external CRS and use it appropriately.
4. The table is not expected to be very large. The standard CRS database contains fewer than 32,000 records, and we might extend it occasionally from external sources.
5. The CRS table must be durable and will need to be loaded back upon system restart.
6. There is a serialized form for CRS, but it would be much more efficient to keep the CRSes as Java objects to reduce the parsing overhead.

Do you have any recommendations on the best way of storing such a table? Is the catalog the right place to keep this information?

Ahmed Eldawy
<http://www.cs.ucr.edu/~eldawy>
<https://star.cs.ucr.edu>
Tel: +1 (951) 827-5654
