isaaccorley opened a new pull request, #479:
URL: https://github.com/apache/sedona-db/pull/479

   Adds support for reading GeoParquet files from Azure Blob Storage.
   
   ## Changes
   
   - New `azure` feature flag in `rust/sedona/Cargo.toml` enabling `object_store/azure`
   - `AzureOptions` struct supporting common auth methods:
     - `account_name`, `sas_token`, `access_key`
     - `bearer_token`, `client_id`, `client_secret`, `tenant_id`, `authority_id`
   - URL scheme support for `az://`, `abfs://`, `abfss://`
   - Fixed GeoParquet metadata parsing for files missing the `geometry_types` field (e.g., MS Building Footprints)
   - Fixed URL extension detection to strip query params before checking the file type
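As a rough illustration of the last two fixes, here is a minimal Python sketch. It is not the PR's Rust implementation: `url_extension` and `geometry_types` are hypothetical helper names, and treating a missing `geometry_types` field as an empty list (meaning "unknown") is an assumption made for this example.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_extension(url: str) -> str:
    """Return the lowercase file extension of a URL, ignoring query and fragment."""
    # urlsplit separates the path from "?query" and "#fragment", so a SAS token
    # appended to the URL cannot be mistaken for part of the file name.
    path = urlsplit(url).path
    return PurePosixPath(path).suffix.lower()


def geometry_types(geo_metadata_json: str, column: str) -> list:
    """Return a column's geometry types, tolerating a missing field."""
    columns = json.loads(geo_metadata_json)["columns"]
    # Assumption for this sketch: an absent field is treated as an empty list,
    # i.e. "geometry types unknown", instead of raising an error.
    return columns[column].get("geometry_types", [])
```

A naive `url.endswith(".parquet")` check fails as soon as a SAS token is appended to the URL, which is the situation the extension fix addresses.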
   
   ## Motivation
   
   I wanted to query Microsoft Planetary Computer datasets (MS Building Footprints, etc.) directly from SedonaDB. These datasets are hosted on Azure Blob Storage and use SAS token authentication.
   
   ## Usage
   
   ```python
   import sedonadb
   
   sd = sedonadb.connect()
   df = sd.read_parquet(
       "abfss://[email protected]/path/",
       options={
           'azure.account_name': 'blobstorage',
           'azure.sas_token': 'sv=2023-01-03&st=...'
       }
   )
   ```
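To keep credentials out of source code, the `options` dict above could be built from environment variables. The sketch below is only an illustration: `azure_options_from_env` is a hypothetical helper, not part of SedonaDB, and the environment variable names are assumptions. Only the two option keys shown in the usage example are used.

```python
import os


def azure_options_from_env(env=None):
    """Build a read_parquet options dict from environment variables.

    Hypothetical helper; the variable names below are assumptions.
    """
    env = os.environ if env is None else env
    mapping = {
        "AZURE_STORAGE_ACCOUNT_NAME": "azure.account_name",
        "AZURE_STORAGE_SAS_TOKEN": "azure.sas_token",
    }
    # Include only the settings that are actually present in the environment.
    return {option: env[var] for var, option in mapping.items() if var in env}
```

The result can then be passed as `options=azure_options_from_env()` in the call shown above.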
   
   ## Testing
   
   - Ran `pre-commit run --all-files`
   - Ran `cargo clippy --workspace --all-targets --all-features -- -Dwarnings`
   - Ran `cargo test -p sedona -p sedona-geoparquet --all-features` (86 tests pass)

