paleolimbot opened a new pull request, #251:
URL: https://github.com/apache/sedona-db/pull/251
Work in progress!
This is intended to be a format that wraps GDAL/OGR, although the boilerplate
is applicable to reading various other formats and/or implementing them in a
higher-level language like Python.
This is basically a watered-down version of the DataFusion `FileFormat` trait
that's a bit easier to implement (at the expense of not supporting some features
of the file format). The basic idea is:
```rust
#[async_trait]
pub trait RecordBatchReaderFormatSpec: Debug + Send + Sync {
    fn extension(&self) -> &str;
    fn with_options(
        &self,
        options: &HashMap<String, String>,
    ) -> Result<Arc<dyn RecordBatchReaderFormatSpec>>;
    async fn infer_schema(&self, location: &Object) -> Result<Schema>;
    async fn infer_stats(&self, _location: &Object, table_schema: &Schema)
        -> Result<Statistics>;
    async fn open_reader(&self, args: &OpenReaderArgs)
        -> Result<Box<dyn RecordBatchReader + Send>>;
}
```
(may change when we implement an actual format)
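
To make the shape of the trait a bit more concrete, here is a deliberately simplified sketch of what an implementation might look like. It is not code from this PR: it is synchronous, uses only the standard library, and leaves out `infer_stats`. `SimpleFormatSpec`, `DelimitedFormat`, `Schema`, `Batch`, `SpecResult`, and `read_all` are all hypothetical stand-ins for the real Arrow/DataFusion types, and the input is passed as a string rather than as an object store location.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::sync::Arc;

// Hypothetical stand-ins for the DataFusion/Arrow types used by the real trait.
pub type SpecResult<T> = std::result::Result<T, String>;
pub type Batch = Vec<Vec<String>>; // one batch = a few rows of string cells

#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub fields: Vec<String>,
}

/// Simplified, synchronous analogue of `RecordBatchReaderFormatSpec`.
pub trait SimpleFormatSpec: Debug + Send + Sync {
    fn extension(&self) -> &str;
    fn with_options(
        &self,
        options: &HashMap<String, String>,
    ) -> SpecResult<Arc<dyn SimpleFormatSpec>>;
    fn infer_schema(&self, content: &str) -> SpecResult<Schema>;
    fn open_reader(
        &self,
        content: &str,
    ) -> SpecResult<Box<dyn Iterator<Item = SpecResult<Batch>> + Send>>;
}

/// Toy format: a header line followed by delimited rows.
#[derive(Debug, Clone)]
pub struct DelimitedFormat {
    delimiter: char,
    batch_size: usize,
}

impl Default for DelimitedFormat {
    fn default() -> Self {
        Self { delimiter: ',', batch_size: 1024 }
    }
}

impl SimpleFormatSpec for DelimitedFormat {
    fn extension(&self) -> &str {
        "txt"
    }

    // Returns a new spec with the options applied; `self` is left unchanged.
    fn with_options(
        &self,
        options: &HashMap<String, String>,
    ) -> SpecResult<Arc<dyn SimpleFormatSpec>> {
        let mut copy = self.clone();
        for (key, value) in options {
            match key.as_str() {
                "delimiter" => {
                    copy.delimiter = value
                        .chars()
                        .next()
                        .ok_or_else(|| "delimiter must not be empty".to_string())?;
                }
                "batch_size" => {
                    let n = value
                        .parse::<usize>()
                        .map_err(|e| format!("invalid batch_size: {e}"))?;
                    if n == 0 {
                        return Err("batch_size must be positive".to_string());
                    }
                    copy.batch_size = n;
                }
                other => return Err(format!("unknown option: {other}")),
            }
        }
        Ok(Arc::new(copy))
    }

    // The schema is just the field names from the header line.
    fn infer_schema(&self, content: &str) -> SpecResult<Schema> {
        let header = content
            .lines()
            .next()
            .ok_or_else(|| "empty input".to_string())?;
        Ok(Schema {
            fields: header.split(self.delimiter).map(str::to_string).collect(),
        })
    }

    // Parses every row after the header and yields them in chunks of `batch_size`.
    fn open_reader(
        &self,
        content: &str,
    ) -> SpecResult<Box<dyn Iterator<Item = SpecResult<Batch>> + Send>> {
        let rows: Vec<Vec<String>> = content
            .lines()
            .skip(1)
            .map(|line| line.split(self.delimiter).map(str::to_string).collect())
            .collect();
        let batches: Vec<SpecResult<Batch>> = rows
            .chunks(self.batch_size)
            .map(|chunk| Ok(chunk.to_vec()))
            .collect();
        Ok(Box::new(batches.into_iter()))
    }
}

/// Roughly what a caller would do: infer the schema, then drain the reader.
pub fn read_all(spec: &dyn SimpleFormatSpec, content: &str) -> SpecResult<(Schema, usize)> {
    let schema = spec.infer_schema(content)?;
    let mut n_rows = 0;
    for batch in spec.open_reader(content)? {
        n_rows += batch?.len();
    }
    Ok((schema, n_rows))
}
```

The point of the sketch is the division of labour: the implementer only says how to configure the format, how to get a schema, and how to produce a stream of batches, and the generic wrapper would presumably take care of everything else.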