ch "application" (pandas, DataFusion, etc.) have their own implementation?
>
> Best regards,
> Elliot Morrison-Reed
>
> -Original Message-
> From: Andrew Lamb
> Sent: Saturday, January 6, 2024 8:22 AM
> To: dev@arrow.apache.org
> Subject: Re: [DISCUSS] Lin
To: dev@arrow.apache.org
Subject: Re: [DISCUSS] Linear Formula Types
If the DB layer above Arrow supports it, I would define a (non-stored)
calculated column. Given celsius_percent between 0 and 1, I would
define fahrenheit as (32 + celsius_percent * 180). A good query
optimizer would convert the condition 'where fahrenheit > 122' into
'where celsius_percent > 0.5'.
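As a rough sketch of that rewrite: for the threshold 122 to map to 0.5, the
linear map needs a slope of 180 (1.8 per degree Celsius, times 100 because
celsius_percent is assumed to be Celsius / 100). The Linear class and its
method names below are made up for illustration; they are not part of Arrow
or of any particular query engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Linear:
    """physical = offset + factor * raw (hypothetical helper, not an Arrow API)."""

    factor: float
    offset: float

    def apply(self, raw: float) -> float:
        # Compute the calculated (non-stored) value from the stored one.
        return self.offset + self.factor * raw

    def invert_greater_than(self, threshold: float) -> tuple[str, float]:
        # Rewrite `physical > threshold` as a comparison on the raw value.
        if self.factor == 0:
            raise ValueError("a zero factor cannot be inverted")
        bound = (threshold - self.offset) / self.factor
        # A negative factor flips the direction of the inequality.
        return (">" if self.factor > 0 else "<", bound)


# Assumption: celsius_percent is Celsius / 100, so the slope is 1.8 * 100.
fahrenheit = Linear(factor=180.0, offset=32.0)
op, bound = fahrenheit.invert_greater_than(122.0)
print(f"where celsius_percent {op} {bound}")
```

The point is only that the filter can be evaluated on the stored values
without materializing the converted column; whether a given optimizer does
this depends on the engine.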
Hi Elliot,
Given your description, I agree extension types sound like a good idea,
similar to geoarrow[1] for geospatial data, where extra metadata[2] is
needed to interpret the underlying types (e.g. factor and offset).
Andrew
[1] https://github.com/geoarrow/geoarrow
[2] https://arr
Background
I have been looking into using Parquet files for storing and working with
automotive data. One interesting thing about automotive data is that most
communication happens on the CAN bus where we have extremely limited bandwidth.
In order to encode "physical" values in a very space effici