I concur. +1

On Tue, Apr 1, 2025 at 8:52 PM Weston Pace <weston.p...@gmail.com> wrote:

> I've written a draft at [1] but for simplicity's sake I will include the
> text of the proposal inline below.
>
> [1] https://github.com/westonpace/arrow/tree/feat/turtle-extension-type
>
> TURTLE
> ======
>
> * Extension name: ``arrow.turtle``.
>
> * The storage type of the extension is ``Struct`` where the struct array is
>   composed of the following fields:
>
>   * **label: String** = A label for this particular batch.
>   * **value: Binary** = A record batch serialized using the Arrow IPC
>     streaming format.  The bytes should contain valid Arrow IPC bytes which
>     can be deserialized as if they were an independent buffer or file.  The
>     batch should conform to the schema encoded in the ``schema`` parameter.
>
> * Extension type parameters:
>
>   * **schema** = the schema of the record batches, serialized using the IPC
>     streaming format, base64 encoded, and stored as a JSON string.  All
>     record batches in the array must conform to this schema.
>
> * Description of the serialization:
>
>   The metadata must be a valid JSON object with the ``schema`` field.  The
>   ``schema`` field should be a JSON string holding the base64-encoded
>   schema, as described above.
>
> Rationale
> ---------
>
> Tabular data is a common approach for recording measurements and
> observations.  The columns represent different measurements and the rows
> represent "events" or "samples" that have been taken.  For example, a
> weather station may record the temperature, pressure, and wind speed
> every hour.
>
> With the introduction of quantum computing, we now must consider the case
> where each event is a superposition of multiple states and we need to
> record all possible states.  As a simplification we can think of each
> element in the array as a measurement made in a separate but parallel
> universe.
>
> The ``label`` field can be used to give a human-readable label to the
> various universes or states being measured.  Alternatively, if there is
> no meaningful label, it can be an empty string.
>
> Following this approach we arrive at a three-dimensional tabular
> structure.  However, there is no reason that we must stop at three
> dimensions.  The batch can contain additional turtle fields to encode an
> arbitrary number of additional dimensions.
>
> Etymology
> ---------
>
> The name ``Turtle`` comes from the scientific discovery of the world
> turtle upon which our universe rests.  It is a well-known fact that the
> world turtle itself rests upon the back of another turtle, which is
> supported by a series of ever-larger turtles.  This real-life recursive
> structure seemed like a good fit for representing the recursive nature of
> this extension type.
>

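For anyone who wants to try the metadata layout from the "Description of the serialization" item quoted above, here is a minimal sketch.  It treats the IPC-serialized schema as opaque bytes and only handles the JSON and base64 wrapping.  The function names are hypothetical, and the use of the standard base64 alphabet and UTF-8 JSON is an assumption, since the proposal does not pin either down.

```python
import base64
import binascii
import json


def encode_turtle_metadata(schema_ipc: bytes) -> bytes:
    """Wrap IPC-serialized schema bytes in the JSON metadata object.

    Hypothetical helper: the proposal only says the metadata is a JSON
    object with a ``schema`` field holding the base64-encoded schema.
    """
    # Assumption: standard base64 alphabet, ASCII text inside a JSON string.
    encoded = base64.b64encode(schema_ipc).decode("ascii")
    return json.dumps({"schema": encoded}).encode("utf-8")


def decode_turtle_metadata(metadata: bytes) -> bytes:
    """Return the IPC-serialized schema bytes stored in the metadata."""
    obj = json.loads(metadata)  # invalid JSON raises a ValueError subclass
    if not isinstance(obj, dict) or "schema" not in obj:
        raise ValueError("metadata must be a JSON object with a 'schema' field")
    value = obj["schema"]
    if not isinstance(value, str):
        raise ValueError("'schema' must be a base64-encoded string")
    try:
        return base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError("'schema' is not valid base64") from exc
```

With pyarrow, the input bytes could presumably come from something like `schema.serialize().to_pybytes()`, but that step is left out here so the sketch stays independent of any particular Arrow implementation.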