Hello,

pg2arrow [*1] has a '--dump' mode that prints the schema definition of a
given Apache Arrow file.
Would that work for your use case?

$ ./pg2arrow --dump ~/hoge.arrow
[Footer]
{Footer: version=V4, schema={Schema: endianness=little,
fields=[{Field: name="id", nullable=true, type={Int32}, children=[],
custom_metadata=[]}, {Field: name="a", nullable=true, type={Float64},
children=[], custom_metadata=[]}, {Field: name="b", nullable=true,
type={Decimal: precision=11, scale=7}, children=[],
custom_metadata=[]}, {Field: name="c", nullable=true, type={Struct},
children=[{Field: name="x", nullable=true, type={Int32}, children=[],
custom_metadata=[]}, {Field: name="y", nullable=true, type={Float32},
children=[], custom_metadata=[]}, {Field: name="z", nullable=true,
type={Utf8}, children=[], custom_metadata=[]}], custom_metadata=[]},
{Field: name="d", nullable=true, type={Utf8},
dictionary={DictionaryEncoding: id=0, indexType={Int32},
isOrdered=false}, children=[], custom_metadata=[]}, {Field: name="e",
nullable=true, type={Timestamp: unit=us}, children=[],
custom_metadata=[]}, {Field: name="f", nullable=true, type={Utf8},
children=[], custom_metadata=[]}, {Field: name="random",
nullable=true, type={Float64}, children=[], custom_metadata=[]}],
custom_metadata=[{KeyValue: key="sql_command" value="SELECT *,random()
FROM t"}]}, dictionaries=[{Block: offset=920, metaDataLength=184
bodyLength=128}], recordBatches=[{Block: offset=1232,
metaDataLength=648 bodyLength=386112}]}
[Dictionary Batch 0]
{Block: offset=920, metaDataLength=184 bodyLength=128}
{Message: version=V4, body={DictionaryBatch: id=0, data={RecordBatch:
length=6, nodes=[{FieldNode: length=6, null_count=0}],
buffers=[{Buffer: offset=0, length=0}, {Buffer: offset=0, length=64},
{Buffer: offset=64, length=64}]}, isDelta=false}, bodyLength=128}
[Record Batch 0]
{Block: offset=1232, metaDataLength=648 bodyLength=386112}
{Message: version=V4, body={RecordBatch: length=3000,
nodes=[{FieldNode: length=3000, null_count=0}, {FieldNode:
length=3000, null_count=60}, {FieldNode: length=3000, null_count=62},
{FieldNode: length=3000, null_count=0}, {FieldNode: length=3000,
null_count=56}, {FieldNode: length=3000, null_count=66}, {FieldNode:
length=3000, null_count=0}, {FieldNode: length=3000, null_count=0},
{FieldNode: length=3000, null_count=64}, {FieldNode: length=3000,
null_count=0}, {FieldNode: length=3000, null_count=0}],
buffers=[{Buffer: offset=0, length=0}, {Buffer: offset=0,
length=12032}, {Buffer: offset=12032, length=384}, {Buffer:
offset=12416, length=24000}, {Buffer: offset=36416, length=384},
{Buffer: offset=36800, length=48000}, {Buffer: offset=84800,
length=0}, {Buffer: offset=84800, length=384}, {Buffer: offset=85184,
length=12032}, {Buffer: offset=97216, length=384}, {Buffer:
offset=97600, length=12032}, {Buffer: offset=109632, length=0},
{Buffer: offset=109632, length=12032}, {Buffer: offset=121664,
length=96000}, {Buffer: offset=217664, length=0}, {Buffer:
offset=217664, length=12032}, {Buffer: offset=229696, length=384},
{Buffer: offset=230080, length=24000}, {Buffer: offset=254080,
length=0}, {Buffer: offset=254080, length=12032}, {Buffer:
offset=266112, length=96000}, {Buffer: offset=362112, length=0},
{Buffer: offset=362112, length=24000}]}, bodyLength=386112}
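
If a compact listing is easier to read than the raw dump, the field entries can be pulled out with a small script. The following is only a sketch: it assumes the text layout shown above, which is simply what this output looks like and not a documented or stable format, and the names `FIELD_RE` and `list_fields` are made up for illustration. It returns a flat list, so the members of a Struct appear right after their parent.

```python
import re

# One field entry in the dump text shown above. \s+ tolerates the line
# wraps that appear in the mailed output.
FIELD_RE = re.compile(
    r'\{Field:\s+name="([^"]+)",\s+nullable=(true|false),\s+type=\{(\w+)'
)


def list_fields(dump_text):
    """Return (name, nullable, type_name) for every Field in the dump.

    Nested children (e.g. members of a Struct) are reported in the same
    flat list, in the order they are printed. Type parameters such as
    precision or unit are ignored in this sketch.
    """
    return [
        (name, nullable == "true", type_name)
        for name, nullable, type_name in FIELD_RE.findall(dump_text)
    ]


if __name__ == "__main__":
    # A short excerpt of the dump, including a wrapped line.
    sample = '''fields=[{Field: name="id", nullable=true, type={Int32}, children=[],
custom_metadata=[]}, {Field: name="b", nullable=true,
type={Decimal: precision=11, scale=7}, children=[],
custom_metadata=[]}]'''
    for name, nullable, type_name in list_fields(sample):
        print(name, type_name, "NULL" if nullable else "NOT NULL")
```

FieldNode entries in the record batch section are not matched, because the pattern requires the literal `{Field:` prefix.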

[*1] https://heterodb.github.io/pg-strom/arrow_fdw/#using-pg2arrow

On Sat, Dec 7, 2019 at 6:26, Christian Hudon <chr...@elementai.com> wrote:
>
> Hi,
>
> For the uses I would like to make of Arrow, I would need a human-readable
> and -writable version of an Arrow Schema, that could be converted to and
> from the Arrow Schema C++ object. Going through the doc for 0.15.1, I don't
> see anything to that effect, with the closest being the ToString() method
> on DataType instances, but which is meant for debugging only. (I need an
> expression of an Arrow Schema that people can read, and that can live
> outside of the code for a particular operation.)
>
> Is a text representation of an Arrow Schema something that is being worked
> on now? If not, would you folks be interested in me putting up an initial
> proposal for discussion? Any design constraints I should pay attention to,
> then?
>
> Thanks,
>
>   Christian
> --
>
>
> │ Christian Hudon
>
> │ Applied Research Scientist
>
>    Element AI, 6650 Saint-Urbain #500
>
>    Montréal, QC, H2S 3G9, Canada
>    Elementai.com



-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kai...@heterodb.com>
