You want something like this to make the "my_array" node (I didn't test
this, but it hopefully gives you the idea):

auto element = PrimitiveNode::Make("element", Repetition::OPTIONAL,
                                   Type::INT32);
auto list = GroupNode::Make("list", Repetition::REPEATED, {element});
auto my_array = GroupNode::Make("my_array", Repetition::REQUIRED,
                                {list}, LogicalType::LIST);
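
On the writing side, the leaf column "element" is written as flat arrays of
definition levels, repetition levels and non-null values. Below is a rough
sketch in plain C++ (no parquet-cpp calls, and the names Element, Record,
LeafColumn and ShredInt32Lists are made up for illustration) of how those
arrays could be computed for the schema above. It assumes the usual Parquet
level rules: max definition level 2 (repeated "list" plus optional "element")
and max repetition level 1. As far as I know, arrays like these are what the
low-level column writer's WriteBatch(num_values, def_levels, rep_levels,
values) takes, but please check that against the headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One array slot; is_null marks a missing optional element (hypothetical type).
struct Element {
  bool is_null;
  int32_t value;
};

// One record's "my_array" contents (hypothetical alias).
using Record = std::vector<Element>;

// Flat arrays for the leaf column: one def/rep level per slot,
// values only for the non-null slots.
struct LeafColumn {
  std::vector<int16_t> def_levels;
  std::vector<int16_t> rep_levels;
  std::vector<int32_t> values;
};

LeafColumn ShredInt32Lists(const std::vector<Record>& records) {
  LeafColumn out;
  for (const Record& record : records) {
    if (record.empty()) {
      // Empty list: one slot, nothing defined below "my_array", new record.
      out.def_levels.push_back(0);
      out.rep_levels.push_back(0);
      continue;
    }
    for (std::size_t i = 0; i < record.size(); ++i) {
      // Rep level 0 starts a new record, 1 continues the current list.
      out.rep_levels.push_back(i == 0 ? 0 : 1);
      if (record[i].is_null) {
        // "list" entry exists but "element" is null.
        out.def_levels.push_back(1);
      } else {
        // Fully defined value.
        out.def_levels.push_back(2);
        out.values.push_back(record[i].value);
      }
    }
  }
  return out;
}
```

For example, the three records [1, null, 3], [] and [7] would give def levels
2 1 2 0 2, rep levels 0 1 1 0 0 and values 1 3 7.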

- Wes

On Fri, Dec 8, 2017 at 9:34 AM, Renato Marroquín Mogrovejo
<renatoj.marroq...@gmail.com> wrote:
> Hi devs,
>
> I am trying to create a parquet file that contains an array of int32 for
> each record.
> The schema I am trying to implement is as follows:
>
> required arr_schema {
>    required int32 id;
>    required group my_array (LIST) {
>       repeated group list {
>          optional int32 element;
>       }
>    }
> }
>
> I guess I have to create GroupNodes and assign the inner elements to
> them, something like the code snippet below. But how do I then write the
> data?
>
> fields.push_back(PrimitiveNode::Make("int32_field", Repetition::REPEATED,
> Type::INT32, LogicalType::NONE));
> auto list_field = GroupNode::Make("some_array", Repetition::REQUIRED,
> fields);
>
> I also saw the logical type LIST defined in
> https://github.com/apache/parquet-cpp/blob/master/src/parquet/types.cc#L163,
> but I don't know how to use it.
> What I want at the end is to read such generated files from Amazon
> Athena/Presto.
> Any pointers or help are highly appreciated.
> Thanks!
