Hi Weston,

This is slightly off-topic, but does what you mentioned about large metadata
blocks (quoted below) also apply to the IPC format?

I am working with matrices, which I represent as tables that can have
hundreds of thousands of columns, and I split them into row groups so that I
can apply predicate pushdown.

> Finally, one other issue that comes into play, is the width of your
> data.  Really wide datasets (e.g. tens of thousands of columns) suffer
> from having rather large metadata blocks.  If your row groups start to
> get small then you end up spending a lot of time parsing metadata and
> much less time actually reading data.



Thanks!

--

Aldrin Montana
Computer Science PhD Student
UC Santa Cruz
