That’s exactly what we need.
> On Sep 30, 2021, at 9:58 AM, Jacques Nadeau wrote:
>
> I actually wonder if file formats should be an extension API so someone can
> implement a file format without any changes in Iceberg core (I don't
> think this is possible today). Let's say one wanted to create a pr
Wing, sorry, my earlier message probably misled you. I was expressing my
personal opinion on Flink version support.
On Tue, Sep 28, 2021 at 8:03 PM Wing Yew Poon
wrote:
> Hi OpenInx,
> I'm sorry I misunderstood the thinking of the Flink community. Thanks for
> the clarification.
> - Wing Yew
>
>
>
I actually wonder if file formats should be an extension API so someone can
implement a file format without any changes in Iceberg core (I don't
think this is possible today). Let's say one wanted to create a proprietary
format but use Iceberg semantics (not me). Could we make it such that o
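As the message notes, Iceberg core has no such pluggable file-format API today. Purely as an illustration of the idea being discussed, a registration-based extension point might look like the sketch below; every name here (`FileFormatExtension`, `FileFormatRegistry`) is invented for this example and is not an Iceberg API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: Iceberg has no pluggable file-format SPI today.
// A plugin describes a format by name and knows how to decode its bytes
// (rows are simplified to strings here to keep the sketch self-contained).
interface FileFormatExtension {
    String name();                        // e.g. "rcfile", "proprietary"
    Iterable<String> read(byte[] bytes);  // decode file contents into rows
}

// A registry the engine could consult when planning a scan, so a new
// format could be added without touching the core table logic.
final class FileFormatRegistry {
    private static final Map<String, FileFormatExtension> FORMATS = new HashMap<>();

    static void register(FileFormatExtension format) {
        FORMATS.put(format.name(), format);
    }

    static FileFormatExtension lookup(String name) {
        FileFormatExtension format = FORMATS.get(name);
        if (format == null) {
            throw new IllegalArgumentException("Unknown file format: " + name);
        }
        return format;
    }
}
```

The design choice sketched here is the usual one for this kind of extension point: core code depends only on the interface and the registry, and a third-party format ships as a separate jar that registers itself at startup.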
Hi Ryan and Russell,
Thanks very much for your response.
Well, I want the ACID and row-level update capability that Iceberg provides. I
believe a data lake is a better way to manage our dataset than Hive.
I also want our transition from Hive to a data lake to be as smooth as
possible, which means:
1
Youjun, what are you trying to do?
If you have existing tables in an incompatible format, you may just want to
leave them as they are for historical data. It depends on why you want to
use Iceberg. If you want to be able to query larger ranges of that data
because you've clustered across files by
Within Iceberg it would take a bit of effort; we would need custom readers
at a minimum if we just wanted read-only support. I think the
main complexity would be designing the specific readers for the platform
you want to use, like Spark or Flink, the actual metadata handling and
suc
Thanks for the suggestion. We need to evaluate the cost of converting the
format, as those Hive tables have been there for many years, so PBs of data
would need to be reformatted.
Also, do you think it is possible to develop support for a new format? How
costly would it be?
Sent from my iPhone
> On Sep 29, 2021, at 9:34 PM, Russe
There is no plan I am aware of to use RCFile directly in Iceberg. While we
could work to support other file formats, I don't think RCFile is very widely
used compared to ORC and Parquet (Iceberg has native support for these
formats). My suggestion for conversion would be to do a CTAS statement in Sp
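A minimal sketch of that CTAS migration in Spark SQL, assuming the Iceberg runtime and an Iceberg catalog are already configured; the catalog, database, and table names below are placeholders, not from the thread:

```sql
-- Rewrite the existing Hive RCFile table into a new Iceberg table.
-- Catalog, database, and table names are hypothetical examples.
CREATE TABLE iceberg_catalog.db.events
USING iceberg
AS SELECT * FROM hive_db.events_rcfile;
```

Note that CTAS copies and rewrites all of the data (into Iceberg's default Parquet layout), so for tables holding many years of history the PB-scale rewrite cost raised in this thread still applies.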
Hi community,
I am exploring ways to evolve existing Hive tables (RCFile) into a data lake.
However, I found that Iceberg (and Hudi, Delta Lake) does not support RCFile.
So my questions are:
1. Is there any plan (or is it possible) to support RCFile in the future? So we
can manage those exist