Re: How to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs in Flink

2015-01-17 Thread Fabian Hueske
Why don't you just create two data sources that each wrap the ParquetFormat using a HadoopInputFormat, and join them as done, for example, in the TPCH Q3 example [1]? I always found MultipleInputs to be an ugly workaround for Hadoop's inability to read from multiple sources in a single job. AFAIK, Hado
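The approach Fabian describes can be sketched roughly as follows. This is a minimal, illustrative sketch, not code from the thread: it assumes Flink's DataSet API with the flink-hadoop-compatibility and parquet-avro dependencies on the classpath, and the paths, field names, and the use of `AvroParquetInputFormat` are hypothetical placeholders (the exact package of `HadoopInputFormat` has also varied across Flink versions).

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import parquet.avro.AvroParquetInputFormat;

public class TwoParquetSourcesJoin {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Source 1: wrap a Parquet input format for the first table.
        // Path and record schema are placeholders.
        Job jobA = Job.getInstance();
        FileInputFormat.addInputPath(jobA, new Path("hdfs:///tables/orders"));
        DataSet<Tuple2<Void, GenericRecord>> orders = env.createInput(
            new HadoopInputFormat<Void, GenericRecord>(
                new AvroParquetInputFormat<GenericRecord>(),
                Void.class, GenericRecord.class, jobA));

        // Source 2: a second, independent HadoopInputFormat for the other table.
        Job jobB = Job.getInstance();
        FileInputFormat.addInputPath(jobB, new Path("hdfs:///tables/customers"));
        DataSet<Tuple2<Void, GenericRecord>> customers = env.createInput(
            new HadoopInputFormat<Void, GenericRecord>(
                new AvroParquetInputFormat<GenericRecord>(),
                Void.class, GenericRecord.class, jobB));

        // Join the two data sets on a shared key field ("custId" is made up here),
        // instead of merging the inputs at the Hadoop layer with MultipleInputs.
        DataSet<Tuple2<Tuple2<Void, GenericRecord>, Tuple2<Void, GenericRecord>>> joined =
            orders.join(customers)
                  .where(new KeySelector<Tuple2<Void, GenericRecord>, Long>() {
                      @Override
                      public Long getKey(Tuple2<Void, GenericRecord> t) {
                          return (Long) t.f1.get("custId");
                      }
                  })
                  .equalTo(new KeySelector<Tuple2<Void, GenericRecord>, Long>() {
                      @Override
                      public Long getKey(Tuple2<Void, GenericRecord> t) {
                          return (Long) t.f1.get("id");
                      }
                  });

        joined.print();
    }
}
```

The point of the sketch: each `env.createInput(...)` call is its own data source with its own Hadoop `Job` configuration, so Flink's `join` operator replaces what `MultipleInputs` would do in a single MapReduce job.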

How to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs in Flink

2015-01-17 Thread Felix Neutatz
Hi, is there any example that shows how I can load several files with different Hadoop input formats at once? My use case is that I want to load two tables (in Parquet format) via Hadoop and join them within Flink. Best regards, Felix