Why don't you just create two data sources, each wrapping the Parquet input format in a HadoopInputFormat, and join them, as is done for example in the TPC-H Q3 example [1]?
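
To illustrate the data flow I have in mind, here is a minimal sketch in plain Java. It deliberately uses in-memory lists instead of the Flink or Hadoop API, and the names TwoSourceJoin, Row, and join are made up for this example. The point is only that each source is read on its own and the two are combined by a key-based join, so no combined input format is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for "two independent data sources + join".
// This is NOT Flink code; it only models the shape of the program.
class TwoSourceJoin {

    // One record of a source: a join key plus a payload.
    static final class Row {
        final int key;
        final String value;

        Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Hash join: build an index over the left source, then probe it
    // with every row of the right source.
    static List<String> join(List<Row> left, List<Row> right) {
        Map<Integer, List<String>> index = new HashMap<>();
        for (Row l : left) {
            index.computeIfAbsent(l.key, k -> new ArrayList<>()).add(l.value);
        }
        List<String> joined = new ArrayList<>();
        for (Row r : right) {
            List<String> matches =
                    index.getOrDefault(r.key, Collections.<String>emptyList());
            for (String leftValue : matches) {
                joined.add(r.key + ":" + leftValue + "," + r.value);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Each list plays the role of one separately created data source.
        List<Row> customers = Arrays.asList(
                new Row(1, "alice"), new Row(2, "bob"));
        List<Row> orders = Arrays.asList(
                new Row(1, "order-10"), new Row(1, "order-11"), new Row(3, "order-12"));

        for (String line : join(customers, orders)) {
            System.out.println(line);
        }
    }
}
```

In a real Flink job the two lists would be two data sets, each created from its own wrapped Hadoop input format, and the join would be left to Flink's join operator rather than written by hand.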

I have always found MultipleInputFormat to be an ugly workaround for Hadoop's inability to read data from multiple sources. AFAIK, Hadoop's MultipleInputFormat does not provide any data colocation that a join could exploit. Or does it have some other beneficial property that I am not aware of?

[1] https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/TPCHQuery3.java

2015-01-17 20:15 GMT+01:00 Felix Neutatz <neut...@googlemail.com>:

> Hi,
>
> is there any example which shows how I can load several files with
> different Hadoop input formats at once? My use case is that I want to load
> two tables (in Parquet format) via Hadoop and join them within Flink.
>
> Best regards,
>
> Felix