Hi vino,

thanks. 

I ran a join on two DataSets and wrote the result to disk; the results
were correct.
However, I was not able to identify the point at which the hash table is built.
(Is HashPartition.java not used in this case?)

Do you have an idea why I cannot find it?


Here is part of my code:

DataSet<RowCustomers> customers = env.fromElements(new RowCustomers(1, "Mayer"));
tEnv.registerDataSet("Customers", customers, "customerID, customerName");
DataSet<RowOrders> orders = env.fromElements(
    new RowOrders(1, 1, new String[]{"Pen", "Paper"}, "22.08.2018"));
tEnv.registerDataSet("Orders", orders, "orderID, customerID, items, date");
DataSet<Tuple2<RowCustomers, RowOrders>> result = customers
    .join(orders, JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST)
    .where("customerID").equalTo("customerID");
result.writeAsText("/");
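
To make clear which step I am looking for: below is a minimal sketch of the
two phases vino describes (build side, probe side), written as plain Java
without any Flink classes. It only illustrates the general technique and is
not Flink's actual implementation; the class HashJoinSketch, the join method
and the sample rows are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a simple in-memory hash join (not Flink code).
public class HashJoinSketch {

    // Joins (customerID, customerName) rows with (customerID, items) rows on customerID.
    static List<String> join(List<Map.Entry<Integer, String>> customers,
                             List<Map.Entry<Integer, String>> orders) {
        // Build phase: insert every row of the first (build) input into a
        // hash table keyed by the join key.
        Map<Integer, List<String>> hashTable = new HashMap<>();
        for (Map.Entry<Integer, String> customer : customers) {
            hashTable.computeIfAbsent(customer.getKey(), k -> new ArrayList<>())
                     .add(customer.getValue());
        }

        // Probe phase: look up each row of the second (probe) input in the
        // hash table and emit one result per matching build-side row.
        List<String> result = new ArrayList<>();
        for (Map.Entry<Integer, String> order : orders) {
            List<String> matches = hashTable.get(order.getKey());
            if (matches == null) {
                continue; // no customer with this customerID
            }
            for (String customerName : matches) {
                result.add(customerName + " | " + order.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> customers = new ArrayList<>();
        customers.add(new SimpleEntry<>(1, "Mayer"));

        List<Map.Entry<Integer, String>> orders = new ArrayList<>();
        orders.add(new SimpleEntry<>(1, "Pen, Paper"));

        System.out.println(join(customers, orders));
    }
}
```

As far as I understand the REPARTITION_HASH_FIRST hint, the first input
(customers) would play the role of the build side here and the second input
(orders) the probe side. The build step is the one I cannot locate.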

Thanks a lot.
Benjamin


> On 11.09.2018 at 04:24, vino yang <yanghua1...@gmail.com> wrote:
> 
> Hi Benjamin,
> 
> The package you mentioned is the approximate location; the more precise
> location is here [1].
> 
> Specifically, Hash Join is divided into two steps:
> 
> 1) build side
> 2) probe side
> 
> Thanks, vino.
> 
> [1]: https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash/HashPartition.java
>
> On Mon, Sep 10, 2018 at 10:09 PM, Benjamin Burkhardt
> <benjamin.burkha...@cs.tu-darmstadt.de> wrote:
> Hi,
> 
> can anyone tell me where the default hybrid hash join function for
> partitioning (shuffle phase) is implemented?
> Even after digging deeper, I was not able to figure out where it is located.
> 
> Might it be somewhere here?
> https://github.com/apache/flink/tree/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash
> 
> Thanks in advance.
> 
> Benjamin
