Re: Reducing runtime of Flink planner

2019-01-10 Thread Niklas Teichmann
Hi Fabian and Timo, Thanks for your answers! At the moment we're working on updating our project to Flink 1.7, so that we can check if the commit you wrote about solves the problem. The debugging we did so far seems to point to Calcite as being responsible for the long planning times - we'r

Re: Reducing runtime of Flink planner

2019-01-10 Thread Fabian Hueske
Hi Niklas, The planning time of a job does not depend on the data size. It would be the same whether you process 5MB or 5PB. FLINK-10566 (as pointed to by Timo) fixed a problem for plans with many branching and joining nodes. Looking at your plan, there are some, but (IMO) not enough to be problem

Re: Reducing runtime of Flink planner

2019-01-07 Thread Timo Walther
Hi Niklas, it would be interesting to know which planner caused the long runtime. Could you use a debugger to figure out more details? Is it really the Flink Table API planner or the underlying DataSet planner one level deeper? There was an issue that was recently closed [1] about the DataSet opt

Reducing runtime of Flink planner

2019-01-07 Thread Niklas Teichmann
Hi everybody, I have a question concerning the planner for the Flink Table / Batch API. At the moment I am trying to use a library called Cypher for Apache Flink, a project that tries to implement the graph database query language Cypher on Apache Flink (CAPF, https://github.com/soerenreichardt/cyp