Re: Different CoGroup behavior inside DeltaIteration

2015-11-16 Thread Duc Kien Truong
Hi, thanks for the suggestion. I'm trying to use the delta iteration so that I get the empty-workset convergence criterion for free. But since an outer join between the workset and the solution set is not possible using CoGroup, I will try to adapt my algorithm to use the bulk itera

Re: Different CoGroup behavior inside DeltaIteration

2015-11-16 Thread Stephan Ewen
It is actually very important that the CoGroup in delta iterations works like that. If the CoGroup touched every element in the solution set, the "decreasing work" effect would not happen. The delta iterations are designed for cases where specific updates to the solution are made, driven by the w
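
To make the "decreasing work" effect concrete, here is a minimal plain-Java sketch that does not use the Flink API. It simulates a delta-style iteration (minimum-label propagation over a small graph) in which each superstep only looks up the solution entries addressed by the current workset, and the loop ends when the workset is empty. The class name, method name, and the choice of algorithm are illustrative assumptions, not taken from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DeltaIterationSketch {

    // Hypothetical helper: propagates the minimum label along edges and
    // returns the workset size observed at the start of each superstep.
    // Assumes every vertex that appears as a neighbor has a solution entry.
    static List<Integer> run(Map<Integer, List<Integer>> neighbors,
                             Map<Integer, Integer> solution) {
        List<Integer> worksetSizes = new ArrayList<>();
        // Initial workset: every vertex with its current label.
        Map<Integer, Integer> workset = new TreeMap<>(solution);

        while (!workset.isEmpty()) {            // empty workset = convergence
            worksetSizes.add(workset.size());

            // Each workset vertex offers its label to its neighbors.
            Map<Integer, Integer> candidates = new TreeMap<>();
            for (Map.Entry<Integer, Integer> e : workset.entrySet()) {
                for (Integer n : neighbors.getOrDefault(e.getKey(), List.of())) {
                    candidates.merge(n, e.getValue(), Math::min);
                }
            }

            // Only solution entries that received a candidate are looked up;
            // only entries that actually improve feed the next workset.
            Map<Integer, Integer> next = new TreeMap<>();
            for (Map.Entry<Integer, Integer> c : candidates.entrySet()) {
                if (c.getValue() < solution.get(c.getKey())) {
                    solution.put(c.getKey(), c.getValue());
                    next.put(c.getKey(), c.getValue());
                }
            }
            workset = next;
        }
        return worksetSizes;
    }
}
```

On a chain 1-2-3-4 where every vertex starts with its own id as label, the workset shrinks by one element per superstep. If every solution entry were visited in every superstep, the per-step cost would stay constant instead.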

Re: Different CoGroup behavior inside DeltaIteration

2015-11-16 Thread Fabian Hueske
Hi, this is an artifact of how the solution set is internally implemented. Usually, a CoGroup is executed using a sort-merge strategy, i.e., both inputs are sorted, merged, and handed to the CoGroup function in a streaming fashion. Both inputs are treated equally, and if one of the two inputs does not
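
The difference between the two execution strategies can be sketched in plain Java without any Flink classes. The first method walks the sorted union of keys from both inputs, as a sort-merge CoGroup would. The second iterates only over the other input and looks each key up in an indexed solution set, so keys that exist only in the solution set never reach the function. All names here are hypothetical, and treating the solution set as a keyed index that is probed is an assumption of this sketch, as is the choice to still invoke the function for a probe key with no solution-set entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class CoGroupSketch {

    // Sort-merge style: every key from either input yields one call;
    // a side with no group for that key contributes an empty list.
    static List<String> sortMergeCoGroup(Map<Integer, List<String>> left,
                                         Map<Integer, List<String>> right) {
        List<String> calls = new ArrayList<>();
        TreeSet<Integer> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());
        for (Integer key : keys) {
            calls.add(key + ":" + left.getOrDefault(key, List.of())
                    + "|" + right.getOrDefault(key, List.of()));
        }
        return calls;
    }

    // Probe style: only keys of the probe side drive the calls; the
    // solution set is consulted by lookup and never scanned in full.
    static List<String> probeCoGroup(Map<Integer, List<String>> probeSide,
                                     Map<Integer, List<String>> solutionSet) {
        List<String> calls = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : new TreeMap<>(probeSide).entrySet()) {
            calls.add(e.getKey() + ":" + e.getValue()
                    + "|" + solutionSet.getOrDefault(e.getKey(), List.of()));
        }
        return calls;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> other = Map.of(1, List.of("a"), 3, List.of("c"));
        Map<Integer, List<String>> solution = Map.of(1, List.of("x"), 2, List.of("y"));
        System.out.println(sortMergeCoGroup(other, solution)); // includes key 2
        System.out.println(probeCoGroup(other, solution));     // key 2 is skipped
    }
}
```

Key 2 exists only in the solution set, so it appears in the sort-merge result but not in the probe result, which matches the inner-join-like behavior reported at the start of the thread.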

Different CoGroup behavior inside DeltaIteration

2015-11-15 Thread Truong Duc Kien
Hi, when running a CoGroup between the solution set and a different dataset inside a DeltaIteration, the CoGroupFunction is only called for items that exist in the other dataset, similar to an inner join. This is not the documented behavior for CoGroup: If a DataSet has a group with no matching