+Ajit Kharpude <ajit.kharp...@clouzersolutions.com>

On Fri, Feb 23, 2024 at 1:14 PM Karsten Stöckmann <karsten.stoeckm...@gmail.com> wrote:
> Hi,
>
> I am observing somewhat unexpected (from my point of view) behaviour
> during re-key / re-partitioning operations in order to prepare a
> KTable-KTable join.
>
> Assume two (simplified) source data structures from two respective topics:
>
> class User {
>   Long id; // PK
>   String name;
> }
>
> class Attribute {
>   Long id; // PK
>   Integer number;
>   Long user_id; // FK
> }
>
> Now in order to build an aggregate user containing all of its
> attributes (0-n), the 'attributes' topic needs to be re-keyed to its
> FK ('native' FK join is not possible as there's no right join
> operation) using a collection object.
>
> class GroupedAttributes {
>   List<Integer> numbers = new ArrayList<>();
>   public GroupedAttributes add(Integer v) {
>     numbers.add(v);
>     return this;
>   }
>   public GroupedAttributes remove(Integer v) {
>     numbers.remove(v);
>     return this;
>   }
> }
>
> Re-Key operation:
>
> KTable<Long, GroupedAttributes> groupedAttributes = attributes // this is a KTable<Long, Attribute>
>   .groupBy(
>     (k, v) -> KeyValue.pair(v.userId(), v.number()),
>     Grouped.with(
>       "attributes-grouped",
>       Serdes.Long(),
>       Serdes.Integer()))
>   .aggregate(
>     GroupedAttributes::new,
>     (k, v, a) -> a.add(v),
>     (k, v, a) -> a.remove(v),
>     Named.as("attributes-grouped-aggregated"),
>     Materialized.with(Serdes.Long(), groupedAttributesSerde));
>
> This internally creates a state store and associated topic
> 'attributes-grouped-aggregated-changelog' containing the aggregated
> 'number' attributes re-keyed to their FK (user_id).
>
> Now for a User associated with exactly one Attribute, I'd expected the
> topic to contain exactly one record with the user's key and a
> GroupedAttributes object with one item. But: in fact that topic
> contains thousands of records for that particular user with an ever
> growing list of always the same attribute 'number', which is
> eventually reduced to the (expected) final object with one attribute
> 'number'.
>
> E.g.:
>
> offset: 1, key: 100, { "numbers": [1] }
> offset: 3, key: 100, { "numbers": [1, 1] }
> offset: 6, key: 100, { "numbers": [1, 1, 1] }
> offset: 9, key: 100, { "numbers": [1, 1, 1, 1] }
> ...
> offset: 262211, key: 100, { "numbers": [1, 1] }
> offset: 262213, key: 100, { "numbers": [1] }
>
> Can anyone please shed some light on the internal workings and explain
> if this is expected behaviour?
>
> Best wishes,
> Karsten
>

--
Thanks & Regards
*VIKRAM S SINGH*
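
For reference while reading the quoted aggregate, here is a minimal plain-Java sketch (no Kafka dependency) of how an adder/subtractor pair is documented to behave on a KGroupedTable: when an upstream row is updated, the subtractor receives the old value and the adder receives the new one. The class AggregateSketch, the method replay and the flag applySubtractor are hypothetical names made up for illustration. The flag only shows what the list would look like if the subtraction step did not take effect; it is not a diagnosis of what happens in Karsten's topology.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregateSketch {

    // Same shape as the GroupedAttributes class in the quoted message.
    static class GroupedAttributes {
        final List<Integer> numbers = new ArrayList<>();

        GroupedAttributes add(Integer v) {
            numbers.add(v);
            return this;
        }

        GroupedAttributes remove(Integer v) {
            // v is an Integer, so this resolves to List.remove(Object) and
            // drops the first matching value, not the element at index v.
            numbers.remove(v);
            return this;
        }
    }

    // Replays `updates` writes of the same value (1) for a single attribute row.
    // applySubtractor is a hypothetical switch: true applies "subtract old value,
    // then add new value" on each update, false skips the subtraction.
    static List<Integer> replay(int updates, boolean applySubtractor) {
        GroupedAttributes agg = new GroupedAttributes();
        Integer oldValue = null;
        for (int i = 0; i < updates; i++) {
            Integer newValue = 1;
            if (oldValue != null && applySubtractor) {
                agg.remove(oldValue);
            }
            agg.add(newValue);
            oldValue = newValue;
        }
        return agg.numbers;
    }

    public static void main(String[] args) {
        System.out.println(replay(4, true));   // prints [1]
        System.out.println(replay(4, false));  // prints [1, 1, 1, 1]
    }
}
```

With the subtraction applied, repeated writes of the same row keep the list at a single element. Without it, the list grows by one entry per write, which resembles the first half of the offsets quoted above. Whether anything like that is actually happening here would have to be checked against the real topology and the repartition topic.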