Hi Ian,

Sounds like you have a total topic size of ~20GB (96 partitions x ~200MB each). If 
most keys are unique, then groupBy and reduce can't compact much: the state 
store keeps one entry per distinct key, so its size approaches that of the 
topic itself. Can you comment on the key distribution? Are most keys unique, 
or do you expect many records in the topic to share the same key?
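For reference, here is a minimal sketch of that kind of topology, in case it 
helps frame the question (the topic names, serdes, and store name below are 
placeholders, not your actual code):

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class ReduceSizeSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "reduce-size-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical input topic; your topic, key, and value types may differ.
        KStream<String, String> input = builder.stream("input-topic");

        // groupByKey + reduce materializes a RocksDB state store holding one
        // entry per distinct key. If nearly every key is unique, the store
        // approaches the size of the input topic itself, which would explain
        // the ~200MB per partition you are seeing.
        KTable<String, String> reduced = input
                .groupByKey()
                .reduce((aggValue, newValue) -> newValue,
                        Materialized.as("reduce-store"));

        reduced.toStream().to("output-topic");

        new KafkaStreams(builder.build(), props).start();
    }
}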

Thanks
Eno


> On Mar 10, 2017, at 9:05 AM, Ian Duffy <i...@ianduffy.ie> wrote:
> 
> Hi All,
> 
> I'm doing a groupBy and reduce on a KStream, which results in a state store
> being created.
> 
> This state store is growing to be massive; it has filled up a 20GB drive. This
> feels very unexpected. Is there some cleanup or flushing process for the
> state stores that I'm missing, or is such a large size expected?
> 
> The topic in question has 96 partitions, and the state is ~200MB on average
> for each one.
> 
> 175M 1_0
> 266M 1_1
> 164M 1_10
> 177M 1_11
> 142M 1_12
> 271M 1_13
> 158M 1_14
> 280M 1_15
> 286M 1_16
> 181M 1_17
> 185M 1_18
> 187M 1_19
> 281M 1_2
> 278M 1_20
> 188M 1_21
> 262M 1_22
> 166M 1_23
> 177M 1_24
> 268M 1_25
> 264M 1_26
> 147M 1_27
> 179M 1_28
> 276M 1_29
> 177M 1_3
> 157M 1_30
> 137M 1_31
> 247M 1_32
> 275M 1_33
> 169M 1_34
> 267M 1_35
> 283M 1_36
> 171M 1_37
> 166M 1_38
> 277M 1_39
> 160M 1_4
> 273M 1_40
> 278M 1_41
> 279M 1_42
> 170M 1_43
> 139M 1_44
> 272M 1_45
> 179M 1_46
> 283M 1_47
> 263M 1_48
> 267M 1_49
> 181M 1_5
> 282M 1_50
> 166M 1_51
> 161M 1_52
> 176M 1_53
> 152M 1_54
> 172M 1_55
> 148M 1_56
> 268M 1_57
> 144M 1_58
> 177M 1_59
> 271M 1_6
> 279M 1_60
> 266M 1_61
> 194M 1_62
> 177M 1_63
> 267M 1_64
> 177M 1_65
> 271M 1_66
> 175M 1_67
> 168M 1_68
> 140M 1_69
> 175M 1_7
> 173M 1_70
> 179M 1_71
> 178M 1_72
> 166M 1_73
> 180M 1_74
> 177M 1_75
> 276M 1_76
> 177M 1_77
> 162M 1_78
> 266M 1_79
> 194M 1_8
> 158M 1_80
> 187M 1_81
> 162M 1_82
> 163M 1_83
> 177M 1_84
> 286M 1_85
> 165M 1_86
> 171M 1_87
> 162M 1_88
> 179M 1_89
> 145M 1_9
> 166M 1_90
> 190M 1_91
> 159M 1_92
> 284M 1_93
> 172M 1_94
> 149M 1_95
