Hi all,
I have been running a Trident topology on a production server. The code
looks like this:
topology.newStream("spoutInit", kafkaSpout)
    .each(new Fields("str"),
          new JsonObjectParse(),
          new Fields("eventType", "event"))
    .parallelismHint(pHint)
    .groupBy(new Fields("event"))
    .persistentAggregate(PostgresqlState.newFactory(config),
                         new Fields("eventType"),
                         new EventUpdater(),
                         new Fields("eventWord"));
Config conf = new Config();
conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
Basically, it does something simple: it reads data from Kafka, parses
each message into separate fields, and writes the result into a
PostgreSQL database. However, the Storm UI shows this error:
"java.lang.OutOfMemoryError: GC overhead limit exceeded".
It always happens in the same worker on each node, the one on port 6703.
I understand this error is thrown because, by default, the JVM gives up
when it is spending more than *98% of the total time in GC and less than
2% of the heap is recovered after each GC*.
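To see how close a worker actually gets to that limit, GC time and heap usage can be logged from inside the JVM. Below is a minimal sketch that uses only the standard java.lang.management API. The class name GcStats and its method names are hypothetical, chosen for illustration, and the ratio is measured since JVM start, so it is only a rough indicator and not the exact measurement the JVM itself uses for the overhead limit:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Storm): reports GC time and heap usage
// for the JVM it runs in, e.g. when called periodically from a bolt.
public class GcStats {

    // Total time spent in GC since JVM start, in ms, summed over all collectors.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Fraction of the given uptime that was spent in GC (0.0 if uptime is not positive).
    public static double gcFraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    // Fraction of the heap currently in use, relative to -Xmx when it is defined.
    public static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 if no maximum is defined
        long denominator = max > 0 ? max : heap.getCommitted();
        return (double) heap.getUsed() / denominator;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC: %.1f%% of uptime, heap used: %.1f%%%n",
                100.0 * gcFraction(totalGcMillis(), uptime),
                100.0 * heapUsedFraction());
    }
}
```

If the heap-used figure stays near 100% right after collections while the GC share keeps climbing, that would point at data being retained rather than at a heap that is merely too small.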
I am not sure what the exact cause of the memory leak is. Is it OK to
simply increase the heap? Here is my storm.yaml:
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
worker.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
Has anyone had a similar issue, and what would be the best way to
overcome it?
Thanks in advance,
AL