Well...  detecting memory leaks in Java is a bit tricky, since the JVM does a
lot for you. Generally though, as long as you avoid holding on to references
you no longer need and close any resources you open, you should be fine... but
a profiler such as the ones mentioned by Nathan will tell you the whole truth.
YourKit is excellent and has a free trial, so go ahead and test-drive it. I am
pretty sure that you need a working jar (or compilable code that has a main
function in it) in order to profile it; profiling your bolts and spouts
directly is a bit trickier. Hopefully your algorithm (or portions of it) can
be put into a sample test program that you can run and profile locally.
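
Since unclosed resources keep coming up in this thread, here is a generic,
stdlib-only sketch of the pattern (nothing Storm-specific; the writeAll helper
is purely illustrative). try-with-resources guarantees close() runs even when
an exception is thrown mid-write:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class ResourceDemo {

    // Illustrative helper: with try-with-resources, close() is invoked
    // automatically when the block exits, normally or via an exception,
    // so the Writer can never leak.
    static String writeAll(String msg) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(buf, StandardCharsets.UTF_8)) {
            w.write(msg);
        } // w.close() runs here, flushing the encoder into buf
        return buf.toString("UTF-8");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAll("no leaks here"));
    }
}
```

The same shape applies to JDBC connections, sockets, and file handles inside
bolts; without it, a throw before the manual close() call leaks the resource.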

Hope this helped. Regards,

A.

On Thu, Mar 5, 2015 at 8:33 PM, Sa Li <[email protected]> wrote:

>
> On Thu, Mar 5, 2015 at 10:26 AM, Andrew Xor <[email protected]>
> wrote:
>
>> Unfortunately that is not fixed, it depends on the computations and
>> data-structures you have; in my case for example I use more than 2GB since
>> I need to keep a large matrix in memory... having said that, in most cases
>> it should be relatively easy to estimate how much memory you are going to
>> need and use that... or if that's not possible you can just increase it and
>> try the "set and see" approach. Check for memory leaks as well... (unclosed
>> resources and so on...!)
>>
>> Regards.
>>
>> A.
>>
>> On Thu, Mar 5, 2015 at 8:21 PM, Sa Li <[email protected]> wrote:
>>
>>> Thanks, Nathan. How much should it be in general?
>>>
>>> On Thu, Mar 5, 2015 at 10:15 AM, Nathan Leung <[email protected]> wrote:
>>>
>>>> Your worker is allocated a maximum of 768 MB of heap. It's quite
>>>> possible that this is not enough. Try increasing -Xmx in worker.childopts.
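>>>> For example (2048m is just an illustration; size it to the topology's
>>>> actual working set):
>>>>
>>>>     worker.childopts: "-Xmx2048m -Djava.net.preferIPv4Stack=true"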
>>>> On Mar 5, 2015 1:10 PM, "Sa Li" <[email protected]> wrote:
>>>>
>>>>> Hi, All
>>>>>
>>>>> I have been running a trident topology on production server, code is
>>>>> like this:
>>>>>
>>>>> topology.newStream("spoutInit", kafkaSpout)
>>>>>     .each(new Fields("str"),
>>>>>           new JsonObjectParse(),
>>>>>           new Fields("eventType", "event"))
>>>>>     .parallelismHint(pHint)
>>>>>     .groupBy(new Fields("event"))
>>>>>     .persistentAggregate(PostgresqlState.newFactory(config),
>>>>>           new Fields("eventType"), new EventUpdater(), new Fields("eventWord"));
>>>>>
>>>>> Config conf = new Config();
>>>>> conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
>>>>>
>>>>> Basically, it does something simple: it reads data from Kafka, parses each 
>>>>> message into fields, and writes them into PostgreSQL. But in the Storm UI I 
>>>>> see this error: "java.lang.OutOfMemoryError: GC overhead limit exceeded". It 
>>>>> always happens on the same worker of each node - 6703. I understand this 
>>>>> occurs because, by default, the JVM throws this error when *more than 98% of 
>>>>> the total time is spent in GC and less than 2% of the heap is recovered by 
>>>>> each GC*.
>>>>>
>>>>> I am not sure what the exact cause of the memory leak is; is it OK to 
>>>>> simply increase the heap? Here is my storm.yaml:
>>>>>
>>>>> supervisor.slots.ports:
>>>>>      - 6700
>>>>>      - 6701
>>>>>      - 6702
>>>>>      - 6703
>>>>>
>>>>> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>>>>>
>>>>> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>>>
>>>>> supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
>>>>>
>>>>> worker.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>>>
>>>>>
>>>>> Has anyone seen similar issues, and what would be the best way to overcome them?
>>>>>
>>>>>
>>>>> thanks in advance
>>>>>
>>>>> AL
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>
