Nathan, I think that if he wants to profile a bolt that runs in a worker
residing on a different cluster node than the one the profiling tool runs
on, he won't be able to attach to the process, since it lives on a
different physical machine (well, now that I think of it, it can be done
via remote debugging, but that's just a pain in the ***).
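For what it's worth, short of a full remote-debug session, jvisualvm can also connect to a remote JVM over JMX. A sketch of what the worker options might look like in storm.yaml (the port 9999 is an illustrative placeholder, and note a fixed port collides if several workers share a node; auth/SSL are disabled here, which is only sensible on a trusted network):

```
worker.childopts: "-Xmx768m
  -Dcom.sun.management.jmxremote
  -Dcom.sun.management.jmxremote.port=9999
  -Dcom.sun.management.jmxremote.authenticate=false
  -Dcom.sun.management.jmxremote.ssl=false"
```

In jvisualvm you would then add the worker's host as a remote host and connect to that JMX port.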

Regards,

A.

On Thu, Mar 5, 2015 at 8:46 PM, Nathan Leung <[email protected]> wrote:

> You don't need to change your code. As Andrew mentioned you can get a lot
> of mileage by profiling your logic in a standalone program. For jvisualvm,
> you can just run your program (a loop that runs for a long time is best)
> then attach to the running process with jvisualvm.  It's pretty
> straightforward to use and you can also find good guides with a Google
> search.
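To make the "standalone program + jvisualvm" suggestion concrete, here is a minimal sketch of such a driver. `parseEvent` is a hypothetical stand-in for whatever per-tuple work your bolt does; substitute your real logic:

```java
// Minimal standalone driver: run it, note the PID, and attach jvisualvm
// to the running process while the loop keeps the logic hot.
public class ProfileDriver {

    // Hypothetical stand-in for a bolt's per-tuple work: here it just
    // counts ':' characters as a cheap proxy for JSON field parsing.
    static int parseEvent(String json) {
        int fields = 0;
        for (int i = 0; i < json.length(); i++) {
            if (json.charAt(i) == ':') fields++;
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "{\"eventType\":\"click\",\"event\":\"page1\"}";
        long total = 0;
        // Raise the bound if you need a longer window to attach the profiler.
        for (long i = 0; i < 10_000_000L; i++) {
            total += parseEvent(sample);
        }
        System.out.println("total fields counted: " + total);
    }
}
```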
> On Mar 5, 2015 1:43 PM, "Andrew Xor" <[email protected]> wrote:
>
>> Well... detecting memory leaks in Java is a bit tricky, as Java does a
>> lot for you. Generally though, as long as you avoid unnecessary use of
>> the "new" operator and close any resources you are no longer using, you
>> should be fine... but a profiler such as the ones mentioned by Nathan
>> will tell you the whole truth. YourKit is awesome and has a free trial;
>> go ahead and test-drive it. I am pretty sure you need a working jar (or
>> compilable code with a main function in it) in order to profile it,
>> although profiling your bolts and spouts is a bit trickier. Hopefully
>> your algorithm (or portions of it) can be put into a sample test program
>> that can be run locally for you to profile.
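On the "close any resources" point, try-with-resources is the idiomatic guard in Java: the resource is closed even if reading throws. A small self-contained sketch (the byte-counting helper is purely illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrates the "close your resources" advice: try-with-resources
// guarantees the stream is closed on every exit path, which is the usual
// fix for handle leaks that keep buffers and other heap alive.
public class ResourceExample {

    static int countBytes(byte[] data) throws IOException {
        int n = 0;
        // `in` is closed automatically when the try block exits.
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countBytes(new byte[]{1, 2, 3})); // prints 3
    }
}
```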
>>
>> Hope this helped. Regards,
>>
>> A.
>>
>> On Thu, Mar 5, 2015 at 8:33 PM, Sa Li <[email protected]> wrote:
>>
>>>
>>> On Thu, Mar 5, 2015 at 10:26 AM, Andrew Xor <[email protected]> wrote:
>>>
>>>> Unfortunately there is no fixed number; it depends on the computations
>>>> and data structures you have. In my case, for example, I use more than
>>>> 2 GB since I need to keep a large matrix in memory. Having said that,
>>>> in most cases it should be relatively easy to estimate how much memory
>>>> you are going to need and use that... or, if that's not possible, you
>>>> can just increase it and try the "set and see" approach. Check for
>>>> memory leaks as well (unclosed resources and so on!).
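A quick back-of-envelope for the matrix case mentioned above: an n x n matrix of doubles needs roughly 8 * n * n bytes (ignoring array and object-header overhead), so a 16384 x 16384 matrix already costs about 2 GiB. A tiny sketch of the arithmetic:

```java
// Back-of-envelope heap sizing: estimate the raw cost of a dense
// rows x cols matrix of doubles (8 bytes per element).
public class HeapEstimate {

    static long matrixBytes(long rows, long cols) {
        return 8L * rows * cols; // 8 bytes per double
    }

    public static void main(String[] args) {
        long bytes = matrixBytes(16_384, 16_384);
        System.out.println(bytes / (1024 * 1024) + " MiB"); // prints "2048 MiB"
    }
}
```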
>>>>
>>>> Regards.
>>>>
>>>> ​A.​
>>>>
>>>> On Thu, Mar 5, 2015 at 8:21 PM, Sa Li <[email protected]> wrote:
>>>>
>>>>> Thanks, Nathan. How much should it be in general?
>>>>>
>>>>> On Thu, Mar 5, 2015 at 10:15 AM, Nathan Leung <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Your worker is allocated a maximum of 768 MB of heap. It's quite
>>>>>> possible that this is not enough. Try increasing -Xmx in worker.childopts.
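For example, the worker heap could be raised in storm.yaml like this (2048m is just an illustrative value; pick one based on your actual footprint):

```
worker.childopts: "-Xmx2048m -Djava.net.preferIPv4Stack=true"
```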
>>>>>> On Mar 5, 2015 1:10 PM, "Sa Li" <[email protected]> wrote:
>>>>>>
>>>>>>> Hi, All
>>>>>>>
>>>>>>> I have been running a trident topology on production server, code is
>>>>>>> like this:
>>>>>>>
>>>>>>> topology.newStream("spoutInit", kafkaSpout)
>>>>>>>         .each(new Fields("str"),
>>>>>>>                 new JsonObjectParse(),
>>>>>>>                 new Fields("eventType", "event"))
>>>>>>>         .parallelismHint(pHint)
>>>>>>>         .groupBy(new Fields("event"))
>>>>>>>         .persistentAggregate(PostgresqlState.newFactory(config),
>>>>>>>                 new Fields("eventType"), new EventUpdater(),
>>>>>>>                 new Fields("eventWord"));
>>>>>>>
>>>>>>> Config conf = new Config();
>>>>>>> conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
>>>>>>>
>>>>>>> Basically, it does something simple: it gets data from Kafka, parses
>>>>>>> it into different fields, and writes it into a Postgres DB. But in
>>>>>>> the Storm UI I see this error: "java.lang.OutOfMemoryError: GC
>>>>>>> overhead limit exceeded". It always happens in the same worker of
>>>>>>> each node - 6703. I understand this happens because, by default, the
>>>>>>> JVM is configured to throw this error if you are spending more than
>>>>>>> *98% of the total time in GC and after the GC less than 2% of the
>>>>>>> heap is recovered*.
>>>>>>>
>>>>>>> I am not sure what the exact cause of the memory leak is; is it OK to
>>>>>>> simply increase the heap? Here is my storm.yaml:
>>>>>>>
>>>>>>> supervisor.slots.ports:
>>>>>>>      - 6700
>>>>>>>      - 6701
>>>>>>>      - 6702
>>>>>>>      - 6703
>>>>>>>
>>>>>>> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>>>>>>> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>>>>> supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
>>>>>>> worker.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>>>>>
>>>>>>>
>>>>>>> Anyone has similar issues, and what will be the best way to overcome?
>>>>>>>
>>>>>>>
>>>>>>> thanks in advance
>>>>>>>
>>>>>>> AL
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
