Yeah, in that case maybe you can install the JDK / YourKit on the remote machines and run the tools over X or something. I'm assuming this is a development cluster (not live / production) and that installing debugging tools, running remote UIs, etc. is not a problem. :)
On Thu, Mar 5, 2015 at 1:52 PM, Andrew Xor <[email protected]> wrote:

> Nathan, I think that if he wants to profile a bolt per se that runs in a
> worker residing on a different cluster node than the one the profiling
> tool runs on, he won't be able to attach to the process, since it resides
> on a different physical machine, methinks (well, now that I think about it,
> it can be done... via remote debugging, but that's just a pain in the ***).
>
> Regards,
>
> A.
>
> On Thu, Mar 5, 2015 at 8:46 PM, Nathan Leung <[email protected]> wrote:
>
>> You don't need to change your code. As Andrew mentioned, you can get a
>> lot of mileage by profiling your logic in a standalone program. For
>> jvisualvm, you can just run your program (a loop that runs for a long
>> time is best) and then attach to the running process with jvisualvm.
>> It's pretty straightforward to use, and you can also find good guides
>> with a Google search.
>>
>> On Mar 5, 2015 1:43 PM, "Andrew Xor" <[email protected]> wrote:
>>
>>> Well... detecting memory leaks in Java is a bit tricky, as Java does a
>>> lot for you. Generally though, as long as you avoid unnecessary use of
>>> the "new" operator and close any resources that you no longer use, you
>>> should be fine... but a profiler such as the ones mentioned by Nathan
>>> will tell you the whole truth. YourKit is awesome and has a free trial;
>>> go ahead and test-drive it. I am pretty sure that you need a working jar
>>> (or compilable code that has a main function in it) in order to profile
>>> it, although profiling your bolts and spouts directly is a bit trickier.
>>> Hopefully your algorithm (or portions of it) can be put into a sample
>>> test program that can be executed locally for you to profile.
>>>
>>> Hope this helped. Regards,
>>>
>>> A.
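[Editor's note: Nathan's standalone-loop suggestion could look something like the sketch below. `ProfileHarness` and its toy `parse` method are made-up stand-ins for your own bolt logic (e.g. `JsonObjectParse`), not Storm APIs; run it, then attach jvisualvm to the `ProfileHarness` process while the loop is busy.]

```java
// Minimal standalone harness (a sketch) for profiling bolt logic with jvisualvm.
// The parse() method is a hypothetical stand-in for your real tuple-processing code.
public class ProfileHarness {

    // Toy "parser": pulls the value out of a one-field JSON-ish string.
    static String parse(String json) {
        int i = json.indexOf(':');
        if (i < 0) {
            return json; // nothing to parse; return input unchanged
        }
        return json.substring(i + 1).replace("}", "").trim();
    }

    public static void main(String[] args) {
        long checksum = 0;
        // Loop long enough for the profiler to attach and collect samples;
        // adjust the iteration count to taste.
        for (int i = 0; i < 5_000_000; i++) {
            checksum += parse("{\"eventType\": " + i + "}").length();
        }
        // Use the result so the JIT cannot dead-code-eliminate the loop.
        System.out.println("checksum=" + checksum);
    }
}
```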
>>> On Thu, Mar 5, 2015 at 8:33 PM, Sa Li <[email protected]> wrote:
>>>
>>>> On Thu, Mar 5, 2015 at 10:26 AM, Andrew Xor <[email protected]> wrote:
>>>>
>>>>> Unfortunately that is not fixed; it depends on the computations and
>>>>> data structures you have. In my case, for example, I use more than 2GB
>>>>> since I need to keep a large matrix in memory. Having said that, in
>>>>> most cases it should be relatively easy to estimate how much memory
>>>>> you are going to need and use that... or, if that's not possible, you
>>>>> can just increase it and try the "set and see" approach. Check for
>>>>> memory leaks as well... (unclosed resources and so on...!)
>>>>>
>>>>> Regards.
>>>>>
>>>>> A.
>>>>>
>>>>> On Thu, Mar 5, 2015 at 8:21 PM, Sa Li <[email protected]> wrote:
>>>>>
>>>>>> Thanks, Nathan. How much should it be in general?
>>>>>>
>>>>>> On Thu, Mar 5, 2015 at 10:15 AM, Nathan Leung <[email protected]> wrote:
>>>>>>
>>>>>>> Your worker is allocated a maximum of 768MB of heap. It's quite
>>>>>>> possible that this is not enough. Try increasing -Xmx in
>>>>>>> worker.childopts.
>>>>>>>
>>>>>>> On Mar 5, 2015 1:10 PM, "Sa Li" <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi, all,
>>>>>>>>
>>>>>>>> I have been running a Trident topology on a production server; the
>>>>>>>> code is like this:
>>>>>>>>
>>>>>>>> topology.newStream("spoutInit", kafkaSpout)
>>>>>>>>         .each(new Fields("str"),
>>>>>>>>               new JsonObjectParse(),
>>>>>>>>               new Fields("eventType", "event"))
>>>>>>>>         .parallelismHint(pHint)
>>>>>>>>         .groupBy(new Fields("event"))
>>>>>>>>         .persistentAggregate(PostgresqlState.newFactory(config),
>>>>>>>>                              new Fields("eventType"),
>>>>>>>>                              new EventUpdater(),
>>>>>>>>                              new Fields("eventWord"));
>>>>>>>>
>>>>>>>> Config conf = new Config();
>>>>>>>> conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
>>>>>>>>
>>>>>>>> Basically, it does simple things: get data from Kafka, parse it
>>>>>>>> into different fields, and write it into a Postgres DB.
>>>>>>>> But in the Storm UI I did see this error:
>>>>>>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded". It always
>>>>>>>> happens in the same worker on each node: 6703. I understand this is
>>>>>>>> because, by default, the JVM is configured to throw this error if
>>>>>>>> you are spending more than *98% of the total time in GC and after
>>>>>>>> the GC less than 2% of the heap is recovered*.
>>>>>>>>
>>>>>>>> I am not sure what the exact cause of the memory leak is; is it OK
>>>>>>>> to simply increase the heap? Here is my storm.yaml:
>>>>>>>>
>>>>>>>> supervisor.slots.ports:
>>>>>>>>     - 6700
>>>>>>>>     - 6701
>>>>>>>>     - 6702
>>>>>>>>     - 6703
>>>>>>>>
>>>>>>>> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>>>>>>>> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>>>>>> supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
>>>>>>>> worker.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>>>>>>
>>>>>>>> Does anyone have similar issues, and what would be the best way to
>>>>>>>> overcome this?
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> AL
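[Editor's note: one way to act on Nathan's -Xmx suggestion is sketched below. The 2048m heap and /tmp dump path are example values only, not recommendations; size the heap to your actual state. -XX:+HeapDumpOnOutOfMemoryError is a standard HotSpot flag that writes a heap dump when the OOM fires, which you can then open in YourKit or jvisualvm to see whether a leak, rather than an undersized heap, is the real cause.]

```yaml
# storm.yaml fragment (sketch): bigger worker heap, plus a heap dump on OOM.
# 2048m and /tmp are placeholder values; adjust to your cluster.
worker.childopts: "-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Djava.net.preferIPv4Stack=true"
```

Note that worker.childopts applies to every worker on the supervisor; restarting the supervisors (and resubmitting the topology) is needed for the change to take effect.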
