Re: UDF problem: Java Heap space

2011-02-24 Thread Aniket Mokashi
Thanks everyone for helping me out. I figured out it was one of those logical errors that lead to infinite loops. The indexOf operation doesn't always return -1 on failure, which was causing this to get into an infinite loop (I should have thought about this), i.e. indexOf('[', 187) would return 187 a
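
A minimal sketch of a loop that avoids the stall described above, assuming the goal stated in the original post (count bracketed tokens and return them as a=2;b=1;). The class name BracketCounter and the method countBracketed are hypothetical and are not taken from the poster's UDF. The sketch moves startInd past each closing bracket, and it compares against -1 rather than 0 so that a '[' at index 0 is not skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the poster's actual UDF.
class BracketCounter {

    // Counts each token found inside [...] and renders the counts as "a=2;b=1;".
    static String countBracketed(String someLog) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int startInd = 0;
        // indexOf(ch, fromIndex) searches from fromIndex inclusive and
        // returns -1 only when no further match exists.
        while ((startInd = someLog.indexOf('[', startInd)) >= 0) {
            int endInd = someLog.indexOf(']', startInd + 1);
            if (endInd < 0) {
                break; // unmatched '[': stop instead of looping forever
            }
            counts.merge(someLog.substring(startInd + 1, endInd), 1, Integer::sum);
            startInd = endInd + 1; // advance past the match so the search makes progress
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(countBracketed("...[a].[b]...[a]...")); // prints a=2;b=1;
    }
}
```

Because the search position strictly increases on every pass, the loop runs at most once per '[' in the input.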

Re: UDF problem: Java Heap space

2011-02-24 Thread Aniket Mokashi
This is a map-side UDF. The Pig script loads a log file and grabs contents inside angle brackets. a = load; b = foreach a generate F(a); dump b; I see the following on the tasktrackers: 2011-02-23 18:01:25,992 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call - Collection thresho

Re: UDF problem: Java Heap space

2011-02-24 Thread Daniel Dai
Hi Aniket, What is your Pig script? Is the UDF on the map side or the reduce side? Daniel. Dmitriy Ryaboy wrote: That's a max of 3.3K single-character strings. Even with the Java overhead that shouldn't be more than a meg, right? None of these should make it out of young gen assuming the list "cats" doe

Re: UDF problem: Java Heap space

2011-02-24 Thread Dmitriy Ryaboy
That's a max of 3.3K single-character strings. Even with the Java overhead that shouldn't be more than a meg, right? None of these should make it out of young gen assuming the list "cats" doesn't stick around outside the UDF. On Thu, Feb 24, 2011 at 3:49 PM, Aniket Mokashi wrote: > Hi Jai, > > Tha

Re: UDF problem: Java Heap space

2011-02-24 Thread Aniket Mokashi
Hi Jai, Thanks for your email. I suspect that it's the Strings-in-a-tight-loop reason you have suggested. I have a loop in my UDF that does the following. while((startInd = someLog.indexOf('[',startInd)) > 0) { endInd = someLog.indexOf(']', startInd);
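
To see why a loop of this shape can exhaust the heap, here is a small bounded reproduction. It is only a sketch: the rest of the loop body is cut off in the archive, so the substring/add step, the iteration cap, and the class name IndexOfStallDemo are assumptions for illustration (the list name cats comes from Dmitriy's reply). The point it shows is that String.indexOf(ch, fromIndex) searches from fromIndex inclusive, so if startInd is never moved past the match, every pass finds the same '['.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reproduction, not the poster's actual UDF.
class IndexOfStallDemo {

    // Same loop condition as in the message, plus an iteration cap so the demo terminates.
    static List<String> collect(String someLog, int maxIterations) {
        List<String> cats = new ArrayList<>();
        int startInd = 0;
        int iterations = 0;
        while ((startInd = someLog.indexOf('[', startInd)) > 0 && iterations < maxIterations) {
            int endInd = someLog.indexOf(']', startInd);
            // Assumed body: collect the token between the brackets.
            cats.add(someLog.substring(startInd + 1, endInd));
            iterations++;
            // startInd is not advanced here, so the next indexOf call
            // returns the same position again.
        }
        return cats;
    }

    public static void main(String[] args) {
        String someLog = "..[a]..";
        System.out.println(someLog.indexOf('[', 2));       // prints 2
        System.out.println(collect(someLog, 1000).size()); // prints 1000
    }
}
```

Without the cap, the list would keep growing from a single short input until the JVM runs out of heap space.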

Re: UDF problem: Java Heap space

2011-02-24 Thread Jai Krishna
Sharing the code would be useful, as mentioned. Also of help would be the heap settings that the JVM had. However, off the top of my head, one common situation (esp. in text processing/tokenizing) is instantiating Strings in a tight loop. Besides, you could also exercise your UDF in a local JVM and

Re: UDF problem: Java Heap space

2011-02-23 Thread Dmitriy Ryaboy
Aniket, share the code? It really depends on how you create them. -D On Wed, Feb 23, 2011 at 7:49 PM, Aniket Mokashi wrote: > I've written a simple UDF that parses a chararray (which looks like > ...[a].[b]...[a]...) to capture stuff inside brackets and return them > as String a=2;b=1; and s

UDF problem: Java Heap space

2011-02-23 Thread Aniket Mokashi
I've written a simple UDF that parses a chararray (which looks like ...[a].[b]...[a]...) to capture stuff inside brackets and return them as String a=2;b=1; and so on. The input chararrays are rarely more than 1000 characters and are not more than 10 (I've added log.warn in my UDF to ensure