This seems to point the way (use a different classloader): http://stackoverflow.com/questions/252893/how-do-you-change-the-classpath-within-java
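A minimal sketch of the different-classloader idea, not Pig's actual code. The names FallbackClassResolver and resolve are made up for illustration, and it assumes the registered jars are available as local file paths. It asks the system class loader first and only builds a URLClassLoader over the jars if that lookup fails:

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Pig. */
final class FallbackClassResolver {

    private FallbackClassResolver() {}

    /**
     * Looks the class up with the system class loader first. If that throws
     * ClassNotFoundException, retries with a URLClassLoader over the given jars.
     */
    static Class<?> resolve(String className, List<Path> registeredJars)
            throws ClassNotFoundException {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        try {
            return Class.forName(className, true, system);
        } catch (ClassNotFoundException notOnClasspath) {
            URL[] urls = new URL[registeredJars.size()];
            for (int i = 0; i < urls.length; i++) {
                Path jar = registeredJars.get(i);
                try {
                    urls[i] = jar.toUri().toURL();
                } catch (MalformedURLException e) {
                    throw new ClassNotFoundException(
                            className + " (bad jar path: " + jar + ")", e);
                }
            }
            // The system loader is the parent, so classes already on the
            // classpath are still shared. The new loader is deliberately left
            // open: the returned class may load its dependencies lazily.
            ClassLoader dynamic = new URLClassLoader(urls, system);
            return Class.forName(className, true, dynamic);
        }
    }
}
```

With the jar from this thread, the call would look like FallbackClassResolver.resolve("tv.notube.TwitterExtractor", List.of(Path.of("twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar"))). This only covers class lookup on the client side; it says nothing about how the jar gets shipped to the cluster.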
So we could try something like: use the system class loader, then try the
dynamically-defined class loader if class not found.

D

On Tue, Mar 1, 2011 at 10:07 AM, Alan Gates <[email protected]> wrote:

> IIRC Java won't let you update the classpath on the fly (for security
> reasons I think). But giving a better error message would definitely be
> good.
>
> Alan.
>
>
> On Mar 1, 2011, at 10:04 AM, Dmitriy Ryaboy wrote:
>
>> patches accepted :-)
>>
>> D
>>
>> On Tue, Mar 1, 2011 at 10:02 AM, Dan Brickley <[email protected]> wrote:
>>
>>> On 1 March 2011 17:56, Dmitriy Ryaboy <[email protected]> wrote:
>>>
>>>> Hi Dan,
>>>> iirc, registering a jar does not put it on the Pig client classpath, it just
>>>> tells Pig to ship the jar. You want to put it on the PIG_CLASSPATH before
>>>> you invoke pig.
>>>>
>>>
>>> Perfect, that was exactly it. It's running now :)
>>>
>>> Would it make sense for REGISTER to augment the classpath? Or maybe
>>> better, for the error message to mention the role of PIG_CLASSPATH?
>>>
>>> cheers,
>>>
>>> Dan
>>>
>>>> On Tue, Mar 1, 2011 at 5:57 AM, Dan Brickley <[email protected]> wrote:
>>>>
>>>>>
>>>>> I'm trying to use InvokeForString to call a simple static method that
>>>>> wraps
>>>>> http://mzsanford.github.com/twitter-text-java/docs/api/index.html
>>>>> https://github.com/twitter/twitter-text-java ... specifically the
>>>>> Extractor class extractURLs method. In fact since the logical result
>>>>> is a list of URLs perhaps I should be writing proper Pig-centric
>>>>> wrapper that returns a tuple, but for now I thought a stringified list
>>>>> would be ok for my immediate purposes. That purpose being pulling out
>>>>> all the URLs from a corpus of tweets, so we can expand the bit.ly and
>>>>> other short urls...
>>>>>
>>>>> So - I built the extra class (src below) and packaged it inside the
>>>>> twitter-text jar, and verify it's in there and usable as follows:
>>>>>
>>>>> danbri$ java -cp twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar tv.notube.TwitterExtractor "hello http://example.com/ http://example.org/ world"
>>>>> URLs: [http://example.com/, http://example.org/]
>>>>>
>>>>> Then from the same directory, I try run this as a Pig job:
>>>>>
>>>>> tw06 = load '/user/danbri/twitter/tweets2009-06.tab.txt.lzo' AS (
>>>>>     when: chararray, who: chararray, msg: chararray);
>>>>> REGISTER twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar;
>>>>> DEFINE ExtractURLs InvokeForString('tv.notube.TwitterExtractor.urls',
>>>>>     'String');
>>>>> urls = FOREACH tw06 GENERATE ExtractURLs(msg);
>>>>> x = SAMPLE urls 0.001;
>>>>> dump x;
>>>>>
>>>>> ...but we don't get past InvokeForString,
>>>>>
>>>>> 2011-03-01 14:50:31,033 [main] ERROR org.apache.pig.tools.grunt.Grunt
>>>>> - ERROR 1000: Error during parsing. could not instantiate
>>>>> 'InvokeForString' with arguments '[tv.notube.TwitterExtractor.urls,
>>>>> String]'
>>>>> Details at logfile: /home/danbri/twitter/pig_1298987430385.log
>>>>> ...->
>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>> Caused by: java.lang.ClassNotFoundException: tv.notube.TwitterExtractor
>>>>>
>>>>> I checked that Pig is finding the jar by mis-spelling the filename in
>>>>> the "REGISTER" line (which as expected causes things to fail earlier).
>>>>> Also double-check that the class is in the jar,
>>>>> danbri$ jar -tvf twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar | grep tv
>>>>>      0 Tue Mar 01 12:03:04 CET 2011 tv/
>>>>>      0 Tue Mar 01 12:03:04 CET 2011 tv/notube/
>>>>>   1114 Tue Mar 01 13:40:30 CET 2011 tv/notube/TwitterExtractor.class
>>>>>
>>>>> ...so I'm finding myself stuck. I'm sure the answer is staring me in
>>>>> the face, but I can't see it.
>>>>> Perhaps I should just do things properly
>>>>> with "extends EvalFunc<String>" and return the tuples separately
>>>>> anyway...
>>>>>
>>>>> Thanks for any pointers,
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> package tv.notube;
>>>>> import com.twitter.Extractor;
>>>>> import java.util.List;
>>>>> class TwitterExtractor {
>>>>>
>>>>>     public static void main (String[] args) {
>>>>>         String in = args[0];
>>>>>         System.out.println("URLs: " + urls(in));
>>>>>     }
>>>>>
>>>>>     public static String urls(String tweet) {
>>>>>         Extractor ex = new Extractor();
>>>>>         List urls = ex.extractURLs(tweet);
>>>>>         String o = urls.toString();
>>>>>         return o;
>>>>>     }
>>>>> }
