This seems to point the way (use a different classloader):
http://stackoverflow.com/questions/252893/how-do-you-change-the-classpath-within-java

So we could try the system class loader first, and fall back to a
dynamically defined class loader if the class is not found.
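
A rough sketch of that fallback, in case it helps. FallbackLoader and
loadClass are made-up names for illustration, not anything in Pig today,
and where Pig would actually call this (and how it would collect the
registered jar URLs) is an assumption on my part:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper, not part of Pig: illustrates the fallback idea only.
final class FallbackLoader {

    /**
     * Try the system class loader first; if the class is not found,
     * retry with a class loader built over the extra jar URLs.
     */
    static Class<?> loadClass(String className, URL[] extraJars)
            throws ClassNotFoundException {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        try {
            return Class.forName(className, true, system);
        } catch (ClassNotFoundException notOnClasspath) {
            // The system loader is the parent, so classes in the extra jars
            // can still resolve anything on the normal classpath.
            // The loader is deliberately left open: closing it could break
            // later lazy loading of classes the returned class depends on.
            URLClassLoader dynamic = new URLClassLoader(extraJars, system);
            return Class.forName(className, true, dynamic);
        }
    }
}
```

For Dan's case the call would be something like
FallbackLoader.loadClass("tv.notube.TwitterExtractor", urls), where urls
holds the file URL of the registered jar. If the class is in neither
place you still get a ClassNotFoundException, which is where a better
error message mentioning PIG_CLASSPATH could go.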

D

On Tue, Mar 1, 2011 at 10:07 AM, Alan Gates <[email protected]> wrote:

> IIRC Java won't let you update the classpath on the fly (for security
> reasons I think).  But giving a better error message would definitely be
> good.
>
> Alan.
>
>
> On Mar 1, 2011, at 10:04 AM, Dmitriy Ryaboy wrote:
>
>> patches accepted :-)
>>
>> D
>>
>> On Tue, Mar 1, 2011 at 10:02 AM, Dan Brickley <[email protected]> wrote:
>>
>>> On 1 March 2011 17:56, Dmitriy Ryaboy <[email protected]> wrote:
>>>
>>>> Hi Dan,
>>>> iirc, registering a jar does not put it on the Pig client classpath,
>>>> it just tells Pig to ship the jar. You want to put it on the
>>>> PIG_CLASSPATH before you invoke pig.
>>>>
>>>
>>> Perfect, that was exactly it. It's running now :)
>>>
>>> Would it make sense for REGISTER to augment the classpath? Or maybe
>>> better, for the error message to mention the role of PIG_CLASSPATH?
>>>
>>> cheers,
>>>
>>> Dan
>>>
>>>> On Tue, Mar 1, 2011 at 5:57 AM, Dan Brickley <[email protected]> wrote:
>>>>
>>>>>
>>>>> I'm trying to use InvokeForString to call a simple static method that
>>>>> wraps
>>>>> http://mzsanford.github.com/twitter-text-java/docs/api/index.html
>>>>> https://github.com/twitter/twitter-text-java ... specifically the
>>>>> Extractor class extractURLs method.  In fact since the logical result
>>>>> is a list of URLs perhaps I should be writing a proper Pig-centric
>>>>> wrapper that returns a tuple, but for now I thought a stringified list
>>>>> would be ok for my immediate purposes. That purpose being pulling out
>>>>> all the URLs from a corpus of tweets, so we can expand the bit.ly and
>>>>> other short urls...
>>>>>
>>>>> So - I built the extra class (src below) and packaged it inside the
>>>>> twitter-text jar, and verified it's in there and usable as follows:
>>>>>
>>>>> danbri$ java -cp
>>>>> twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar
>>>>> tv.notube.TwitterExtractor "hello http://example.com/
>>>>> http://example.org/ world"
>>>>> URLs: [http://example.com/, http://example.org/]
>>>>>
>>>>> Then from the same directory, I try to run this as a Pig job:
>>>>>
>>>>> tw06 = load '/user/danbri/twitter/tweets2009-06.tab.txt.lzo' AS (
>>>>> when: chararray, who: chararray, msg: chararray);
>>>>> REGISTER twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar;
>>>>> DEFINE ExtractURLs InvokeForString('tv.notube.TwitterExtractor.urls',
>>>>> 'String');
>>>>> urls = FOREACH tw06 GENERATE ExtractURLs(msg);
>>>>> x = SAMPLE urls 0.001;
>>>>> dump x;
>>>>>
>>>>> ...but we don't get past InvokeForString,
>>>>>
>>>>> 2011-03-01 14:50:31,033 [main] ERROR org.apache.pig.tools.grunt.Grunt
>>>>> - ERROR 1000: Error during parsing. could not instantiate
>>>>> 'InvokeForString' with arguments '[tv.notube.TwitterExtractor.urls,
>>>>> String]'
>>>>> Details at logfile: /home/danbri/twitter/pig_1298987430385.log
>>>>> ...->
>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>> Caused by: java.lang.ClassNotFoundException: tv.notube.TwitterExtractor
>>>>>
>>>>> I checked that Pig is finding the jar by mis-spelling the filename in
>>>>> the "REGISTER" line (which as expected causes things to fail earlier).
>>>>> I also double-checked that the class is in the jar:
>>>>> danbri$ jar -tvf
>>>>> twitter-text-1.3.1-plus-tv.notube.TwitterExtractor.jar | grep tv
>>>>>   0 Tue Mar 01 12:03:04 CET 2011 tv/
>>>>>   0 Tue Mar 01 12:03:04 CET 2011 tv/notube/
>>>>> 1114 Tue Mar 01 13:40:30 CET 2011 tv/notube/TwitterExtractor.class
>>>>>
>>>>> ...so I'm finding myself stuck. I'm sure the answer is staring me in
>>>>> the face, but I can't see it. Perhaps I should just do things properly
>>>>> with "extends EvalFunc<String>" and return the tuples separately
>>>>> anyway...
>>>>>
>>>>> Thanks for any pointers,
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> package tv.notube;
>>>>>
>>>>> import com.twitter.Extractor;
>>>>> import java.util.List;
>>>>>
>>>>> class TwitterExtractor {
>>>>>
>>>>>     // Command-line check: print the URLs found in the first argument.
>>>>>     public static void main(String[] args) {
>>>>>         String in = args[0];
>>>>>         System.out.println("URLs: " + urls(in));
>>>>>     }
>>>>>
>>>>>     // Static entry point for InvokeForString: returns the stringified URL list.
>>>>>     public static String urls(String tweet) {
>>>>>         Extractor ex = new Extractor();
>>>>>         List<String> urls = ex.extractURLs(tweet);
>>>>>         return urls.toString();
>>>>>     }
>>>>> }
>>>>>
>>>>
>>>>
>>>>
>>>
>
