[ https://issues.apache.org/jira/browse/PIG-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479567#comment-13479567 ]
Cheolsoo Park commented on PIG-2927:
------------------------------------
Although I am no Ruby expert, I think that Jonathan's patch works well. Here is
my test.
1) installed a non-trivial rubygem library (rubygem-json) on the client only
and confirmed that it is not installed on any datanode on the cluster.
{code}
/usr/lib/ruby/gems/1.8/gems/json-1.4.6/
{code}
2) wrote a Ruby UDF that parses a JSON string:
{code}
require 'rubygems'
require 'pigudf'
require 'json'

class Myudfs < PigUdf
  outputSchema "result:chararray"
  def parseJson input
    result = JSON.parse(input)
  end
end
{code}
3) wrote a short Pig script that loads a JSON string and calls my Ruby UDF:
{code}
register 'test.rb' using jruby as myfuncs;
a = load 'json.txt' using PigStorage() as (i:chararray);
b = foreach a generate myfuncs.parseJson(i);
dump b;
{code}
4) got the expected result as follows:
{code:title=input}
{"id":1,"nested":{"value1":"first1","next":{"complex_record":{"id":2,"nested":{"value1":"second1","next":null,"value2":"second2"}}},"value2":"first2"}}
{code}
{code:title=result}
([id#1,nested#{value1=first1, value2=first2, next={complex_record={id=2, nested={value1=second1, value2=second2, next=null}}}}])
{code}
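The nested map in the result mirrors the hash that JSON.parse returns. A minimal local check in plain Ruby (outside Pig, assuming the json library is loadable on the machine where it runs) that walks the same input:

```ruby
require 'rubygems' # only needed on Ruby 1.8; harmless on later versions
require 'json'

# Same record as the input shown above.
input = '{"id":1,"nested":{"value1":"first1","next":{"complex_record":{"id":2,"nested":{"value1":"second1","next":null,"value2":"second2"}}},"value2":"first2"}}'

parsed = JSON.parse(input)

# Walk down to the inner record; JSON null becomes Ruby nil.
inner = parsed['nested']['next']['complex_record']

puts parsed['id']                    # top-level id
puts inner['id']                     # id of the nested complex_record
puts inner['nested']['value1']       # innermost value1
puts inner['nested']['next'].inspect # JSON null, printed as nil
```

This only exercises the parsing step; it says nothing about how the gem reaches the backend.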
Without Jonathan's patch, I get the following error in the front-end as
expected:
{code}
LoadError: no such file to load -- json
require at org/jruby/RubyKernel.java:1042
require at file:/home/cheolsoo/pig-ruby/build/ivy/lib/Pig/jruby-complete-1.6.7.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36
(root) at test.rb:3
2012-10-18 17:09:24,323 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. (LoadError) no such file to load -- json
{code}
I also ran the "Scripting" e2e test cases with the patch on a Hadoop-1.0.x
cluster, and they all passed. So the patch looks good to commit to me.
Btw, I wanted to write an e2e test case using rubygem-json, but I realized
that rubygem-json is under the GPL and can't be included in Pig. We should
either find another rubygem library that is under the Apache License or make
the test configurable so that it runs only if rubygem-json is installed.
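For the second option, here is a minimal sketch of the gating logic in plain Ruby. The helper names library_available? and run_if_available are hypothetical and are not part of Pig or its e2e harness; the sketch only illustrates skipping a test body when the gem cannot be loaded.

```ruby
require 'rubygems' # only needed on Ruby 1.8; harmless on later versions

# Hypothetical helper (not part of Pig): true if the library can be loaded.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

# Hypothetical gate: run the block only when the library is present,
# otherwise report :skipped instead of failing.
def run_if_available(name)
  return :skipped unless library_available?(name)
  yield
end

# The outcome depends on whether json is loadable on this machine.
outcome = run_if_available('json') do
  JSON.parse('{"id":1}')['id']
end
puts "json test outcome: #{outcome.inspect}"
```

How such a check would be wired into the actual e2e configuration is left open here.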
Thanks!
> SHIP and use JRuby gems in JRuby UDFs
> -------------------------------------
>
> Key: PIG-2927
> URL: https://issues.apache.org/jira/browse/PIG-2927
> Project: Pig
> Issue Type: New Feature
> Components: parser
> Affects Versions: 0.11
> Environment: JRuby UDFs
> Reporter: Russell Jurney
> Assignee: Jonathan Coveney
> Priority: Minor
> Fix For: 0.11
>
> Attachments: PIG-2927-0.patch, PIG-2927-1.patch, PIG-2927-2.patch,
> PIG-2927-3.patch
>
>
> It would be great to use JRuby gems in JRuby UDFs without installing them on
> all machines on the cluster. Some way to SHIP them automatically with the job
> would be great.