For posterity, I ended up hacking around this by renaming the repeated
'value' alias in CassandraStorage and rebuilding it. Here's the patch:

--- src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java.original 2011-10-11 23:42:19.000000000 -0700
+++ src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 2011-10-11 23:44:26.000000000 -0700
@@ -357,7 +357,7 @@
             validator = validators.get(cdef.getName());
             if (validator == null)
                 validator = marshallers.get(1);
-            valSchema.setName("value");
+            valSchema.setName("value_"+new String(cdef.getName()));
             valSchema.setType(getPigType(validator));
             tupleFields.add(valSchema);
         }

I'm not suggesting this is a correct fix, but it does allow me to move
forward. Another suggestion was to try Pig 0.8.1 instead, but with that
version I ran into the "Failed to create DataStorage" error described in
the Pig FAQ:
https://cwiki.apache.org/confluence/display/PIG/FAQ#FAQ-Q%3AWhatshallIdoifIsaw%22FailedtocreateDataStorage%22%3F
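
To illustrate why the rename gets past the duplicate-alias check, here is a
minimal standalone sketch. It does not use the real Pig or Cassandra classes:
AliasSketch, aliases and hasDuplicates are hypothetical names, and the field
layout (a default name/value pair followed by one pair per column definition)
is only my reading of the schema shown in the ERROR 1031 message quoted
below. It also decodes column names as UTF-8, which is an assumption; the
patch itself uses the platform default charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the alias bookkeeping; not the real CassandraStorage.
class AliasSketch {

    // Returns the field aliases of one "columns" tuple: the default
    // name/value pair, then one pair per column definition.
    // With suffix == false every value field is called "value" (original code);
    // with suffix == true it is called "value_<column name>" (patched code).
    static List<String> aliases(List<byte[]> columnNames, boolean suffix) {
        List<String> out = new ArrayList<>();
        out.add("name");
        out.add("value");
        for (byte[] raw : columnNames) {
            // Assumption: column names are UTF-8 text.
            String name = new String(raw, StandardCharsets.UTF_8);
            out.add(name);
            out.add(suffix ? "value_" + name : "value");
        }
        return out;
    }

    // True if any alias occurs more than once, which is the condition
    // Pig reports as "Duplicate schema alias".
    static boolean hasDuplicates(List<String> aliases) {
        Set<String> seen = new HashSet<>();
        for (String alias : aliases) {
            if (!seen.add(alias)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<byte[]> cols = Collections.singletonList(
                "time_last_ranked".getBytes(StandardCharsets.UTF_8));
        System.out.println("original: " + aliases(cols, false)
                + " duplicates=" + hasDuplicates(aliases(cols, false)));
        System.out.println("patched:  " + aliases(cols, true)
                + " duplicates=" + hasDuplicates(aliases(cols, true)));
    }
}
```

Note that the suffixed aliases are only unique as long as no column is itself
named "value" or "value_<something>" that collides with another alias, which
is one reason I would not call this a proper fix.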

On Tue, Oct 11, 2011 at 10:34 PM, Pete Warden <p...@jetpac.com> wrote:

> Thanks for all your help Brandon and Jeremy, that got me to the point where
> I could load data.
>
> I'm now hitting a new issue that seems like it could possibly be related.
> When I try to access the data like this:
>
> grunt> rows = LOAD 'cassandra://Frap/FriendsAlreadyRanked' USING CassandraStorage();
> grunt> parts = FOREACH rows GENERATE key, FromCassandraBag('time_last_ranked', columns);
>
> I see the following error:
>
> 2011-10-11 22:23:43,877 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1108:
> <line 4, column 71> Duplicate schema alias: value in "columns"
>
> At first I thought it might be related to the Pygmalion helper functions,
> so I tried to strip it back to basics using this second line instead:
>
> parts = FOREACH rows GENERATE key,$1;
>
> and I still get an identical error.
>
> Any further thoughts on how I can dig into this?
>
> Thanks again,
>                     Pete
>
> On Tue, Oct 11, 2011 at 3:37 PM, Brandon Williams <dri...@gmail.com> wrote:
>
>> On Tue, Oct 11, 2011 at 4:24 PM, Pete Warden <p...@petewarden.com> wrote:
>> > I'm trying to run the most basic example for pig_cassandra, counting the
>> > number of rows in a column family, and I'm hitting the following error:
>> > 2011-10-11 14:13:32,321 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> > ERROR 1031: Incompatable field schema: left is
>> > "columns:bag{:tuple(name:bytearray,value:bytearray)}", right is
>> > "columns:bag{:tuple(name:chararray,value:bytearray,time_last_ranked:chararray,value:bytearray)}"
>>
>> After https://issues.apache.org/jira/browse/CASSANDRA-2777 you need to
>> remove the 'AS' and everything after it; your schema definition
>> conflicts with what was inferred.
>>
>> -Brandon
>>
>
>
