Hi Chad. I have this issue too.
I sent a mail to the pig-user list, but I still cannot resolve this and I cannot access the column values. In that mail I described some things I tried without success, along with more information about the issue:

http://mail-archives.apache.org/mod_mbox/pig-user/201308.mbox/%3ccajeg_hq9s2po3_xytzx5xki4j1mao8q26jydg2wndy_kyiv...@mail.gmail.com%3E

I hope someone replies with a comment, idea, or solution for this issue or bug.

I have reviewed the CqlStorage class in the Cassandra 1.2.8 source, but I do not have an environment set up to debug and trace this issue. I only found a comment like the following, which I do not fully understand:

/**
 * A LoadStoreFunc for retrieving data from and storing data to Cassandra
 *
 * A row from a standard CF will be returned as nested tuples:
 * (((key1, value1), (key2, value2)), ((name1, val1), (name2, val2))).
 */

If you find an idea or a solution, please post it.

Thanks

2013/8/23 Chad Johnston <cjohns...@megatome.com>

> (I'm using Cassandra 1.2.8 and Pig 0.11.1)
>
> I'm loading some simple data from Cassandra into Pig using CqlStorage. The
> CqlStorage loader defines a Pig schema based on the Cassandra schema, but
> it seems to be wrong.
>
> If I do:
>
> data = LOAD 'cql://bookdata/books' USING CqlStorage();
> DESCRIBE data;
>
> I get this:
>
> data: {isbn: chararray,bookauthor: chararray,booktitle:
> chararray,publisher: chararray,yearofpublication: int}
>
> However, if I DUMP data, I get results like these:
>
> ((isbn,0425093387),(bookauthor,Georgette Heyer),(booktitle,Death in the
> Stocks),(publisher,Berkley Pub Group),(yearofpublication,1986))
>
> Clearly the results from Cassandra are key/value pairs, as would be
> expected. I don't know why the schema generated by CqlStorage() would be so
> different.
>
> This is really causing me problems trying to access the column values. I
> tried a naive approach of FLATTENing each tuple, then trying to access the
> values that way:
>
> flattened = FOREACH data GENERATE
>   FLATTEN(isbn),
>   FLATTEN(booktitle),
>   ...
> values = FOREACH flattened GENERATE
>   $1 AS ISBN,
>   $3 AS BookTitle,
>   ...
>
> As soon as I try to access field $5, Pig complains about the index being
> out of bounds.
>
> Is there a way to solve the schema/reality mismatch? Am I doing something
> wrong, or have I stumbled across a defect?
>
> Thanks,
> Chad
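
To make the mismatch in the quoted message concrete, here is a minimal sketch in Python (not Pig) that only models the data shown above. The field names and the sample row come from the DESCRIBE and DUMP output; the helpers `values_by_name` and `flatten` are hypothetical names used for illustration and are not part of Pig or Cassandra.

```python
# Field names taken from the DESCRIBE output in the quoted message.
DECLARED_SCHEMA = ["isbn", "bookauthor", "booktitle", "publisher", "yearofpublication"]

# One row as printed by DUMP in the quoted message: a tuple of (name, value) pairs.
# The year is written as an int to match the declared type (an assumption).
DUMPED_ROW = (
    ("isbn", "0425093387"),
    ("bookauthor", "Georgette Heyer"),
    ("booktitle", "Death in the Stocks"),
    ("publisher", "Berkley Pub Group"),
    ("yearofpublication", 1986),
)


def values_by_name(row):
    """Hypothetical helper: turn (name, value) pairs into a name -> value mapping."""
    return {name: value for name, value in row}


def flatten(row):
    """Hypothetical helper: unnest every pair into one flat list of fields."""
    return [field for pair in row for field in pair]


if __name__ == "__main__":
    record = values_by_name(DUMPED_ROW)
    flat = flatten(DUMPED_ROW)
    print(record["booktitle"])
    # The declared schema has 5 fields, the fully flattened row has 10.
    print(len(DECLARED_SCHEMA), len(flat))
    # After flattening, the values sit at the odd positions 1, 3, 5, 7, 9.
    print(flat[1::2])
```

The declared schema describes five scalar fields, while each dumped row holds five pairs, so a full flatten would yield ten fields. If positional references are checked against the declared five-field schema, position 5 would be out of range, which would be consistent with the error reported above. That is only an assumption on my part; nothing in this thread confirms it.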