Hi all,

A little while back, I started a project called pygmalion to collect example 
scripts and UDFs for people using Pig with Cassandra.  Currently there are a few 
handy UDFs in there:

FromCassandraBag: converts what Cassandra returns (key:chararray, 
columns:bag {column:tuple (name, value)}) into something more tabular (key, 
value1, value2, value3).  You specify the values you want to project, so it's 
good for tabular data.

ToCassandraBag: converts (key, value1, value2, value3) into what 
Cassandra expects when writing: (key:chararray, columns:bag {column:tuple 
(name, value)}).  The column names are extracted from the variable names in the 
Pig script.

Both were contributed by Jacob Perkins, with slight revisions by Jeremy Hanna.

StringConcat: probably something everyone ends up implementing.  Unlike CONCAT, 
which only takes two strings, it concatenates any number of strings.

GenerateTimeUUID: a UDF that generates a time UUID, with or without a time to 
base it on.
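
To make the two row shapes concrete, here is a rough sketch in plain Python 
(not Pig, and not pygmalion's actual API) of the reshaping that FromCassandraBag 
and ToCassandraBag are described as doing.  The function names, the sample row, 
and the choice to return None for a missing column are all assumptions made for 
illustration.  In particular, the real ToCassandraBag takes the column names 
from the Pig variable names, whereas this sketch takes them as an explicit list.

```python
# Illustrative only: plain-Python stand-ins for the two row shapes.
# The helper names below are hypothetical and are not pygmalion's API.

def from_cassandra_bag(row, wanted):
    """(key, [(name, value), ...]) -> (key, value1, value2, ...)."""
    key, columns = row
    by_name = dict(columns)
    # Assumption: a column missing from the row is projected as None.
    return (key,) + tuple(by_name.get(name) for name in wanted)


def to_cassandra_bag(names, record):
    """(key, value1, value2, ...) -> (key, [(name, value), ...])."""
    key, values = record[0], record[1:]
    # Pair each value with its column name, in order.
    return (key, list(zip(names, values)))


# Made-up sample row in the (key, columns) shape.
cassandra_row = ("user42", [("name", "Ada"), ("city", "London"), ("age", "36")])

flat = from_cassandra_bag(cassandra_row, ["name", "age"])
round_trip = to_cassandra_bag(["name", "age"], flat)

print(flat)
print(round_trip)
```

Only the projected columns survive the round trip, which is the point of 
specifying the values you want up front.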

https://github.com/jeromatron/pygmalion/

It definitely needs more work and examples, but I've been using the UDFs in 
there for a while with Cassandra 0.7.5 (previously 0.7-branch).  Now that 0.7.5 
is released, I'd like to let people know about it in case they would like to 
contribute or even just use it.
