The per-row column limit defaults to 1024, but you can set it when you use CassandraStorage in Pig, like so:

    rows = LOAD 'cassandra://Keyspace/ColumnFamily' USING CassandraStorage(4096);

or whatever value you wish.
Give that a try and see if it gives you more of what you're looking for.

On Mar 24, 2011, at 1:16 PM, Jeffrey Wang wrote:

> Hey all,
>
> I’m trying to run a very simple Pig script against my Cassandra cluster (5
> nodes, 0.7.3). I’ve gotten it all set up and working, but the script is
> giving me some strange results. Here is my script:
>
> rows = LOAD 'cassandra://Keyspace/ColumnFamily' USING CassandraStorage();
> rowct = FOREACH rows GENERATE $0, COUNT($1);
> dump rowct;
>
> If I understand Pig correctly, this should output (row name, column count)
> tuples, but I’m always seeing 1024 for the column count even though the rows
> have highly variable number of columns. Am I missing something? Thanks.
>
> -Jeffrey