All, I converted one of my C* programs to Hadoop 2.x and the DataStax C* drivers for 2.1.0. The original program (Hadoop 1.x) worked fine when we set InputCQLPageRowSize and InputSplitSize to reasonable values. For example, with 60K rows, a page row size of 100 and a split size of 10000 would run 6 mappers and return all 60K rows. After switching to the 2.1.x version of the DataStax drivers, the same program returns only 600 rows.
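To make the numbers concrete, here is a small sketch of the arithmetic only. It does not call Hadoop or Cassandra; the class and method names are made up for illustration, and it assumes splits of exactly InputSplitSize rows (real splits are only estimates). It compares the row count we expect with the count we would see if each mapper stopped after its first page.

```java
public class PagingArithmetic {

    // Number of map tasks, assuming each split covers exactly splitSize rows
    // (ceiling division; real Cassandra splits are approximate).
    static long mapperCount(long totalRows, long splitSize) {
        return (totalRows + splitSize - 1) / splitSize;
    }

    // Hypothetical failure mode: every mapper returns only its first page
    // of at most pageRowSize rows and never fetches the next page.
    static long rowsIfOnlyFirstPage(long totalRows, long splitSize, long pageRowSize) {
        long mappers = mapperCount(totalRows, splitSize);
        long remaining = totalRows;
        long returned = 0;
        for (long i = 0; i < mappers; i++) {
            long rowsInSplit = Math.min(splitSize, remaining);
            returned += Math.min(pageRowSize, rowsInSplit);
            remaining -= rowsInSplit;
        }
        return returned;
    }

    public static void main(String[] args) {
        long totalRows = 60_000;   // rows in the table
        long splitSize = 10_000;   // InputSplitSize
        long pageRowSize = 100;    // InputCQLPageRowSize

        System.out.println("mappers = " + mapperCount(totalRows, splitSize));
        System.out.println("expected rows = " + totalRows);
        System.out.println("rows if only first page is read = "
                + rowsIfOnlyFirstPage(totalRows, splitSize, pageRowSize));
    }
}
```

With the values from our job this gives 6 mappers and 600 rows for the first-page-only case, which matches what we observe.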
It looks like the paging logic has changed and each mapper is only getting the first page of 100 rows (6 mappers x 100 rows = 600). How do we get all the rows?

Venky Kandaswamy @WalmartLabs 925-200-7124