What is your query consistency?

On Fri, Nov 6, 2015 at 1:47 PM, Greg Traub <randomciti...@gmail.com> wrote:
> Cassandra users,
>
> I have a 4 node Cassandra cluster set up. All nodes are in a single rack
> and distribution center. I have a loader program which loads 40 million
> rows into a table in a keyspace with a replication factor of 3.
> Immediately after inserting the rows (after the loader program finishes),
> if I SELECT count(*) from the table, the result is less than 40 million.
> If I run our dumper program to retrieve all rows, it is less than 40
> million. However, if I wait roughly 20 minutes, the count eventually
> reaches 40 million rows and the dumper program returns all 40 million.
>
> If I do the same thing in a keyspace where the replication factor is 1, I
> don't have any "stabilization" time and the 40 million rows are immediately
> available.
>
> I've modified the loading and dumping programs to use both the Thrift Java
> driver and the CQL Java driver and neither seems to make a difference.
>
> I'm very new to Cassandra and my questions are, what may be causing this
> delay in all rows being available and how might I lessen/eliminate this
> delay?
>
> Thanks,
> Greg
>

--
Vidur Malik
ShopKeep <http://www.shopkeep.com>
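
As background to the consistency question above, here is a minimal Java sketch of the replica-overlap rule: a read is only guaranteed to see an acknowledged write when the number of replicas written plus the number of replicas read exceeds the replication factor. The class and method names are hypothetical and not part of any Cassandra driver. The thread does not say which consistency levels the loader and dumper use, so the ONE/ONE case below is an assumption, not a diagnosis.

```java
// Hypothetical helper for illustration only; not a Cassandra driver API.
final class ConsistencyOverlap {

    // Replicas that must respond at QUORUM: a majority of the replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when any set of read replicas must intersect any set of written replicas,
    // i.e. writeReplicas + readReplicas > replicationFactor.
    static boolean readsSeeLatestWrite(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        // Assumed scenario: replication factor 3, writes and reads both at ONE.
        System.out.println("RF=3, ONE/ONE:       " + readsSeeLatestWrite(1, 1, 3));
        // Same keyspace, writes and reads both at QUORUM.
        int q = quorum(3);
        System.out.println("RF=3, QUORUM/QUORUM: " + readsSeeLatestWrite(q, q, 3));
        // Replication factor 1: the single replica serves every write and read.
        System.out.println("RF=1, ONE/ONE:       " + readsSeeLatestWrite(1, 1, 1));
    }
}
```

Under these assumptions the check fails only for replication factor 3 with ONE/ONE, which would be consistent with the behaviour described in the original message: a read at ONE may reach a replica that has not yet received every row, while a keyspace with replication factor 1 has no such gap.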