Re: Upgrade to v3.11.3

2019-01-17 Thread shalom sagges
Thanks a lot Anuj! On Wed, Jan 16, 2019 at 4:56 PM Anuj Wadehra wrote: > Hi Shalom, > > Just a suggestion. Before upgrading to 3.11.3, make sure you are not > impacted by any open critical defects, especially those related to RT which may > cause data loss, e.g. 14861. > > Please find my response below

Partition key with 300K rows can it be queried and distributed using Spark

2019-01-17 Thread Goutham reddy
Hi, Although each partition key can hold up to 2 billion rows, it is still an anti-pattern to have such a huge data set for one partition key. In our case it is only 300k rows, but when trying to query for one particular key we are getting a timeout exception. If I use Spark to get the 300k rows for a par
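
A minimal sketch of reading one wide partition through the DataStax spark-cassandra-connector, not code from this thread: the contact point, keyspace, table, and partition-key column/value are placeholders. The connector pushes the partition-key predicate down to Cassandra and reads the partition with normal driver paging rather than one huge single query.

    // Sketch only; requires the spark-cassandra-connector on the classpath.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("read-wide-partition")
      .config("spark.cassandra.connection.host", "127.0.0.1") // assumed contact point
      .getOrCreate()

    // The partition-key equality filter is pushed down to Cassandra by the connector.
    val rows = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table")) // placeholders
      .load()
      .filter("partition_key = 'some_value'") // hypothetical key column and value

    println(rows.count()) // expect ~300k rows for the partition in question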

Re: Partition key with 300K rows can it be queried and distributed using Spark

2019-01-17 Thread Nitan Kainth
Not sure about Spark data distribution, but yes, Spark can be used to retrieve such data from Cassandra. Regards, Nitan Cell: 510 449 9629 > On Jan 17, 2019, at 2:15 PM, Goutham reddy wrote: > > Hi, > As each partition key can hold up to 2 Billion rows, even then it is an > anti-pattern to ha

Re: Partition key with 300K rows can it be queried and distributed using Spark

2019-01-17 Thread Jeff Jirsa
The reason big rows are painful in Cassandra is that, by default, we index them every 64KB. With 300k objects, it may or may not have a lot of those little index blocks/objects. How big is each row? If you try to read it and it's very wide, you may see heap pressure / GC. If so, you could try changin
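
To put the 64KB index interval in perspective, a back-of-the-envelope sketch of how many index entries a 300k-row partition would carry; the 1 KB average row size is an assumption purely for illustration, not a figure from the thread. Raising column_index_size_in_kb in cassandra.yaml reduces the entry count, at the cost of scanning more data per seek.

    // Rough arithmetic only; assumed row size, default 64 KB index interval.
    val rowsPerPartition = 300000L
    val assumedRowBytes  = 1024L                              // hypothetical average row size
    val partitionBytes   = rowsPerPartition * assumedRowBytes // roughly 300 MB
    val indexBlockBytes  = 64L * 1024L                        // default column_index_size_in_kb = 64
    val indexEntries     = partitionBytes / indexBlockBytes   // ~4,700 index entries

    println(s"~$indexEntries index entries for a ~${partitionBytes / (1024 * 1024)} MB partition")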

Re: Partition key with 300K rows can it be queried and distributed using Spark

2019-01-17 Thread Goutham reddy
Thanks Jeff, yes we have 18 columns in total. But my question was: can Spark retrieve the data by partitioning the 300k rows across Spark nodes? On Thu, Jan 17, 2019 at 1:30 PM Jeff Jirsa wrote: > The reason big rows are painful in Cassandra is that by default, we index > it every 64kb. With 300k obje
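
On the distribution question: the connector assigns Spark tasks by token range, so a single Cassandra partition is read by one task; the rows can then be spread across the cluster with a repartition(). A sketch reusing the spark session and placeholder names from the earlier read example; the parallelism of 32 is arbitrary, chosen only for illustration.

    // The whole Cassandra partition arrives in one Spark task; shuffle it out afterwards.
    val widePartition = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table")) // placeholders
      .load()
      .filter("partition_key = 'some_value'") // hypothetical key column and value

    val distributed = widePartition.repartition(32)
    println(distributed.rdd.getNumPartitions) // 32 Spark partitions after the shuffle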