Re: Phantom growth resulting in automatic node shutdown

2018-04-23 Thread Fernando Neves
Thank you all!
We plan to upgrade our cluster to the latest 3.11.x version.

2018-04-20 7:09 GMT+08:00 kurt greaves :

> This was fixed (again) in 3.0.15: https://issues.apache.org/jira/browse/CASSANDRA-13738
>
> On Fri., 20 Apr. 2018, 00:53 Jeff Jirsa,  wrote:
>
>> There have also been a few SSTable ref-counting bugs that would
>> over-report load in nodetool ring/status due to overlapping normal and
>> incremental repairs (which you should probably avoid doing anyway).
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Apr 19, 2018, at 9:27 AM, Rahul Singh 
>> wrote:
>>
>> I’ve seen something similar in 2.1. Our issue was related to file
>> permissions being flipped by an automation job; C* stopped seeing the
>> SSTables, so it started recreating data via read repair or repair
>> processes.
>>
>> In your case, if nodetool is reporting the growth, that suggests actual
>> data growth. What do your cfstats / tablestats say? Are you monitoring
>> your key tables' data via cfstats metrics like SpaceUsedLive or
>> SpaceUsedTotal? What is your snapshotting / backup process doing?
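
For reference, the live/total figures mentioned above can be pulled per table
with something like the following (the keyspace and table names are placeholders):

  # Watch how the reported sizes change over time for a given table.
  nodetool tablestats my_keyspace.my_table | grep -E 'Space used \((live|total)\)'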
>>
>> --
>> Rahul Singh
>> rahul.si...@anant.us
>>
>> Anant Corporation
>>
>> On Apr 19, 2018, 7:01 AM -0500, horschi , wrote:
>>
>> Did you check the number of files in your data folder before & after the
>> restart?
>>
>> I have seen cases where Cassandra would keep creating SSTables, which
>> disappeared on restart.
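
A rough way to check this, assuming a default data directory layout (the path
below is illustrative):

  # Count SSTable data files before and after the restart and compare.
  find /var/lib/cassandra/data -name '*-Data.db' | wc -l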
>>
>> regards,
>> Christian
>>
>>
>> On Thu, Apr 19, 2018 at 12:18 PM, Fernando Neves <
>> fernando1ne...@gmail.com> wrote:
>>
>>> I am facing an issue with our Cassandra cluster.
>>>
>>> Details: Cassandra 3.0.14, 12 nodes, 7.4TB (JBOD) of disk per node,
>>> ~3.5TB of physical data used per node (~42TB across the cluster), and the
>>> default compaction setup. The data size stays roughly the same because
>>> some tables are dropped after the retention period.
>>>
>>> Issue: nodetool status is not showing the correct used size in its
>>> output. The reported size keeps increasing without limit until the node
>>> shuts down on its own, or until our sequential scheduled restart (our
>>> workaround, three times a week). After a restart, nodetool shows the
>>> correct used space again, but only for a few days.
>>> Has anybody had a similar problem? Is it a bug?
>>>
>>> Stack Overflow: https://stackoverflow.com/questions/49668692/cassandra-nodetool-status-is-not-showing-correct-used-space
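
For reference, the reported load can be compared against what is actually on
disk with something like the following (the data path assumes a default
install and the keyspace name is a placeholder):

  nodetool status my_keyspace            # "Load" column as reported by Cassandra
  du -sh /var/lib/cassandra/data         # space actually used on disk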
>>>
>>>
>>


Repair with -pr option vs Repair Local Datacenter, -pr fails but latter succeeds, how to proceed

2018-04-23 Thread Leena Ghatpande



In context with this earlier post I had
https://www.mail-archive.com/user@cassandra.apache.org/msg56122.html


We run repair on each node in the cluster with the -pr option, on every table
within each keyspace individually. Repairs are run sequentially on each node.
Repairs fail on all nodes, for all tables, with the -pr option.

But if we run the repair with the DC option (-dc localdatacenter) for the local
datacenter on all nodes, for all tables, then all repairs succeed.
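
For reference, the two invocations being compared look roughly like this (the
keyspace and table names are placeholders):

  # Primary-range repair of a single table, run on every node in turn
  nodetool repair -pr my_keyspace my_table

  # Repair restricted to the local datacenter
  nodetool repair -dc my_local_dc my_keyspace my_table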

Is this an indication that the repairs are good? Can we proceed with adding
new nodes and decommissioning nodes even when the individual -pr repairs fail
but the DC repairs work?

Is there anything else other than scrub that can be performed to fix the repair
issues?

Thanks

Leena



Re: cassandra repair takes ages

2018-04-23 Thread Nuno Cervaens - Hoist Group - Portugal
Hi Carlos,

OK, thanks for the feedback and the URL; it's pretty clear now.

cheers,
nuno

On Dom, 2018-04-22 at 16:13 +0100, Carlos Rolo wrote:
> Hello,
> 
> I just meant that this would matter if you were using QUORUM (which, in
> your case, is effectively ALL); since you're running ONE, it is a non-issue.
> 
> Regarding incremental repairs, you can read here:
> http://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html
> 
> You can't run repair -pr simultaneously on every node. You can try a tool
> like Reaper to better manage and schedule repairs, but I doubt it will
> speed things up a lot.
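
For anyone who wants to avoid incremental repair explicitly, a full
primary-range repair can be requested per keyspace with something like this
(the keyspace name is a placeholder):

  # --full forces a full (non-incremental) repair; -pr limits it to this node's primary ranges
  nodetool repair --full -pr my_keyspace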
> 
> Regards,
> 
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>  
> Pythian - Love your data
> 
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> linkedin.com/in/carlosjuzarterolo 
> Mobile: +351 918 918 100 
> www.pythian.com
> 
> On Sun, Apr 22, 2018 at 11:39 AM, Nuno Cervaens - Hoist Group -
> Portugal  wrote:
> > Hi Carlos,
> > 
> > Thanks for the reply.
> > Isn't the consistency level defined per session? All my sessions,
> > whether for reads or writes, default to ONE.
> > 
> > Moving to SSD is for sure an obvious improvement, but not possible at
> > the moment.
> > My goal is to spend as little time as possible running a repair
> > across all the nodes.
> > Are there any other downsides to running nodetool repair -pr
> > simultaneously on each node, besides the CPU and memory overhead?
> > Also, could someone clarify whether an incremental repair is safe?
> > 
> > thanks,
> > nuno
> > From: Carlos Rolo 
> > Sent: Friday, April 20, 2018 4:55:21 PM
> > To: user@cassandra.apache.org
> > Subject: Re: cassandra repair takes ages
> >  
> > Changing the data drives to SSD would help speed up the repairs.
> > 
> > Also, don't run 3 nodes with RF=2: with RF=2, QUORUM is floor(2/2)+1 = 2
> > replicas, which is the same as ALL, so any single node being down fails
> > QUORUM operations.
> > 
> > Regards,
> > 
> > Carlos Juzarte Rolo
> > Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
> >  
> > Pythian - Love your data
> > 
> > rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> > linkedin.com/in/carlosjuzarterolo 
> > Mobile: +351 918 918 100 
> > www.pythian.com
> > 
> > On Fri, Apr 20, 2018 at 4:42 PM, Nuno Cervaens - Hoist Group -
> > Portugal  wrote:
> > Hello,
> > 
> > I have a 3 node cluster with RF 2 and using STCS. I use SSDs for
> > commitlogs and HDDs for data. Apache Cassandra version is 3.11.2.
> > I basically have a huge keyspace ('newts' from opennms) and a big
> > keyspace ('opspanel'). Here's a summary of the 'du' output for one
> > node (which is more or less the same for each node):
> > 
> > 51G   ./data/opspanel
> > 776G  ./data/newts/samples-00ae9420ea0711e5a39bbd7839a19930
> > 776G  ./data/newts
> > 
> > My issue is that running a 'nodetool repair -pr' takes a day and a half
> > per node, and as I want to store daily snapshots (for the past 7 days),
> > I don't see how I can do this as repairs take too long.
> > For example, I see huge compactions and validations that take many hours
> > (compactionstats taken at different times):
> > 
> > id                                   compaction type              keyspace  table    completed     total         unit   progress
> > 7125eb20-446b-11e8-a57d-f36e88375e31 Compaction                   newts     samples  294177987449  835153786347  bytes  35,22%
> > 
> > id                                   compaction type              keyspace  table    completed     total         unit   progress
> > 6aa5ce51-4425-11e8-a7c1-572dede7e4d6 Anticompaction after repair  newts     samples  581839334815  599408876344  bytes  97,07%
> > 
> > id                                   compaction type              keyspace  table    completed     total         unit   progress
> > 69976700-43e2-11e8-a7c1-572dede7e4d6 Validation                   newts     samples  63249761990   826302170493  bytes  7,65%
> > 69973ff0-43e2-11e8-a7c1-572dede7e4d6 Validation                   newts     samples  102513762816  826302170600  bytes  12,41%
> > 
> > Is there something I can do to improve the situation?
> > 
> > Also, is an incremental repair (apparently nodetool's default) safe?
> > The DataStax documentation seems to say that incremental repair should
> > not be used, only full repair. Can you please clarify?
> > 
> > Thanks for the feedback.
> > Nuno
> > 
> > 

Re: GUI clients for Cassandra

2018-04-23 Thread Rahul Singh
Zeppelin and DBeaver EE are both good.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Apr 23, 2018, 12:53 AM -0400, Eunsu Kim , wrote:
> I am now using DBeaver EE, but I’m waiting for TeamSQL (https://teamsql.io)
> to support Cassandra.
>
> > On 23 Apr 2018, at 7:56 AM, Tim Moore  wrote:
> >
> > I use the command-line too, but have heard some recommendations for DBeaver 
> > EE as a cross-database GUI with support for Cassandra: https://dbeaver.com/
> >
> > > On Sun, Apr 22, 2018 at 3:58 PM, Hannu Kröger  wrote:
> > > > Hello everyone!
> > > >
> > > > I have been asked many times what a good GUI client for Cassandra is.
> > > > DevCenter is not available anymore, and DataStax has a DevStudio but
> > > > that's for DSE only.
> > > >
> > > > Are there some 3rd-party GUI tools that you are using a lot? I always
> > > > use the command-line client myself. I have tried to look for some
> > > > Cassandra-related tools, but I haven't found a good one yet.
> > > >
> > > > Cheers,
> > > > Hannu
> > > > -
> > > > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > > > For additional commands, e-mail: user-h...@cassandra.apache.org
> > > >
> >
> >
> >
> > --
> > Tim Moore
> > Lagom Tech Lead, Lightbend, Inc.
> > tim.mo...@lightbend.com
> > +61 420 981 589
> > Skype: timothy.m.moore
> >
>


Re: GUI clients for Cassandra

2018-04-23 Thread Christophe Schmitz
Hi Hannu ;)



> I have been asked many times what a good GUI client for Cassandra is.
> DevCenter is not available anymore and DataStax has a DevStudio but that's
> for DSE only.

DevCenter is still available; I just downloaded it.

Cheers,
Christophe



-- 

Christophe Schmitz - VP Consulting

AU: +61 4 03751980 / FR: +33 7 82022899






Re: Memtable type and size allocation

2018-04-23 Thread kurt greaves
Hi Vishal,

In Cassandra 3.11.2, there are 3 choices for the type of Memtable
> allocation and as per my understanding, if I want to keep Memtables on JVM
> heap I can use heap_buffers and if I want to store Memtables outside of JVM
> heap then I've got 2 options offheap_buffers and offheap_objects.

Heap buffers == everything is allocated on heap, e.g. the entire row and its
contents.
Offheap_buffers is partially on heap and partially off heap: it moves the
cell name + value into off-heap buffers. Not sure how much this has changed
in 3.x.
Offheap_objects moves entire cells off heap, and we only keep a reference to
them on heap.

Also, the permitted memory space to be used for Memtables can be set at 2
> places in the YAML file, i.e. memtable_heap_space_in_mb and
> memtable_offheap_space_in_mb.

 Do I need to configure some space in both heap and offheap, irrespective
> of the Memtable allocation type or do I need to set only one of them based
> on my Memtable allocation type i.e. memtable_heap_space_in_mb when using
> heap buffers and memtable_offheap_space_in_mb only when using either of the
> other 2 offheap options?


Both are still relevant and used if you are using an off-heap option. If not,
only memtable_heap_space_in_mb is relevant. For the most part, the defaults
(1/4 of the heap size) should be sufficient.
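
As a concrete illustration, the relevant cassandra.yaml settings might look
like this when using an off-heap option (the sizes below are examples, not
recommendations):

  memtable_allocation_type: offheap_objects
  memtable_heap_space_in_mb: 2048      # on-heap budget; defaults to 1/4 of the heap if unset
  memtable_offheap_space_in_mb: 2048   # off-heap budget; also defaults to 1/4 of the heap if unset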