Join_ring=false Use Cases
Hi,

I need to understand the use case of join_ring=false in case of node outages. As per https://issues.apache.org/jira/browse/CASSANDRA-6961, you would want join_ring=false when you have to repair a node before bringing it back after a considerable outage.

The problem I see with join_ring=false is that, unlike autobootstrap, the node will NOT accept writes while you are running repair on it. If a node was down for 5 hours and you bring it back with join_ring=false, repair it for 7 hours and then make it join the ring, it will STILL have missed writes, because while repair was running (7 hrs) writes only went to the other nodes. So, if you want to make sure that reads served by the restored node at CL ONE return consistent data after the node has joined, you won't get that, as writes were missed while the node was being repaired. And if you work with Read/Write CL=QUORUM, you would get the desired consistency anyway, even if you bring the node back without join_ring=false. So, how would join_ring=false provide any additional consistency in this case?

I can see join_ring=false being useful only when I am restoring from a snapshot or bootstrapping and there are dropped mutations in my cluster which are not fixed by hinted handoff.

For example: 3 nodes A, B, C working at Read/Write CL QUORUM. Hinted handoff window = 3 hrs.

10 AM: Snapshot taken on all 3 nodes.
11 AM: Node B goes down for 4 hours.
3 PM: Node B comes up but data is not repaired. So, 1 hr of dropped mutations (2-3 PM) is not fixed via hinted handoff.
5 PM: Node A crashes.
6 PM: Node A restored from the 10 AM snapshot, started with join_ring=false, repaired and then joined to the cluster.

In the above snapshot restore example, updates from 2-3 PM were outside the hinted handoff window of 3 hours. Thus, node B won't get those updates. Node A's data for 2-3 PM is already lost. So, the 2-3 PM updates are only on one replica, i.e. node C, and the minimum consistency needed is QUORUM, so join_ring=false would help.

But this is a very specific use case.

Thanks
Anuj
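The 2-3 PM gap in the example above can be checked with a small sketch. It assumes the simplified model used in the post: hints are kept only for writes made within the hint window after a node goes down, and anything later is lost to that node until repair. The function name `unhinted_window` and the calendar date are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta


def unhinted_window(down_at, up_at, hint_window):
    """Return (start, end) of the outage not covered by hints, or None.

    Simplified model (an assumption, not Cassandra's actual code): hints are
    stored only for writes made within `hint_window` after the node went down.
    """
    hints_end = down_at + hint_window
    if up_at <= hints_end:
        return None  # the whole outage is covered by hints
    return hints_end, up_at


# Timeline from the example; the date itself is arbitrary.
day = datetime(2016, 12, 13)
node_b_down = day.replace(hour=11)   # 11 AM: node B goes down
node_b_up = day.replace(hour=15)     # 3 PM: node B comes back
gap = unhinted_window(node_b_down, node_b_up, timedelta(hours=3))

print(gap[0].strftime("%H:%M"), gap[1].strftime("%H:%M"))  # 14:00 15:00
```

With a 3-hour window and a 4-hour outage, the last hour (2-3 PM) is exactly the part that only repair can fix.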
Re: Configure NTP for Cassandra
Any NTP experts willing to take up these questions?

Thanks
Anuj

On Sun, 27 Nov, 2016 at 12:52 AM, Anuj Wadehra wrote:
> Hi,
>
> One popular NTP setup recommended for Cassandra users is described at
> https://blog.logentries.com/2014/03/synchronizing-clocks-in-a-cassandra-cluster-pt-2-solutions/ .
>
> Summary of the article:
> The setup recommends a dedicated pool of internal NTP servers which are
> associated as peers to provide an HA NTP service. Cassandra nodes sync to
> this dedicated pool but define one internal NTP server as the preferred
> server to ensure relative clock synchronization. Internal NTP servers sync
> to external NTP servers.
>
> My questions:
>
> 1. If my ISP is providing me a pool of reliable NTP servers, should I set
> up my own internal servers anyway, or can I sync Cassandra nodes directly
> to the ISP-provided servers and select one of the servers as preferred for
> relative clock synchronization?
>
> I agree. If you have to rely on the public NTP pool, which selects random
> servers for sync, having an internal NTP server pool is justified for
> getting tight relative sync as described in the blog.
>
> 2. As per my understanding, peer association is ONLY for the backup
> scenario. If a peer loses its time synchronization source, then other
> peers can be used for time synchronization, thus providing an HA service.
> But when everything is OK (happy path), does defining NTP servers synced
> from different sources as peers lead them to converge their time, as
> mentioned in some forums?
>
> E.g. if A and B are peers and their times are 9:00:00 and 9:00:10 after
> syncing with their respective time sources, will they converge their
> clocks to 9:00:05?
>
> I doubt the above claim regarding time convergence. Also, no formal doc
> says that. Comments?
>
> Thanks
> Anuj
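As background for why relative clock synchronization matters here, the following is a minimal sketch of last-write-wins conflict resolution, where the value with the highest write timestamp is kept. It is a simplified model and not Cassandra's code; the helper names `stamp` and `last_write_wins`, and the idea that each write is stamped by a coordinator with a fixed clock offset, are assumptions made for illustration. The 10-second skew mirrors the 9:00:00 vs 9:00:10 example in the question.

```python
def stamp(true_time, clock_offset):
    """Timestamp a write using a clock that is off by `clock_offset` seconds."""
    return true_time + clock_offset


def last_write_wins(writes):
    """Return the value carrying the highest timestamp (simplified LWW)."""
    return max(writes, key=lambda w: w["timestamp"])["value"]


# Clocks in sync: the write issued later in real time wins.
in_sync = [
    {"value": "old", "timestamp": stamp(100.0, 0.0)},
    {"value": "new", "timestamp": stamp(105.0, 0.0)},
]

# Second coordinator's clock runs 10 s behind: the later write gets the
# smaller timestamp and loses to the earlier one.
skewed = [
    {"value": "old", "timestamp": stamp(100.0, 0.0)},
    {"value": "new", "timestamp": stamp(105.0, -10.0)},
]

print(last_write_wins(in_sync))  # new
print(last_write_wins(skewed))   # old
```

In this model only the difference between the two clocks matters, not their absolute accuracy, which is why the thread focuses on relative synchronization.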
Re: Configure NTP for Cassandra
You might find more NTP experts on the NTP questions mailing list:
http://lists.ntp.org/listinfo/questions

On Tue, Dec 13, 2016 at 1:25 PM, Anuj Wadehra wrote:
> Any NTP experts willing to take up these questions?
>
> Thanks
> Anuj
Re: Configure NTP for Cassandra
Thanks for the NTP link. Most of us are Cassandra users and must be using NTP (or other time synchronization methods) for ensuring relative time synchronization in our Cassandra clusters. I hope there are people on the mailing list who can answer these questions with respect to Cassandra. There is just one detailed blog on NTP best practices for Cassandra and I think answering these questions is important rather than just creating an internal NTP pool with recommended settings.

Thanks
Anuj

On Wed, 14 Dec, 2016 at 12:07 AM, Jim Witschey wrote:
> You might find more NTP experts on the NTP questions mailing list:
> http://lists.ntp.org/listinfo/questions
Are Materialized views persisted on disk?
Are Materialized views persisted on disk? sorry for the naive question.
Re: Are Materialized views persisted on disk?
Yes, they are stored on disk like a normal table.

On Tue, Dec 13, 2016 at 2:31 PM, Kant Kodali wrote:
> Are Materialized views persisted on disk? sorry for the naive question.
Re: Are Materialized views persisted on disk?
The word "materialized" implies that.

2016-12-13 20:34 GMT+01:00 Carl Yeksigian:
> Yes, they are stored on disk like a normal table.
>
> On Tue, Dec 13, 2016 at 2:31 PM, Kant Kodali wrote:
>> Are Materialized views persisted on disk? sorry for the naive question.

-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
Re: Are Materialized views persisted on disk?
People should be able to ask legit questions here without getting snarky answers, please don't do that. Not everyone has the same background or knowledge that you do.

On Tue, Dec 13, 2016 at 11:49 AM Benjamin Roth wrote:
> The word "materialized" implies that.
Re: Are Materialized views persisted on disk?
It wasn't meant in a snarky way; it was meant as a (too short) explanation. Let me try to sum it up:

Materialized View:
The data that is represented by the view is stored persistently and updated as soon as the underlying base data changes.
On RDBMS: Pro: Fast reads, Con: Slow(er) updates
On Cassandra (CS): Used to do filtering or sorting of the base table. Much slower write path.

"Regular" View:
The base data is queried on demand. More or less a rewrite or alias of another query.
On RDBMS: Pro: No updates required, Con: Probably slow reads, depending on indexes.
On CS: Does not exist.

The term "materialized view" has been established by well-known RDBMS like Oracle, and a materialized view behaves very similarly in CS. In most RDBMS a view can have many base tables. In CS an MV can have only one base table and has many more restrictions compared to RDBMS.

2016-12-13 21:06 GMT+01:00 Jonathan Haddad:
> People should be able to ask legit questions here without getting snarky
> answers, please don't do that. Not everyone has the same background or
> knowledge that you do.

-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
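The difference between the two kinds of view can be sketched with a toy in-memory model. This is only an illustration of the general idea (stored and updated on every write versus computed on demand); it is not how Cassandra or any RDBMS implements views, and the class and function names are hypothetical.

```python
class BaseTable:
    def __init__(self):
        self.rows = {}    # primary key -> row dict
        self.views = []   # materialized views kept in sync on every write

    def upsert(self, key, row):
        old = self.rows.get(key)
        self.rows[key] = row
        # Extra work on the write path: every view is updated as well.
        for view in self.views:
            view.on_base_change(key, old, row)


class MaterializedView:
    """Stores its own copy of the data, organised by another column."""

    def __init__(self, base, column):
        self.column = column
        self.rows = {}    # column value -> {primary key -> row}
        base.views.append(self)
        for key, row in base.rows.items():
            self.on_base_change(key, None, row)

    def on_base_change(self, key, old, new):
        if old is not None:
            self.rows[old[self.column]].pop(key, None)
        self.rows.setdefault(new[self.column], {})[key] = new

    def lookup(self, value):
        # Fast read: a direct lookup in the stored copy.
        return list(self.rows.get(value, {}).values())


def regular_view(base, column, value):
    # Nothing is stored: the base table is scanned on every read.
    return [row for row in base.rows.values() if row[column] == value]


users = BaseTable()
by_city = MaterializedView(users, "city")
users.upsert(1, {"name": "a", "city": "Ulm"})
users.upsert(2, {"name": "b", "city": "Berlin"})
users.upsert(1, {"name": "a", "city": "Berlin"})  # row 1 moves to Berlin

print(by_city.lookup("Ulm"))                       # []
print(len(by_city.lookup("Berlin")))               # 2
print(len(regular_view(users, "city", "Berlin")))  # 2
```

Both reads return the same rows; the materialized view pays for its fast lookup with additional work on each `upsert`, which matches the "slower write path" trade-off described above.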
Re: Configure NTP for Cassandra
2016-11-26 20:20 GMT+01:00 Anuj Wadehra:
> 1. If my ISP provider is providing me a pool of reliable NTP servers, should
> I setup my own internal servers anyway or can I sync Cassandra nodes
> directly to the ISP provided servers and select one of the servers as
> preferred for relative clock synchronization?

Set up three NTP servers which use the provider servers _and_ pool servers, and sync your other machines from these servers (and maybe get GPS receivers for your NTP servers). This reduces NTP traffic at your firewall (your servers act as proxies) and reduces load on public servers.

> 2. As per my understanding, peer association is ONLY for backup scenario .
> If a peer loses time synchronization source, then other peers can be used
> for time synchronization. Thus providing a HA service. But when everything
> is ok (happy path), does defining NTP servers synced from different sources
> as peers lead them to converge time as mentioned in some forums?

Maybe; but the difference will be negligible (sub-millisecond). I wouldn't worry about that.

Best
Martin