Restarting may be a temporary workaround, but it can't be a permanent solution.
After a few days, the problem will come back.
Thanks
Anuj
Sent from Yahoo Mail on Android
On Thu, 29 Sep, 2016 at 12:54 AM, sai krishnam raju
potturi wrote: restarting the cassandra service helped
get rid of
>
> Forgot to set replication for new data center :(
I had a feeling it could be that :-). From the other thread:
> It should be run from the DC3 servers, after altering the keyspace to add
> replication for the new datacenter. Is this the way you're doing it?
>
>- Are all the nodes using the same ve
Forgot to set replication for new data center :(
On Wed, Sep 28, 2016 at 11:33 PM, Jonathan Haddad wrote:
> What was the reason?
>
> On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote:
>
>> Very sorry...I got the reason for this issue..
>> Please ignore.
>>
>>
>> On Wed, Sep 28, 2016 at 10:14 P
There is a history of leaks where multiple repairs were run
on the same node at the same time (e.g.
https://issues.apache.org/jira/browse/CASSANDRA-11215 ).
You're running a very old version of Cassandra. If you're able to upgrade to
the newest 2.1 or 2.2, it's likely that at
Restarting the Cassandra service helped get rid of those files in our
situation.
Thanks
Sai
On Wed, Sep 28, 2016 at 3:15 PM, Anuj Wadehra
wrote:
> Hi,
>
> We are facing an issue where Cassandra has open file handles for deleted
> sstable files. These open file handles keep on increasing with ti
Hi,
We are facing an issue where Cassandra has open file handles for deleted
sstable files. These open file handles keep increasing with time and
eventually lead to a disk space crisis. This is visible via the lsof command.
There are no Exceptions in the logs. We suspect a race condition where
compact
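For illustration, one way to spot those handles with lsof (this assumes a single Cassandra JVM per host; the grep patterns are only examples):

# count handles the Cassandra process holds on already-deleted files
lsof -p "$(pgrep -f CassandraDaemon)" | grep -c '(deleted)'
# list just the deleted sstable data files
lsof -p "$(pgrep -f CassandraDaemon)" | grep 'Data.db' | grep '(deleted)'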
Even when I set a lower request-timeout in order to trigger a timeout,
there is still no WARN or ERROR in the logs.
On Wed, Sep 28, 2016 at 8:22 PM, George Sigletos
wrote:
> Hi Joaquin,
>
> Unfortunately neither WARN nor ERROR found in the system logs across the
> cluster when executing truncate. Sometime
Hi Joaquin,
Unfortunately, neither WARN nor ERROR was found in the system logs across the
cluster when executing the truncate. Sometimes it executes immediately, other
times it takes 25 seconds, given that I have connected with
--request-timeout=30 seconds.
The nodes are a bit busy compacting. On a freshl
What was the reason?
On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote:
> Very sorry...I got the reason for this issue..
> Please ignore.
>
>
> On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa .
> wrote:
>
>> @Paulo
>>
>> We have done changes as you said
>> net.ipv4.tcp_keepalive_time=60
>> net.ip
Hi George,
Try grepping for WARN and ERROR in the system.log files across all nodes when
you run the command. Could you post any of the recent stacktraces that you
see?
Cheers,
Joaquin Casares
Consultant
Austin, TX
Apache Cassandra Consulting
http://www.thelastpickle.com
On Wed, Sep 28, 2016 at 12:
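For illustration, a minimal form of that grep, run on each node (assuming the default packaged log location):

grep -E 'WARN|ERROR' /var/log/cassandra/system.log | tail -n 50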
Thanks a lot for your reply.
I understand that truncate is an expensive operation. But why throw a
timeout while truncating a table that is already empty?
A workaround is to set a high --request-timeout when connecting. Even 20
seconds is not always enough.
Kind regards,
George
On Wed, Sep 28, 2
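For illustration, the workaround looks roughly like this (the 60-second value and the host name are only examples; cqlsh's default client-side request timeout is 10 seconds):

cqlsh --request-timeout=60 cassiebeta-01
cassandra@cqlsh> TRUNCATE test.mytable;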
Hi guys,
I run a cluster with 5 nodes, cassandra version 3.0.5.
I get this warning:
2016-09-28 17:22:18,480 BigTableWriter.java:171 - Writing large partition...
for some materialized views. Some have values over 500MB. How does this affect
performance? What can/should be done? I suppose it is a problem
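As a side note, the threshold that triggers this warning is configurable in cassandra.yaml; a sketch with the usual default:

# cassandra.yaml -- warn when a written partition exceeds this size (MB)
compaction_large_partition_warning_threshold_mb: 100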
Hi Cassandra-users,
my name is Michael Mirwaldt and I work for financial.com.
I have encountered this problem with Cassandra 3.7 running on 4 nodes:
Given the data model
CREATE KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy',
'replication_factor': '2'} AND durable_writes = true;
Truncate does a few things (depending on the version):
- truncate takes snapshots
- truncate causes a flush
- in very old versions, truncate causes a schema migration.
In newer versions, like Cassandra 3.4, you have this knob:
# How long the coordinator should wait for truncates to complete
# (This can be mu
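The knob being quoted appears to be the truncate request timeout in cassandra.yaml; a sketch with its usual default:

# cassandra.yaml -- how long the coordinator should wait for truncates to complete
truncate_request_timeout_in_ms: 60000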
Very sorry... I got the reason for this issue.
Please ignore.
On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . wrote:
> @Paulo
>
> We have done changes as you said
> net.ipv4.tcp_keepalive_time=60
> net.ipv4.tcp_keepalive_probes=3
> net.ipv4.tcp_keepalive_intvl=10
>
> and increased streaming_sock
@Paulo
We have made the changes as you suggested:
net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10
and increased streaming_socket_timeout_in_ms to 48 hours, with
"phi_convict_threshold: 9".
And once again recommissioned the new data center (DC3) and ran "nodetool
reb
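For reference, a sketch of how those settings might look once applied (the values are copied from above, with 48 hours expressed in milliseconds; file locations assume a stock install):

# /etc/sysctl.conf (apply with: sysctl -p)
net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

# cassandra.yaml
streaming_socket_timeout_in_ms: 172800000   # 48 hours
phi_convict_threshold: 9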
Hello,
I keep executing a TRUNCATE command on an empty table and it throws
OperationTimedOut randomly:
cassandra@cqlsh> truncate test.mytable;
OperationTimedOut: errors={}, last_host=cassiebeta-01
cassandra@cqlsh> truncate test.mytable;
OperationTimedOut: errors={}, last_host=cassiebeta-01
Havin
Hi techpyaasa,
That was one of my teammate , very sorry for it/multiple threads.
No big deal :-).
*It looks like streams are failing right away when trying to rebuild.?*
> No , after partial streaming of data (around 150 GB - we have around 600
> GB of data on each node) streaming is getting fa
Robert,
You can restart them in any order, that doesn't make a difference afaik.
Cheers
On Wed, Sep 28, 2016 at 17:10, Robert Sicoie
wrote:
> Thanks Alexander,
>
> Yes, with tpstats I can see the hanging active repair(s) (output
> attached). For one there are 31 pending repair. On others ther
* NOTICE *
This is the first release signed with key 0xA278B781FE4B2BDA by Michael
Shuler. Debian users will need to add the key to `apt-key` and the
process has been updated on
https://wiki.apache.org/cassandra/DebianPackaging and a patch has been created
for the source docs.
Either method will work:
c
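For illustration, one common way to import the release keys into apt-key (a sketch only; the authoritative steps are on the wiki page above):

# import all Apache Cassandra release keys
curl -s https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
# or fetch just the key mentioned in this notice from a public keyserver
gpg --keyserver pgp.mit.edu --recv-keys A278B781FE4B2BDA
gpg --export --armor A278B781FE4B2BDA | sudo apt-key add -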
Thanks Alexander,
Yes, with tpstats I can see the hanging active repair(s) (output attached).
For one node there are 31 pending repairs. On others there are fewer pending
repairs (min 12). Is there any recommendation for the restart order? The one
with the fewest pending repairs first, perhaps?
Thanks,
Ro
They will show up in nodetool compactionstats:
https://issues.apache.org/jira/browse/CASSANDRA-9098
Did you check nodetool tpstats to see whether you have any running repair
sessions?
Just to make sure (and if you can actually do it), rolling-restart the cluster
and try again. Repair sessions can
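For illustration, a quick per-node check along those lines (the grep pattern is only an example):

nodetool compactionstats            # look for Anticompaction / Validation tasks
nodetool tpstats | grep -Ei 'AntiEntropy|Repair'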
I am using nodetool compactionstats to check for pending compactions and it
shows me 0 pending on all nodes, seconds before running nodetool repair.
I am also monitoring PendingCompactions over JMX.
Is there another way I can find out whether there is any anticompaction running
on any node?
Thanks a lot,
Robert,
you need to make sure you have no repair session currently running on your
cluster, and no anticompaction.
I'd recommend doing a rolling restart in order to stop all running repairs
for sure, then start the process again, node by node, checking that no
anticompaction is running before movin
My feeling here is that some of the repair jobs somehow remained pending, and
now when I try to run repair on those sstables I get the "Cannot start
multiple repair sessions over the same sstables" exception.
I checked with nodetool compactionstats for pending tasks before running
nodetool repair, and
@Alain
That was one of my teammates, very sorry for it / the multiple threads.
*It looks like streams are failing right away when trying to rebuild?*
No, after partial streaming of data (around 150 GB; we have around 600 GB
of data on each node), streaming is failing with the above exception
st
Thanks Alexander,
Now I have started to run the repair with the -pr arg and with keyspace and
table args.
Still, I got the "ERROR [RepairJobTask:1] 2016-09-28 11:34:38,288
RepairRunnable.java:246 - Repair session
89af4d10-856f-11e6-b28f-df99132d7979 for range
[(8323429577695061526,8326640819362122791],
...
Just saw a very similar question from Laxmikanth (laxmikanth...@gmail.com)
on another thread, with the same logs.
Would you mind not splitting this into multiple threads, so we can gather up
the information and better help you from this mailing list?
C*heers,
2016-09-28 14:28 GMT+02:00 Alain RODRIGUEZ :
Hi,
It looks like streams are failing right away when trying to rebuild.
- Could you please share with us the command you used?
It should be run from the DC3 servers, after altering the keyspace to add
replication for the new datacenter. Is this the way you're doing it?
- Are all the nodes using t
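For illustration, a minimal sketch of that sequence (the keyspace name, DC names and replication factors below are hypothetical):

-- CQL, on any node: add the new DC to the keyspace's replication
ALTER KEYSPACE my_keyspace WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 3};

# shell, on each DC3 node: stream the existing data from a source DC
nodetool rebuild -- DC1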
There were a few streaming bugs fixed between 2.1.13 and 2.1.15 (see
CHANGES.txt for more details), so I'd recommend you upgrade to 2.1.15 in
order to avoid those.
2016-09-28 9:08 GMT-03:00 Alain RODRIGUEZ :
> Hi Anubhav,
>
>
>> I’m considering doing subrange repairs (https://github.com
Hi Anubhav,
> I'm considering doing subrange repairs
> (https://github.com/BrianGallew/cassandra_range_repair/blob/master/src/range_repair.py)
>
I have used this script a lot, and quite successfully.
Another working option that people are using is:
https://github.com/spotify/cassandra-reaper
Ale
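For illustration, a single subrange repair boils down to something like this (the token bounds and names are hypothetical; the script above essentially iterates this over many small ranges):

nodetool repair -st -9223372036854775808 -et -9200000000000000000 my_keyspace my_table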
>
> I've read from some that the gossip info will stay
> around for 72h before being removed.
>
I've read this one too :-). It is 3 days indeed.
This might be of some interest:
https://issues.apache.org/jira/browse/CASSANDRA-10371 (Fix Version/s:
2.1.14, 2.2.6, 3.0.4, 3.4)
C*heers,
Hi,
nodetool scrub won't help here, as what you're experiencing is most likely
that one SSTable is going through anticompaction, and then another node is
asking for a Merkle tree that involves it.
For understandable reasons, an SSTable cannot be anticompacted and
validation compacted at the same t
Hi guys,
I have a cluster of 5 nodes, cassandra 3.0.5.
I was running nodetool repair over the last few days, one node at a time, when
I first encountered this exception:
*ERROR [ValidationExecutor:11] 2016-09-27 16:12:20,409
CassandraDaemon.java:195 - Exception in thread
Thread[ValidationExecutor:11,1,main]*
*