Apache Cassandra performance tuning - call for contribution

2022-02-09 Thread Daniel Seybold

Dear Apache Cassandra community,

We plan to run a large-scale performance study of Apache Cassandra and 
MongoDB. The focus is not to compare the two systems directly but to 
answer the question: /how much performance can you get out of each DBMS 
with an optimal configuration compared to the vanilla installation?/


In this study, we use three different configurations of the well-known 
Yahoo! Cloud Serving Benchmark (YCSB) to emulate three types of workloads 
(write-heavy, read-heavy, mixed).
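
For readers unfamiliar with YCSB, a workload mix of this kind is usually expressed as operation proportions in a CoreWorkload properties file. The sketch below is illustrative only: the function name and the 90/10 and 50/50 splits are assumptions, not the settings used in the study.

```shell
#!/bin/sh
# Print YCSB CoreWorkload properties for one of three illustrative mixes.
# The proportions are assumptions chosen for illustration only.
# Note: recent YCSB releases use the site.ycsb package; older releases
# use com.yahoo.ycsb instead.
ycsb_workload_props() {
    case "$1" in
        write-heavy) read_p=0.1; update_p=0.9 ;;
        read-heavy)  read_p=0.9; update_p=0.1 ;;
        mixed)       read_p=0.5; update_p=0.5 ;;
        *) echo "unknown workload type: $1" >&2; return 1 ;;
    esac
    cat <<EOF
workload=site.ycsb.workloads.CoreWorkload
readproportion=$read_p
updateproportion=$update_p
scanproportion=0
insertproportion=0
EOF
}
```

For example, `ycsb_workload_props mixed > mixed.properties` writes a file that could be passed to YCSB with its `-P` option.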


With these workloads, we stress the DBMSs, which are hosted on AWS EC2.

To get the best possible answer, we need your support as Apache 
Cassandra experts to find the optimal OS and DBMS configuration for the 
outlined workloads (either one general configuration or 
workload-specific configurations).


We will carry out the benchmarks with our Benchmarking-as-a-Service 
(BaaS) platform and include your configurations in the benchmarking 
process.


And of course, we will release all data as open data sets to the 
community and publish the study on our website and distribute it through 
our marketing channels.
Moreover, we will reference you in this study and give you the 
opportunity to introduce yourself and your company as well as comment on 
the results with your experience and assessment.


If you are interested in contributing, feel free to reach out to me.

Cheers,
Daniel




Re: Running enablefullquerylog crashes cassandra

2022-02-09 Thread Gil Ganz
There is nothing in the Cassandra logs (system/debug). The last lines are:
INFO  [RMI TCP Connection(10)-127.0.0.1] 2022-02-06 08:41:50,334 BinLog.java:420 - Attempting to configure bin log: Path: /mnt/fql_data Roll cycle: HOURLY Blocking: true Max queue weight: 268435456 Max log size:34359738368 Archive command:
INFO  [RMI TCP Connection(10)-127.0.0.1] 2022-02-06 08:41:50,335 BinLog.java:433 - Cleaning directory: /mnt/fql_data as requested

On Sun, Feb 6, 2022 at 6:43 PM Jeff Jirsa  wrote:

> That looks like a nodetool stack - can you check the Cassandra log for
> corresponding error?
>
> On Feb 6, 2022, at 12:52 AM, Gil Ganz  wrote:
>
> 
> Hey
> I'm trying to enable the full query log on a Cassandra 4.0.1 node, and
> it's causing Cassandra to shut down
>
> nodetool enablefullquerylog --path /mnt/fql_data
>
> Cassandra has shutdown.
> error: null
> -- StackTrace --
> java.io.EOFException
> at java.io.DataInputStream.readByte(DataInputStream.java:267)
> at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
> at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
> at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
> at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
> at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
> at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
> at com.sun.proxy.$Proxy6.enableFullQueryLogger(Unknown Source)
> at org.apache.cassandra.tools.NodeProbe.enableFullQueryLogger(NodeProbe.java:1836)
> at org.apache.cassandra.tools.nodetool.EnableFullQueryLog.execute(EnableFullQueryLog.java:62)
> at org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:358)
> at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:343)
> at org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:246)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:84)
>
> /mnt/fql_data is owned by the cassandra user.
> It doesn't matter whether the directory is empty or not.
>
> Relevant contents of cassandra.yaml:
>
> full_query_logging_options:
>  log_dir: /mnt/fql_data
>  roll_cycle: HOURLY
>  block: true
>  max_queue_weight: 268435456
>  max_log_size: 34359738368
> # archive command is "/path/to/script.sh %path" where %path is replaced with the file being rolled:
> # archive_command:
> # max_archive_retries: 10
>
> There are no errors in the log. The last couple of lines are:
>
> INFO  [RMI TCP Connection(10)-127.0.0.1] 2022-02-06 08:41:50,334 BinLog.java:420 - Attempting to configure bin log: Path: /mnt/fql_data Roll cycle: HOURLY Blocking: true Max queue weight: 268435456 Max log size:34359738368 Archive command:
> INFO  [RMI TCP Connection(10)-127.0.0.1] 2022-02-06 08:41:50,335 BinLog.java:433 - Cleaning directory: /mnt/fql_data as requested
>
> I noticed there is a similar bug,
> https://issues.apache.org/jira/browse/CASSANDRA-17136, but I also tried
> setting disk_failure_policy to ignore, with the same result.
> Has anyone encountered something similar?
>
>
>
>
>
> gil
>
>


Cassandra tarball install and systemd

2022-02-09 Thread Saha, Sushanta K
I picked up the script */etc/init.d/cassandra* from the net. I'm not sure
whether the tarball installation includes such a script; it should.

This script is using the following line:
*pid_file=/var/run/cassandra/cassandra.pid*

But, there is no such *.pid* file that I can find. I am starting Cassandra
with $CASSANDRA_HOME/bin/cassandra.

I'd appreciate any pointers on automatically stopping and starting Cassandra.

Thanks & Regards
 Sushanta

-- 

*Sushanta Saha|*MTS IV-Cslt-Sys Engrg|WebIaaS_DB Group|HQ -
* VerizonWireless O 770.797.1260  C 770.714.6555 Iaas Support Line
949-286-8810*


Re: Cassandra tarball install and systemd

2022-02-09 Thread Bowen Song
Isn't the /etc/init.d/cassandra script supposed to create the PID file 
if it doesn't exist? See: 
https://github.com/apache/cassandra/blob/cb1c8f9d34edfa639096d2d122dfd0ee6d23b479/debian/init#L83





Re: Cassandra tarball install and systemd

2022-02-09 Thread Aaron Ploetz
> I am starting Cassandra with $CASSANDRA_HOME/bin/cassandra

When starting Cassandra, it accepts a PID file location with the -p flag:

$CASSANDRA_HOME/bin/cassandra -p /var/run/cassandra/cassandra.pid

Start Cassandra with that, and the PID file will be there, assuming of
course that the user starting Cassandra has write access to
/var/run/cassandra/.
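
A minimal sketch of how a start wrapper could use the -p flag. The helper names are hypothetical and the default path is simply the one from the init script quoted in this thread; only `bin/cassandra -p` itself comes from the message above.

```shell
#!/bin/sh
# Sketch only: function names and the default PID path are illustrative.

# Return 0 if the directory that will hold the PID file exists and is
# writable by the current user.
pid_dir_ready() {
    pid_dir=$(dirname "$1")
    [ -d "$pid_dir" ] && [ -w "$pid_dir" ]
}

# Return 0 if the PID file is non-empty and names a live process that the
# current user is allowed to signal (kill -0 sends no signal, it only checks).
pid_is_running() {
    [ -s "$1" ] || return 1
    pid=$(cat "$1")
    kill -0 "$pid" 2>/dev/null
}

# Start Cassandra with -p so that it writes the PID file, unless a process
# recorded in that file is already running.
start_cassandra() {
    pid_file=${1:-/var/run/cassandra/cassandra.pid}
    if pid_is_running "$pid_file"; then
        echo "Cassandra already running (PID $(cat "$pid_file"))"
        return 0
    fi
    if ! pid_dir_ready "$pid_file"; then
        echo "Cannot write to $(dirname "$pid_file")" >&2
        return 1
    fi
    "$CASSANDRA_HOME/bin/cassandra" -p "$pid_file"
}
```

Note that /var/run is often a tmpfs that is emptied on reboot, so the directory may need to be recreated, with the right ownership, before each start.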

Aaron




Re: [E] Re: Cassandra tarball install and systemd

2022-02-09 Thread Saha, Sushanta K
Thanks Aaron! Appreciate it.

 Sushanta




Re: Running enablefullquerylog crashes cassandra

2022-02-09 Thread Erick Ramirez
Are there really no entries after those INFO messages? If so, that indicates
that something external (a person, script, daemon, tool, or other process)
killed Cassandra. Check the OS logs to see whether the oom-killer kicked in
and terminated the C* process. Cheers!
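
One way to run that check, as a rough sketch: the function names are hypothetical, the patterns are the phrases the Linux kernel typically logs around an OOM kill, and where the kernel log lives varies by distribution.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Filter kernel log text on stdin, keeping lines that typically accompany
# an OOM kill. Exits non-zero when nothing matches.
filter_oom_events() {
    grep -i -E 'invoked oom-killer|out of memory|killed process'
}

# Read the kernel log from the journal when journalctl is available,
# otherwise fall back to dmesg. Either command may require root.
check_oom() {
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -k --no-pager 2>/dev/null | filter_oom_events
    else
        dmesg 2>/dev/null | filter_oom_events
    fi
}
```

Running `check_oom` and looking for a line that names the java process around the time of the last BinLog entry would show whether the kernel killed it.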
