RE: Why is yum pulling in open JDK ?

2014-07-07 Thread Cox, Cory (Agoda)
I have had the same issue. Not an expert on this… but I think it is more a
consequence of the CentOS repo than of the Cassandra rpm. The Oracle JVM
packages are not available there, and it appears you need to download the
rpm (after accepting the license) and install it with the rpm command. wget
is also problematic, as the response for that URL is littered with HTML… I
had to download it elsewhere, scp it to the box, and install Java BEFORE
installing Cassandra, to avoid the dependency triggering an auto-install of
OpenJDK.
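For what it's worth, the rough sequence that worked for me (version number
and paths are illustrative, not exact):

    # on a workstation: accept the license on Oracle's download page,
    # then copy the rpm over
    scp jdk-7u60-linux-x64.rpm cassandrabox:/tmp/

    # on the box: Java first, then Cassandra
    sudo rpm -ivh /tmp/jdk-7u60-linux-x64.rpm
    sudo yum install dsc20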

Any repo experts please jump in…

Thanks,
Cory Cox
Senior Database Administrator
a Priceline® company


From: Wim Deblauwe [mailto:wim.debla...@gmail.com]
Sent: Monday, July 07, 2014 13:50
To: user@cassandra.apache.org
Subject: Re: Why is yum pulling in open JDK ?

Hi,

I am well aware that Cassandra needs Java. I was just wondering why
'openjdk' is the dependency when the Oracle JVM is the recommended one.

regards,

Wim

2014-07-06 21:54 GMT+02:00 Patricia Gorla <patri...@thelastpickle.com>:
Wim,

> openjdk

Java is a dependency of Cassandra, so if you do not already have Java
installed, yum will install one for you automatically. The Oracle JVM must
be installed separately.

> dsc20, cassandra20

The first installation target is for DataStax Community version 2.0, while
the latter installs Apache Cassandra 2.0.
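As a sketch (package names as they appear in the DataStax repo; to the best
of my knowledge the dsc20 metapackage depends on cassandra20):

    yum install dsc20        # DataStax Community 2.0
    yum install cassandra20  # just the Apache Cassandra 2.0 package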

Cheers,
--
Patricia Gorla
@patriciagorla

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com



Compaction causing listeners to stall

2014-07-07 Thread Bryon Spahn
Greetings all,

I am experiencing a strange issue: we run a compaction job weekly and, as a
result, the listeners stall. This is a single-node cluster running on an
i2.2xl instance in AWS. We are getting the message:

[StorageServiceShutdownHook]

and then:

MessagingService has terminated the accept() thread

The result is that the cluster will not accept any requests and must be
restarted to function properly again. Based on my research, this appears to
be the result of hitting a memory limit in the JVM. We have not customized
the memory settings from the defaults and are using the calculated values
as defined in the environment script. We have disabled the weekly
compaction for now, but would prefer to work out the cause.

Any help would be greatly appreciated!


RE: Why is yum pulling in open JDK ?

2014-07-07 Thread Michael Dykman
It comes down to licensing issues. Sun, and now Oracle, has always been
very particular about what they see as bundling. While they have repos for
Ubuntu, Red Hat, CentOS, SUSE, etc., they don't allow those repos to be
installed in standard distributions unless you are paying them a fee for
doing so. You, the system owner/admin, are free to install it on your own
systems, as long as you acquire it from them, not from your OS provider.

I have been doing Java on Linux for a long time and it has always been a
pain. I still find important Java artifacts in some distros which want to
depend on gcj. For this reason, while I am glad to maintain Java itself
through an Oracle-provided PPA, I manage systems built on Java without the
use of apt/yum/etc. I gave up long ago on the idea that sane Java
integration was something open platforms could provide as long as Oracle
keeps that part closed.
On Jul 7, 2014 6:25 AM, "Cox, Cory (Agoda)" wrote:

> I have had the same issue. Not an expert on this… but I think it is more
> a consequence of the CentOS repo than of the Cassandra rpm.


Controlling system.log rotation

2014-07-07 Thread Xavier Fustero
Hi,

I used to have system.log written directly to syslog, with an rsyslog
server configured to collect the logs from all my Cassandra boxes. However,
the Java stack traces are a headache on my server, and I read on the
rsyslog forums that it is better to have the application write to a file
and let rsyslog read from that file (using the imfile module).
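For reference, my imfile stanza looks something like this (legacy rsyslog
syntax; the tag and state-file names are just what I picked):

    $ModLoad imfile
    $InputFileName /var/log/cassandra/system.log
    $InputFileTag cassandra:
    $InputFileStateFile stat-cassandra-system
    $InputFileSeverity info
    $InputFileFacility local0
    $InputRunFileMonitor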

This is what I have done. However, /var/log/cassandra/system.log is created
as cassandra:cassandra 600. I would like to change the group ownership to
syslog and have permissions like 640. I can do that, but whenever the file
is rotated it starts again as cassandra:cassandra 600.

The only knobs I can find for that file control the rotation and the size,
e.g.:

log4j.appender.R.maxFileSize=50MB
log4j.appender.R.maxBackupIndex=50

Is there a way to control this?

Thanks,
Xavi


Re: Controlling system.log rotation

2014-07-07 Thread Ken Hancock
I think this essentially boils down to the issue:

https://issues.apache.org/bugzilla/show_bug.cgi?id=40407

Seems the best way would be to change the umask for user cassandra:

http://stackoverflow.com/questions/7893511/permissions-on-log-files-created-by-log4j-rollingfileappender
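E.g., something like this near the top of the init script or
cassandra-env.sh (a sketch, assuming a packaged sysvinit-style install):

    # 0027 => new files 640, new dirs 750; combine with
    # chgrp syslog /var/log/cassandra so group read is useful
    umask 0027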


Ken

On Mon, Jul 7, 2014 at 9:50 AM, Xavier Fustero wrote:

> ...whenever the file is rotated it starts again as cassandra:cassandra
> 600. Is there a way to control this?


Large SSTable not compacted with size tiered compaction

2014-07-07 Thread John Sanda
I have a write-heavy table that is using size tiered compaction. I am
running C* 1.2.9. There is an SSTable that is not getting compacted. It is
disproportionately larger than the other SSTables. The data file sizes are,

1.70 GB
0.18 GB
0.16 GB
0.05 GB
8.61 GB

If I set the bucket_high compaction property on the table to a sufficiently
large value, will the 8.61 GB get compacted? What if any drawbacks are
there to increasing the bucket_high property?

In what scenarios could I wind up with such a disproportionately large
SSTable like this? One thing that comes to mind is major compactions, but I
have not run one.

- John


Re: Why is yum pulling in open JDK ?

2014-07-07 Thread Redmumba
The current RPM spec actually has a dependency on "java", which is not a
package but rather a piece of metadata called "provides" that multiple
packages can share. Oracle's JVM, OpenJDK, IcedTea, etc. can all be used to
satisfy the requirement for "java".

There is a reverse-engineered spec of the DSE RPM here:
http://pastie.org/pastes/5191311/text

Relevant section:

Requires:  java >= 1.6.0

tl;dr: The Oracle JVM is not readily available in the repos, and so the DSE
RPM accepts whichever package satisfies the dependency "java >= 1.6.0".
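You can see for yourself what satisfies it with something like:

    yum whatprovides java    # candidates from enabled repos
    rpm -q --whatprovides java   # what currently provides it locally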

Andrew


On Mon, Jul 7, 2014 at 5:51 AM, Michael Dykman wrote:

> It comes down to licensing issues. Sun, and now Oracle, has always been
> very particular about what they see as bundling.


Size-tiered Compaction runs out of memory

2014-07-07 Thread Redmumba
I am having an issue on multiple machines where it's simply filling up the
disk space during what I can only assume is a compaction.  For example, the
average node cluster-wide is around 900GB according to DSE
OpsCenter--however, after coming in after the three day weekend, I noticed
that there were 13 hosts that were down in the cluster.

When I investigated, I see several huge sstables that appeared to be in the
middle of compaction when the host ran out of usable disk space:

> 81G  auditing-audit-jb-17623-Data.db
> 189G auditing-audit-jb-19863-Data.db
> 182G auditing-audit-jb-25298-Data.db
> 196G auditing-audit-jb-30791-Data.db
> 13G  auditing-audit-jb-31003-Data.db
> 12G  auditing-audit-jb-31341-Data.db
> 12G  auditing-audit-jb-31678-Data.db
> 12G  auditing-audit-jb-32019-Data.db
> 766M auditing-audit-jb-32039-Data.db
> 791M auditing-audit-jb-32060-Data.db
> 199M auditing-audit-jb-32065-Data.db
> 52M  auditing-audit-jb-32066-Data.db
> 175G auditing-audit-jb-8179-Data.db
>
> 643G auditing-audit-tmp-jb-31207-Data.db
> 32G  auditing-audit-tmp-jb-32030-Data.db

From what I can tell, it is reaching the point where it is trying to
compact the ~190G ones into a combined bigger one, and is failing because
of the lack of disk space.  How do I work around this?

Would adjusting the maximum sstables before a compaction is performed help
this situation?  I am currently using the default values provided by
SizeTieredCompactionStrategy in C* 2.0.6.  Or is there a better option for
a continuous-write operation (with TTLs for dropping off old data)?

Thank you for your help!

Andrew


Re: Large SSTable not compacted with size tiered compaction

2014-07-07 Thread Ken Hancock
What are the timestamps on those SSTables? Do those tables use TTL?

To answer your last question, I've seen that scenario happen under load
testing with column families that use TTL. Large loads within the TTL
window cause normal compaction to build up larger and larger SSTables. When
the load falls off, there are a couple of very large tables left, and under
normal load the small tables get TTL'd before any table grows large enough
to hit min_threshold, so data that is months old and should have been TTL'd
never gets the chance to be compacted away.

There's a nice blog post that covers bucket_high and the algorithm -- yes,
I believe setting bucket_high large enough will cause the one large table
to be grouped with the others. However, if you're not using TTL, I don't
think there's an issue -- the small tables simply need to build up until
there are another three medium-sized (~1.7 GB) tables, which in turn need
to build up until there are four of the larger (~8 GB) tables.

http://shrikantbang.wordpress.com/2014/04/22/size-tiered-compaction-strategy-in-apache-cassandra/
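If you do decide to try bumping it, it's just a compaction sub-property --
a sketch (the table name and value are illustrative; the default is 1.5):

    ALTER TABLE myks.mytable
      WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                         'bucket_high': 10};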


On Mon, Jul 7, 2014 at 12:13 PM, John Sanda wrote:

> There is an SSTable that is not getting compacted. It is
> disproportionately larger than the other SSTables.



-- 
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
Office: +1 (978) 889-3329 | Google Talk: ken.hanc...@schange.com |
Skype: hancockks | Yahoo IM: hancockks


Re: Large SSTable not compacted with size tiered compaction

2014-07-07 Thread Robert Coli
On Mon, Jul 7, 2014 at 9:13 AM, John Sanda wrote:

> In what scenarios could I wind up with such a disproportionately large
> SSTable like this?

First, it is very typical for there to be One Larger File in Size Tiered
Compaction.. in a slightly glib summary, that's what makes it Size Tiered.

As to causes of your specific case, row fragmentation such that old row
fragments are always in this file?

You could verify whether this is the case by running a major compaction on
the CF. Given your small data sizes, you will almost certainly Just Win
from doing so.
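e.g. (substitute your keyspace and column family):

    nodetool compact <keyspace> <columnfamily>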

However in your case, I would also start by checking that this file is
actually live. Cassandra around your version can sometimes leave spurious
dead SSTables in the data directory. Dead files don't get compacted.

=Rob


Re: Size-tiered Compaction runs out of memory

2014-07-07 Thread Robert Coli
On Mon, Jul 7, 2014 at 9:52 AM, Redmumba  wrote:

> I am having an issue on multiple machines where it's simply filling up the
> disk space during what I can only assume is a compaction.  For example, the
> average node cluster-wide is around 900GB according to DSE
> OpsCenter--however, after coming in after the three day weekend, I noticed
> that there were 13 hosts that were down in the cluster.
>
> When I investigated, I see several huge sstables that appeared to be in
> the middle of compaction when the host ran out of usable disk space:
>
>>
>> 643G auditing-audit-tmp-jb-31207-Data.db
>> 32G  auditing-audit-tmp-jb-32030-Data.db
>>
>
> From what I can tell, it is reaching the point where it is trying to
> compact the ~190G ones into a combined bigger one, and is failing because
> of the lack of disk space.  How do I work around this?
>

You mostly don't. Having enough headroom to compact is a part of Cassandra
capacity planning.

Are you doing a large amount of UPDATE or overwrite load which would lead
to very-efficient compaction [1]? If not, your problem isn't compaction,
it's that you have too much data per node.

The limited workaround available here is to:

1) stop automatic minor compactions (via the thresholds)
2) run userdefinedcompaction via JMX, combining only enough of the large
files to successfully compact

But 2) is not likely to be very useful unless your compactions are actually
reclaiming space. Probably "just" increase the size of your cluster.
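Concretely, something like this (a sketch -- the names and file list are
yours to fill in; in 2.0 I believe the JMX operation takes the keyspace
plus a comma-separated list of Data.db file names):

    # 1) disable minor compactions on the table
    nodetool setcompactionthreshold <keyspace> <cf> 0 0

    # 2) via any JMX client (e.g. jmxterm), on the bean
    #    org.apache.cassandra.db:type=CompactionManager, run:
    #    forceUserDefinedCompaction(<keyspace>, <file1-Data.db,file2-Data.db>)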

=Rob
[1] grep % /path/to/system.log # are these percentages below 95%? if not,
compaction will not reclaim disk space when done!


Re: Size-tiered Compaction runs out of memory

2014-07-07 Thread Robert Coli
On Mon, Jul 7, 2014 at 9:52 AM, Redmumba  wrote:

> Would adjusting the maximum sstables before a compaction is performed help
> this situation?  I am currently using the default values provided by
> SizeTieredCompactionStrategy in C* 2.0.6.  Or is there a better option for
> a continuous-write operation (with TTLs for dropping off old data)?
>

(Sorry, just saw this line about the workload.)

Redis?

If you have a high write rate and are discarding 100% of data after a TTL,
perhaps a data-store with immutable data files that reconciles row
fragments on read is not ideal for your use case?

https://issues.apache.org/jira/browse/CASSANDRA-6654

Is probably also worth a read...

=Rob


Re: Compaction causing listeners to stall

2014-07-07 Thread Robert Coli
On Mon, Jul 7, 2014 at 5:20 AM, Bryon Spahn  wrote:

> I am experiencing a strange issue where we run a compaction job weekly and
> as a result, the listeners stall. This is a single node cluster running on
> an i2.2xl instance in AWS. We are getting the message:
>

There are almost no cases where it makes sense to run a single node of
Cassandra, especially in production.


> [StorageServiceShutdownHook]
>

I bet you a donut that you're OOMing the JVM. Stop doing that, and your
Cassandra node will stop crashing.
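One blunt way to "stop doing that" is to pin the heap yourself instead of
trusting the calculated defaults -- a sketch against the stock
cassandra-env.sh variables (values illustrative, not a recommendation):

    # in cassandra-env.sh: the script expects both set, or neither
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"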

https://issues.apache.org/jira/browse/CASSANDRA-7507

Is probably the case you have just hit.

Basically, in some pathological circumstances, the JVM will send Cassandra
a signal that it handles as if an operator were attempting a clean
shutdown. This probably does not usually succeed, but may be worth a shot.

=Rob


Re: RangeTombstone AIOOBE

2014-07-07 Thread Robert Coli
On Fri, Jul 4, 2014 at 4:14 AM, PRANEESH KUMAR 
wrote:

>
> We are running Cassandra 1.2.16 to store data using CQL with the following
> structure.
> ...
> Is this an known issue?
>

Not to me, at least!

If you are able to consistently reproduce the issue, you should:

1) search the "Cassandra" project on http://issues.apache.org for any
relevant issues
2) failing to find any in 1), file your own issue, including reproduction
steps

=Rob


Re: DROP Table put Cassandra in an inconsistent state

2014-07-07 Thread Robert Coli
On Fri, Jul 4, 2014 at 2:31 AM, Simon Chemouil  wrote:

> I just encountered a bug with 2.1-rc1 (didn't have the chance to update
> to rc2 yet), and wondering if it's known or if I should report the issue
> on JIRA.
>

For issues in pre-release versions of Cassandra, JIRA is almost certainly
the right forum.

One can determine if an issue is "known" in JIRA by checking issues
associated with the next release.

=Rob


TTransportException (java.net.SocketException: Broken pipe)

2014-07-07 Thread Bhaskar Singhal
Hi,


I am using Cassandra 2.0.7 (with default settings and a 16GB heap, on a
quad-core Ubuntu server with 32GB RAM) and am trying to ingest 1MB values
using cassandra-stress. It works fine for a while (~1600 secs), but after
ingesting around 120GB of data I start getting the following error:

Operation [70668] retried 10 times - error inserting key 0070668
((TTransportException): java.net.SocketException: Broken pipe)


The Cassandra server is still running, but in system.log I see the errors
below.


ERROR [COMMIT-LOG-ALLOCATOR] 2014-07-07 22:39:23,617 CassandraDaemon.java (line 
198) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
java.lang.NoClassDefFoundError: org/apache/cassandra/db/commitlog/CommitLog$4
    at 
org.apache.cassandra.db.commitlog.CommitLog.handleCommitError(CommitLog.java:374)
    at 
org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:116)
    at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ClassNotFoundException: 
org.apache.cassandra.db.commitlog.CommitLog$4
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 4 more
Caused by: java.io.FileNotFoundException: 
/path/2.0.7/cassandra/build/classes/main/org/apache/cassandra/db/commitlog/CommitLog$4.class
 (Too many open files)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at 
sun.misc.URLClassPath$FileLoader$1.getInputStream(URLClassPath.java:1086)
    at sun.misc.Resource.cachedInputStream(Resource.java:77)
    at sun.misc.Resource.getByteBuffer(Resource.java:160)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:436)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    ... 10 more
ERROR [FlushWriter:7] 2014-07-07 22:39:24,924 CassandraDaemon.java (line 198) 
Exception in thread Thread[FlushWriter:7,5,main]
FSWriteError in 
/cassandra/data4/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-593-Filter.db
    at 
org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:475)
    at 
org.apache.cassandra.io.util.FileUtils.closeQuietly(FileUtils.java:212)
    at 
org.apache.cassandra.io.sstable.SSTableWriter.abort(SSTableWriter.java:301)
    at 
org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:417)
    at 
org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:350)
    at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: 
/cassandra/data4/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-593-Filter.db 
(Too many open files)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:110)
    at 
org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:466)
    ... 9 more


According to lsof, the Cassandra server process has around 9685 files open;
there are 3938 commit log segments in /cassandra/commitlog, and around 572
commit log segments were deleted during the course of the test.

I am wondering what is causing Cassandra to keep so many files open. Is
flushing slow, or is it something else?

I tried increasing the flush writers, but that didn't help. 
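If it really is just the descriptor limit, I assume the fix is raising
nofile for the user running Cassandra (a sketch; the value is
illustrative):

    # /etc/security/limits.conf
    cassandra  -  nofile  100000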



Regards,
Bhaskar


CREATE KEYSPACE "Keyspace1" WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '1'
};

CREATE TABLE "Standard1" (
  key blob,
  "C0" blob,
  PRIMARY KEY (key)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='NONE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={};