Problem when starting Cassandra 1.1.5

2012-10-08 Thread Thierry Templier

Hello,

I want to upgrade Cassandra to version 1.1.5, but I have a problem
when trying to start this version:


$ ./cassandra -f
xss =  -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities 
-XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M -Xmn256M 
-XX:+HeapDumpOnOutOfMemoryError -Xss180k

Segmentation fault

Here is the Java version I used:

$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.4) (6b24-1.11.4-1ubuntu0.10.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Thanks very much for your help!
Thierry


Re: Problem when starting Cassandra 1.1.5

2012-10-08 Thread Adeel Akbar

Please upgrade Java to 1.7.x; then it should work.
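
If you're not sure which JVM the cassandra script is actually picking up,
a quick throwaway check (plain Java, nothing Cassandra-specific) is:

public class WhichJvm {
    public static void main(String[] args) {
        // Print the JVM that PATH/JAVA_HOME currently resolves to.
        System.out.println(System.getProperty("java.version")); // e.g. 1.6.0_24
        System.out.println(System.getProperty("java.vm.name")); // e.g. OpenJDK 64-Bit Server VM
        System.out.println(System.getProperty("java.home"));
    }
}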


Thanks & Regards

Adeel Akbar

On 10/8/2012 1:36 PM, Thierry Templier wrote:

> Hello,
>
> I want to upgrade Cassandra to version 1.1.5, but I have a problem
> when trying to start this version:
> [...]






Cassandra 1.1.4 performance issue

2012-10-08 Thread Adeel Akbar

Hi,

We're running a small Cassandra cluster (1.1.4) with two nodes, serving
data to our web and Java applications. After upgrading Cassandra from
1.0.8 to 1.1.4, we started seeing some weird issues.


If we run the 'ring' command from the second node, it shows that it
failed to connect to port 7199 on node 1.


$ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01  ring
Failed to connect to 'XX.XX.XX.01:7199': Connection refused
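
To rule out JMX itself, I wrote a minimal connectivity check (a rough
sketch; the host and port are the ones from the error above):

import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxPing {
    public static void main(String[] args) throws Exception {
        // nodetool talks to Cassandra over JMX/RMI, port 7199 by default.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://XX.XX.XX.01:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        System.out.println("connected, default domain: "
                + jmxc.getMBeanServerConnection().getDefaultDomain());
        jmxc.close();
    }
}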

We're using a network monitoring system (NMS) and Monit to monitor the
servers. In the NMS, the average CPU usage has increased to around 500%
on our quad-core Xeon servers with 16 GB RAM, and occasionally through
Monit we can see the 1-minute load average go above 7. Is this common?
Does this happen to everyone else? And why the spikiness in load? We
can't find anything in the Cassandra logs indicating that something's up
(such as a slow GC or compaction), and there's no corresponding traffic
spike in the application either. Should we just add more nodes if any
single one gets CPU spikes?


Another explanation could be that we've configured it wrong. We're
running pretty much the default config, and each node has 16 GB of RAM.


A single keyspace with 15 to 20 column families, RF=2, and 260 GB of
actual data. Please find below top and I/O stats for further reference:


top - 14:21:51 up 29 days,  9:52,  1 user,  load average: 6.59, 3.16, 1.42
Tasks: 163 total,   2 running, 161 sleeping,   0 stopped,   0 zombie
Cpu0  : 29.0%us,  0.0%sy,  0.0%ni, 71.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 28.0%us,  0.0%sy,  0.0%ni, 72.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 13.3%us,  0.0%sy,  0.0%ni, 86.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 23.5%us,  0.7%sy,  0.0%ni, 75.5%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu4  : 89.4%us,  0.3%sy,  0.0%ni, 10.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu5  : 29.2%us,  0.0%sy,  0.0%ni, 70.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  : 25.1%us,  0.0%sy,  0.0%ni, 74.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  : 24.3%us,  0.0%sy,  0.0%ni, 72.0%id,  0.0%wa,  2.3%hi,  1.3%si,  0.0%st

Mem:  16427844k total, 16317416k used,   110428k free,   128824k buffers
Swap:        0k total,        0k used,        0k free, 11344696k cached

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 5284 root  18   0  265g 7.7g 3.6g S 266.6 49.0 474:24.38 java -ea -javaagent:/opt/apache-cassandra-1.1.4/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:Thr
    1 root  15   0 10368  660  548 S   0.0  0.0   0:01.64 init [3]

# iostat -xmn 2 10
-x and -n options are mutually exclusive

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.77    0.03    0.54    0.98    0.00   88.68

Device:  rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.59    3.97   5.54   0.42   0.20   0.02    75.52     0.11  19.10   3.55   2.11
sda1       0.00    0.00   0.01   0.00   0.00   0.00    88.69     0.00   1.36   1.31   0.00
sda2       0.59    3.97   5.53   0.42   0.20   0.02    75.51     0.11  19.12   3.55   2.11
sdb        1.54    7.82  10.39   0.64   0.28   0.03    57.77     0.36  32.61   4.27   4.70
sdb1       1.54    7.82  10.39   0.64   0.28   0.03    57.77     0.36  32.61   4.27   4.70
dm-0       0.00    0.00   1.73   0.62   0.02   0.00    19.27     0.02   6.75   0.90   0.21
dm-1       0.00    0.00  16.32  12.23   0.46   0.05    36.47     0.50  17.67   2.07   5.92
dm-2       0.00    0.00   0.00   0.00   0.00   0.00     8.00     0.00   7.10   3.41   0.00

Device:  rMB_nor/s  wMB_nor/s  rMB_dir/s  wMB_dir/s  rMB_svr/s  wMB_svr/s  ops/s  rops/s  wops/s


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.46    0.00    0.00    0.19    0.00   87.35

Device:  rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00    2.50   0.00   1.00   0.00   0.01    28.00     0.00   0.00   0.00   0.00
sda1       0.00    0.00   0.00   0.00   0.00   0.00     0.00     0.00   0.00   0.00   0.00
sda2       0.00    2.50   0.00   1.00   0.00   0.01    28.00     0.00   0.00   0.00   0.00
sdb        0.00    4.50   0.50   1.50   0.00   0.02    28.00     0.01   6.00   6.00   1.20
sdb1       0.00    4.50   0.50   1.50   0.00   0.02    28.00     0.01   6.00   6.00   1.20
dm-0       0.00    0.00   0.50   4.50   0.00   0.02     8.80     0.04   8.00   2.40   1.20
dm-1       0.00    0.00   0.00   5.00   0.00   0.02     8.00     0.00   0.00   0.00   0.00
dm-2       0.00    0.00   0.00   0.00   0.00   0.00     0.00     0.00   0.00   0.00   0.00

Device:  rMB_nor/s  wMB_nor/s  rMB_dir/s  wMB_dir/s  rMB_svr/s  wMB_svr/s  ops/s  rops/s  wops/s


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.52    0.0

Re: Problem when starting Cassandra 1.1.5

2012-10-08 Thread Thierry Templier

Thanks very much, Adeel! It works much better!

Thierry

> Please upgrade Java to 1.7.x; then it should work.
>
> On 10/8/2012 1:36 PM, Thierry Templier wrote:
> [...]








Re: 1000's of CF's.

2012-10-08 Thread Vanger
So what should the solution be, architecturally, when we need to run Hadoop
M/R jobs against Cassandra and not be restricted by the number of CFs?
What we have now is a fair number of CFs (> 2K), and this number is slowly
growing, so we are already planning to merge partitioned CFs. But our next
goal is to run Hadoop tasks on those CFs. All we have is plain Hector and a
custom ORM on top of it. As far as I understand, VirtualKeyspace doesn't
help in our case. Also, I don't understand why support for many CFs (or
built-in partitioning) isn't implemented on the Cassandra side. Can anybody
explain why this can or cannot be done in Cassandra?

Just in case:
We're using Cassandra 1.0.11 on 30 nodes (planning an upgrade to 1.1.* soon).

--
W/ best regards, 
Sergey.

On 04.10.2012 0:10, Hiller, Dean wrote:
> Okay, so it only took me two solid days not a week.  PlayOrm in master branch 
> now supports virtual CF's or virtual tables in ONE CF, so you can have 1000's 
> or millions of virtual CF's in one CF now.  It works with all the 
> Scalable-SQL, works with the joins, and works with the PlayOrm command line 
> tool.
> 
> Two ways to do it, if you are using the ORM half, you just annotate
> 
> @NoSqlEntity("MyVirtualCfName")
> @NoSqlVirtualCf(storedInCf="sharedCf")
> 
> So it's stored in sharedCf with the table name of MyVirtualCfName (in the 
> command line tool, use MyVirtualCfName to query the table).
> 
> Then if you don't know your meta data ahead of time, you need to create 
> DboTableMeta and DboColumnMeta objects and save them for every table you 
> create and can use TypedRow to read and persist (which is what we have a 
> project doing).
> 
> If you try it out let me know.  We usually get bug fixes in pretty fast if 
> you run into anything.  (more and more questions are forming on stack 
> overflow as well ;) ).
> 
> Later,
> Dean
> 
> 




Problem while streaming SSTables with BulkOutputFormat

2012-10-08 Thread Ralph Romanos

Hello,

I am using BulkOutputFormat to load data from a .csv file into Cassandra.
I am using Cassandra 1.1.3 and Hadoop 0.20.2. I have 7 Hadoop nodes: 1
namenode/jobtracker and 6 datanodes/tasktrackers. Cassandra is installed
on 4 of these 6 datanodes/tasktrackers. The issue happens when I have more
than 1 reducer: SSTables are generated on each node; however, I get the
following error in the tasktracker's logs when they are streamed into the
Cassandra cluster:
Exception in thread "Streaming to /172.16.110.79:1" java.lang.RuntimeException: 
java.io.EOFException
at 
org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(Unknown Source)
at 
org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
at 
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
at 
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
Exception in thread "Streaming to /172.16.110.92:1" java.lang.RuntimeException: 
java.io.EOFException
at 
org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(Unknown Source)
at 
org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
at 
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
at 
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more ...
This is what I get in the logs of one of my Cassandra nodes:

ERROR 16:47:34,904 Sending retry message failed, closing session.
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(Unknown Source)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.write(Unknown Source)
at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
at java.nio.channels.Channels.writeFully(Unknown Source)
at java.nio.channels.Channels.access$000(Unknown Source)
at java.nio.channels.Channels$1.write(Unknown Source)
at java.io.OutputStream.write(Unknown Source)
at java.nio.channels.Channels$1.write(Unknown Source)
at java.io.DataOutputStream.writeInt(Unknown Source)
at 
org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:196)
at 
org.apache.cassandra.streaming.StreamInSession.sendMessage(StreamInSession.java:171)
at 
org.apache.cassandra.streaming.StreamInSession.retry(StreamInSession.java:160)
at 
org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:168)
at 
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:98)
at 
org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182)
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)

Does anyone know what caused these errors?
Thank you for your help.

Regards,
Ralph

Re: MBean cassandra.db.CompactionManager TotalBytesCompacted counts backwards

2012-10-08 Thread Bryan Talbot
I'm attempting to plot how "busy" the node is doing compactions, but there
seem to be only a few suitable metrics reported: CompletedTasks,
PendingTasks, TotalBytesCompacted, and TotalCompactionsCompleted.

It's not clear to me what the difference between CompletedTasks and
TotalCompactionsCompleted is, but I am plotting TotalCompactionsCompleted /
sec as one metric; however, this rate is nearly always less than 1 and
doesn't capture how much of the machine's resources compaction uses.  A
compaction of the 4 smallest SSTables counts the same as a compaction of
the 4 largest SSTables, but the cost is hugely different.  Thus, I'm also
plotting TotalBytesCompacted / sec.

Since the TotalBytesCompacted value sometimes moves backwards I'm not
confident that it's reporting what it is meant to report.  The code and
comments indicate that it should only be incremented by the final size of
the newly created SSTable or by the bytes-compacted-so-far for a larger
compaction, so I don't see why it should be reasonable for it to sometimes
decrease.

How should the impact of compaction be measured if not by bytes compacted?
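
In case it's useful, this is roughly how I'm sampling it now: poll
TotalBytesCompacted over JMX once a minute and drop any negative delta
rather than letting it distort the rate (a sketch; the MBean and attribute
names are the ones quoted below):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionRate {
    public static void main(String[] args) throws Exception {
        MBeanServerConnection mbs = JMXConnectorFactory.connect(
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi"))
                .getMBeanServerConnection();
        ObjectName cm = new ObjectName("org.apache.cassandra.db:type=CompactionManager");

        long prev = ((Number) mbs.getAttribute(cm, "TotalBytesCompacted")).longValue();
        while (true) {
            Thread.sleep(60000L);
            long cur = ((Number) mbs.getAttribute(cm, "TotalBytesCompacted")).longValue();
            long delta = cur - prev;
            prev = cur;
            // The counter occasionally moves backwards (the point of this
            // thread), so skip negative deltas instead of reporting them.
            if (delta >= 0) {
                System.out.println("bytes compacted/sec: " + (delta / 60));
            }
        }
    }
}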

-Bryan


On Sun, Oct 7, 2012 at 7:39 AM, Edward Capriolo wrote:

> I have not looked at this JMX object in a while, however the
> compaction manager can support multiple threads. Also it moves from
> 0-filesize each time it has to compact a set of files.
>
> That is more useful for showing current progress rather than lifetime
> history.
>
>
>
> On Fri, Oct 5, 2012 at 7:27 PM, Bryan Talbot 
> wrote:
> > I've recently added compaction rate (in bytes / second) to my monitors
> for
> > cassandra and am seeing some odd values.  I wasn't expecting the values
> for
> > TotalBytesCompacted to sometimes decrease from one reading to the next.
>  It
> > seems that the value should be monotonically increasing while a server is
> > running -- obviously it would start again at 0 when the server is
> restarted
> > or if the counter rolls over (unlikely for a 64 bit long).
> >
> > Below are two samples taken 60 seconds apart: the value decreased by
> > 2,954,369,012 between the two readings.
> >
> > reported_metric=[timestamp:1349476449, status:200,
> > request:[mbean:org.apache.cassandra.db:type=CompactionManager,
> > attribute:TotalBytesCompacted, type:read], value:7548675470069]
> >
> > previous_metric=[timestamp:1349476389, status:200,
> > request:[mbean:org.apache.cassandra.db:type=CompactionManager,
> > attribute:TotalBytesCompacted, type:read], value:7551629839081]
> >
> >
> > I briefly looked at the code for CompactionManager and a few related
> classes
> > and don't see anyplace that is performing subtraction explicitly;
> however,
> > there are many additions of signed long values that are not validated and
> > could conceivably contain a negative value thus causing the
> > totalBytesCompacted to decrease.  It's interesting to note that the all
> of
> > the differences I've seen so far are more than the overflow value of a
> > signed 32 bit value.  The OS (CentOS 5.7) and sun java vm (1.6.0_29) are
> > both 64 bit.  JNA is enabled.
> >
> > Is this expected and normal?  If so, what is the correct interpretation
> of
> > this metric?  I'm seeing the negatives values a few times per hour when
> > reading it once every 60 seconds.
> >
> > -Bryan
> >
>



-- 
Bryan Talbot
Architect / Platform team lead, Aeria Games and Entertainment
Silicon Valley | Berlin | Tokyo | Sao Paulo


Re: RandomPartitioner and the token limits

2012-10-08 Thread aaron morton
AFAIK in the code the minimum (exclusive) token value is -1, so as a signed
integer the maximum value is 2**127.
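
To make that concrete, this sketch mirrors (as far as I understand the
code) the effect of what the partitioner does: take the MD5 of the key as
a signed 128-bit BigInteger and abs() it, which folds the 2**128 possible
digests into [0, 2**127]:

import java.math.BigInteger;
import java.security.MessageDigest;

public class TokenRange {
    public static void main(String[] args) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest("some key".getBytes("UTF-8"));
        // Interpreting 16 bytes as a signed value gives [-2^127, 2^127 - 1];
        // abs() folds that into [0, 2^127].
        BigInteger token = new BigInteger(digest).abs();
        System.out.println(token);
        System.out.println(token.compareTo(BigInteger.ONE.shiftLeft(127)) <= 0); // true
    }
}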

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/10/2012, at 3:19 AM, Carlos Pérez Miguel  wrote:

> Hello,
> 
> Reading the wiki of operations
> (http://wiki.apache.org/cassandra/Operations) I noticed something
> strange. When using RandomPartitioner, tokens are integers in the
> range [0, 2**127] (both limits included), but keys are converted into
> this range using MD5. MD5 has 128 bits, so shouldn't tokens be in
> the range [0, (2**128)-1]?
> 
> Anyway, if Cassandra uses only 127 of those 128 bits because it
> converts the 128-bit value into a signed integer, shouldn't tokens be
> in the range [0, 2**127) (first limit included, last not included)?
> 
> Thank you
> 
> Carlos Pérez Miguel



Re: Regarding Row Cache configuration and non-heap memory

2012-10-08 Thread aaron morton
> In short the question is whether the row_cache_size_in_mb can exceed the heap 
> setting for cassandra 1.1.4 if jna.jar is present in the libs? 
Yes. 
AFAIK jna.jar is not required for the off-heap row cache in 1.1.X.

> My heap settings are 8G and new heap size is 1600M. 
You can reduce the size of the heap in 1.1.X. The default settings max out at 
4G. 
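
For reference, I believe the relevant cassandra.yaml settings in 1.1 are
along these lines (SerializingCacheProvider being the off-heap provider):

row_cache_size_in_mb: 10240
row_cache_provider: SerializingCacheProvider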

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/10/2012, at 3:21 PM, Ananth Gundabattula  wrote:

> Hello,
> 
> I have configured cassandra 1.1.4 to use row cache of 10GB ( the RAM on the 
> machine is pretty big and hence the row cache size is high).  My heap 
> settings are 8G and new heap size is 1600M. 
> 
> As I read from the forum and documentation, jna.jar allows to use non-heap 
> memory for the row caches. 
> 
> The question I have is how is the configuration in cassandra.yaml for 
> row_cache_size_in_mb interpreted? Is it referring to the non-heap setting or 
> the memory used inside the heap to maintain book-keeping information about 
> the non-heap memory ( as I gather from the postings that heap is indeed used 
> to some extent while still using the non-heap memory for row caches). 
> 
> In short the question is whether the row_cache_size_in_mb can exceed the heap 
> setting for cassandra 1.1.4 if jna.jar is present in the libs? 
> 
> Thanks for your time. 
> 
> Regards,
> Ananth 



Re: Importing sstable with Composite key? (without is working)

2012-10-08 Thread aaron morton
Not sure why you have two different definitions for the bars2 CF. 

You will need to create SSTables that match the schema Cassandra has. 
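
Something along these lines is what I mean (a rough sketch against the 1.1
SSTableSimpleUnsortedWriter API, assuming the CLI definition below is the
schema the cluster actually has; I have not run it):

import java.io.File;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Date;

import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.db.marshal.CompositeType;
import org.apache.cassandra.db.marshal.DateType;
import org.apache.cassandra.db.marshal.Int32Type;
import org.apache.cassandra.db.marshal.UTF8Type;
import org.apache.cassandra.dht.RandomPartitioner;
import org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter;

public class Bars2Writer {
    public static void main(String[] args) throws Exception {
        // The comparator passed to the writer must be the CF's comparator,
        // i.e. the CompositeType, not UTF8Type.
        CompositeType comparator = CompositeType.getInstance(
                Arrays.<AbstractType<?>>asList(DateType.instance, UTF8Type.instance));

        SSTableSimpleUnsortedWriter writer = new SSTableSimpleUnsortedWriter(
                new File("/tmp/newtables"), new RandomPartitioner(),
                "readtick", "bars2", comparator, null, 64);

        // The row key is the partition key value (an int, per
        // key_validation_class), not the composite.
        writer.newRow(Int32Type.instance.decompose(1));

        // The composite goes in the *column names*.
        ByteBuffer name = new CompositeType.Builder(comparator)
                .add(DateType.instance.decompose(new Date()))
                .add(UTF8Type.instance.decompose("open"))
                .build();
        writer.addColumn(name, UTF8Type.instance.decompose("1.2345"),
                System.currentTimeMillis());
        writer.close();
    }
}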

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 7:15 AM, T Akhayo  wrote:

> Good evening,
> 
> Today I managed to get a small cluster of 2 computers running. I also managed 
> to get my data model working and am able to import sstables created with 
> SSTableSimpleUnsortedWriter using sstableloader.
> 
> The only problem is when I try to use a composite key in my data model: 
> after I import my sstables and issue a simple select, Cassandra crashes:
> ===
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Unknown Source)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:76)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
> at java.util.TreeMap.put(Unknown Source)
> at 
> org.apache.cassandra.db.TreeMapBackedSortedColumns.addColumn(TreeMapBackedSortedColumns.java:95)
> at 
> org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
> ...
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:108)
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:121)
> at 
> org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1237)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> ===
> 
> Now I can get everything running again by removing the data directories on 
> both nodes.
> 
> I suspect Cassandra crashes because the sstable that is being imported has a 
> different schema when it comes to the composite key (without a composite key, 
> the import works fine).
> 
> My schema with composite key is:
> ===
> create table bars2(
> id uuid,
> timeframe int,
> datum timestamp,
> open double,
> high double,
> low double,
> close double,
> bartype int,
> PRIMARY KEY (timeframe, datum)
> );
> ===
> create column family bars2
>   with column_type = 'Standard'
>   and comparator = 
> 'CompositeType(org.apache.cassandra.db.marshal.DateType,org.apache.cassandra.db.marshal.UTF8Type)'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'Int32Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and compression_options = {'sstable_compression' : 
> 'org.apache.cassandra.io.compress.SnappyCompressor'};
> ===
> 
> My code to create the sstable is (only the interested parts):
> ===
> sstWriter = new SSTableSimpleUnsortedWriter(new 
> File("c:\\cassandra\\newtables\\"), new RandomPartitioner(), "readtick",
> "bars2", UTF8Type.instance, null, 64);
> 
> 
> CompositeType.Builder cb=new 
> CompositeType.Builder(CompositeType.getInstance(compositeList));
> cb.add( bytes(curMinuteBar.getDatum().getTime()));
> cb.add(bytes(1));
> sstWriter.newRow(cb.build());
> 
> (... add columns...)
> ===
> 
> I highly suspect that the problem is in one of 2 locations:
> - In the SSTableSimpleUnsortedWriter I use UTF8Type.instance as the 
> comparator; I'm not sure that is right with a composite key.
> - When calling "sstWriter.newRow" I use "CompositeType.Builder" to build the 
> composite key; I'm not sure I'm doing this the right way (I did try 
> different combinations).
> 
> Does somebody know how I can continue on my journey?
> 



Re: Why data is not even distributed.

2012-10-08 Thread aaron morton
This is an issue with using the BOP. 

If you are just starting out stick with the Random Partitioner. 
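
For reference, that is the partitioner line in cassandra.yaml (the default
in 1.x, as far as I recall):

partitioner: org.apache.cassandra.dht.RandomPartitioner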

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 10:33 AM, Andrey Ilinykh  wrote:

> It was my first thought.
> Then I md5 uuid and used the digest as a key:
> 
> MessageDigest md = MessageDigest.getInstance("MD5");
> 
> //in the loop
> UUID uuid = UUID.randomUUID();
> byte[] bytes = md.digest(asByteArray(uuid));
> 
> The result is exactly the same: the first node takes 66%, the second 33%,
> and the third one is empty. For some reason, rows which should be placed
> on the third node moved to the first one.
> 
> Address DC  RackStatus State   Load
> Effective-Ownership Token
> 
> 
> Token(bytes[56713727820156410577229101238628035242])
> 127.0.0.1   datacenter1 rack1   Up Normal  7.68 MB
> 33.33%  Token(bytes[00])
> 127.0.0.3   datacenter1 rack1   Up Normal  79.17 KB
> 33.33%
> Token(bytes[0113427455640312821154458202477256070485])
> 127.0.0.2   datacenter1 rack1   Up Normal  3.81 MB
> 33.33%
> Token(bytes[56713727820156410577229101238628035242])
> 
> 
> 
> On Thu, Oct 4, 2012 at 12:33 AM, Tom  wrote:
>> Hi Andrey,
>> 
>> While the data values you generated might follow a true random
>> distribution, your row key, a UUID, does not (because it is created on the
>> same machines by the same software within a certain window of time).
>> 
>> For example, if you were using the UUID class in Java, these would be
>> composed from several components (related to dimensions such as time and
>> version), so you can not expect a random distribution over the whole space.
>> 
>> 
>> Cheers
>> Tom
>> 
>> 
>> 
>> 
>> On Wed, Oct 3, 2012 at 5:39 PM, Andrey Ilinykh  wrote:
>>> 
>>> Hello, everybody!
>>> 
>>> I'm observing very strange behavior. I have 3 node cluster with
>>> ByteOrderPartitioner. (I run 1.1.5)
>>> I created a key space with replication factor of 1.
>>> Then I created one column family and populated it with random data.
>>> I use UUID as a row key, and Integer as a column name.
>>> Row keys were generated as
>>> 
>>> UUID uuid = UUID.randomUUID();
>>> 
>>> I populated about 10 rows with 100 columns each.
>>> 
>>> I would expect equal load on each node, but the result is totally
>>> different. This is what nodetool gives me:
>>> 
>>> Address DC  RackStatus State   Load
>>> Effective-Ownership Token
>>> 
>>> 
>>> Token(bytes[56713727820156410577229101238628035242])
>>> 127.0.0.1   datacenter1 rack1   Up Normal  27.61 MB
>>> 33.33%  Token(bytes[00])
>>> 127.0.0.3   datacenter1 rack1   Up Normal  206.47 KB
>>> 33.33%
>>> Token(bytes[0113427455640312821154458202477256070485])
>>> 127.0.0.2   datacenter1 rack1   Up Normal  13.86 MB
>>> 33.33%
>>> Token(bytes[56713727820156410577229101238628035242])
>>> 
>>> 
>>> one node (127.0.0.3) is almost empty.
>>> Any ideas what is wrong?
>>> 
>>> 
>>> Thank you,
>>>  Andrey
>> 
>> 



Re: Query over secondary indexes

2012-10-08 Thread aaron morton
> get User where user_name = 'Vivek', it is taking ages to retrieve that data. 
> Is there anything I am doing wrong?
> 

How long is ages and how many nodes do you have?
Are there any errors in server logs ?

When you do a get by secondary index at a CL higher than ONE, every RFth node 
is involved. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 10:20 PM, Vivek Mishra  wrote:

> Thanks Rishabh. But i want to search over duplicate columns only.
> 
> -Vivek
> 
> On Fri, Oct 5, 2012 at 2:45 PM, Rishabh Agrawal 
>  wrote:
> Try making user_name a primary key in combination with some other unique 
> column and see if results are improving.
> 
> -Rishabh
> 
> From: Vivek Mishra [mailto:mishra.v...@gmail.com] 
> Sent: Friday, October 05, 2012 2:35 PM
> To: user@cassandra.apache.org
> Subject: Query over secondary indexes
> 
>  
> I have a column family "User" which has an indexed column "user_name". 
> My schema has around 0.1 million records only, and user_name is 
> duplicated across all rows.
> 
> Now when I am trying to retrieve it as:
> 
> get User where user_name = 'Vivek', it is taking ages to retrieve that data. 
> Is there anything I am doing wrong?
> 
> Also, I tried get_indexed_slices via the Thrift API with 
> IndexClause.setCount(1); still no luck, it hangs without returning even a 
> single result. I believe 0.1 million is not a huge amount of data.
> 
> 
> Cassandra version : 1.1.2
> 
> Any idea?
> 
> 
> -Vivek
> 



READ messages dropped

2012-10-08 Thread Tamar Fraenkel
Hi!
In the last 3 days I see many messages of "READ messages dropped in last
5000ms" on one of my 3 nodes cluster.
I see no errors in the log.
There are also messages of "Finished hinted handoff of 0 rows to endpoint"
but I had those for a while now, so I don't know if they are related.
I am running Cassandra 1.0.8 on a 3 node cluster on EC2 m1.large instances.
Rep factor 3 (Quorum read and write)

Does anyone have a clue what I should be looking for, or how to solve it?
Thanks,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956

Re: Question regarding hinted handoffs and restoring backup in cluster

2012-10-08 Thread aaron morton
If you are restoring the backup to get back to a previous point in time, then 
you will want to remove all hints from the cluster. You will also want to stop 
recording them; IIRC the only way to do that is via a yaml config. 
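
For reference, I believe the relevant cassandra.yaml setting is:

hinted_handoff_enabled: false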

If you are restoring the data to recover from some sort of loss, then keeping 
the hints in place is ok. 

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2012, at 12:30 AM, Fredrik  wrote:

> When restoring a backup for the entire cluster, my understanding is that you 
> must shut down the entire cluster, restore the backup, and then start 
> up all nodes again.
> http://www.datastax.com/docs/1.0/operations/backup_restore
> But how should I handle hinted handoffs (the Hints CF)? They're stored in 
> the system keyspace, and according to the docs I only need to restore the 
> specific keyspace, not the system keyspace.
> Won't these hinted handoffs, which aren't based on the backup, be delivered 
> and applied as soon as one of the nodes they're aimed at comes up, and 
> thus be applied to the backed-up data?
> What is the recommended way to handle this situation? Removing the hints CF 
> from the system tables before restarting the cluster nodes?
> 
> Regards
> /Fredrik



Re: rolling restart after gc_grace change

2012-10-08 Thread aaron morton
> Is it still an issue if you don't run a repair within gc_grace_seconds ?
There is a potential issue.

You want to make sure the tombstones are distributed to all replicas *before* 
gc_grace_seconds has expired. If they are not you can have a case where some 
replicas compact and purge their tombstone (essentially a hard delete), while 
one replica keeps the original value. The result is data returning from the 
dead. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2012, at 2:54 AM, Oleg Dulin  wrote:

> What if gc_grace_seconds is pretty low, say 2 mins, what happens with 
> nodetool repair ?
> 
> That wiki page below points at a bug that has been fixed long ago. Is it 
> still an issue if you don't run a repair within gc_grace_seconds ?
> 
> On 2012-01-09 10:02:49 +, aaron morton said:
> 
> Nah, thats old style. 
> 
> gc_grace_seconds is a CF level setting now. Make the change with update 
> column family in the CLI or your favorite client. 
> 
> Cheers
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 9/01/2012, at 9:33 PM, Igor wrote:
> Hi!
> 
> On 
> the http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
>  you can read:
> 
> "To minimize the amount of forgotten deletes, first increase GCGraceSeconds 
> across the cluster (rolling restart required)"
> 
> Rolling restart still required for 1.0.6?
> 
> 
> -- 
> Regards,
> Oleg Dulin
> NYC Java Big Data Engineer
> http://www.olegdulin.com/



Re: question about where clause of CQL update statement

2012-10-08 Thread aaron morton
What is the CF schema ?

>  Is it not possible to include a column in both the set clause and in the 
> where clause? And if it is not possible, how come?

Not sure. 

Looks like you are looking for a conditional update here. You know the row is 
at ID 1 and you only want to update if locked = 'false' ?

Not sure that's supported. But I'm not sure this is the right sort of error. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2012, at 11:30 AM, John Sanda  wrote:

> I am using CQL 3 and trying to execute the following,
> 
> UPDATE CHANGELOGLOCK SET LOCKED = 'true', LOCKEDBY = '10.11.8.242 
> (10.11.8.242)', LOCKGRANTED = '2012-10-05 16:58:01' WHERE ID = 1 AND LOCKED = 
> 'false';
> 
> 
> It gives me the error, Bad Request: PRIMARY KEY part locked found in SET 
> part. The primary key consists only of the ID column, but I do have a 
> secondary index on the locked column. Is it not possible to include a column 
> in both the set clause and in the where clause? And if it is not possible, 
> how come?
> 
> Thanks
> 
> - John



Re: Text searches and free form queries

2012-10-08 Thread aaron morton
>  It works pretty fast.
Cool. 

Just keep an eye out for how big the lucene token row gets. 
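
For anyone following along, the tokenising step can look roughly like this
(a sketch against the Lucene 3.x API; the CF layout is the one Oleg
describes, and the record key here is illustrative):

import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class Tokenize {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
        TokenStream ts = analyzer.tokenStream("body",
                new StringReader("Some text to index"));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
            // Write one column per token into the index CF, named
            // <luceneToken>:<recordKey>, so a slice on the token prefix
            // returns all matching record keys.
            System.out.println(term.toString() + ":" + "recordKey123");
        }
        ts.end();
        ts.close();
    }
}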

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/10/2012, at 2:57 AM, Oleg Dulin  wrote:

> So, what I ended up doing is this --
> 
> As I write my records into the main CF, I tokenize some fields that I want to 
> search on using Lucene and write an index into a separate CF, such that my 
> columns are a composite of:
> 
> luceneToken:record key
> 
> I can then search my records by doing a slice for each lucene token in the 
> search query and then do an intersection of the sets. It works pretty fast.
> 
> Regards,
> Oleg
> 
> On 2012-09-05 01:28:44 +, aaron morton said:
> 
> AFAIK if you want to keep it inside Cassandra then DSE, roll your own from 
> scratch or start with https://github.com/tjake/Solandra . 
> 
> Outside of Cassandra I've heard of people using Elastic Search or Solr which 
> I *think* is now faster at updating the index. 
> 
> Hope that helps. 
> 
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 4/09/2012, at 3:00 AM, Andrey V. Panov  wrote:
> Some one did search on Lucene, but for very fresh data they build search 
> index in memory so data become available for search without delays.
> 
> On 3 September 2012 22:25, Oleg Dulin  wrote:
> Dear Distinguished Colleagues:
> 
> 
> -- 
> Regards,
> Oleg Dulin
> NYC Java Big Data Engineer
> http://www.olegdulin.com/



Re: Why data is not even distributed.

2012-10-08 Thread Andrey Ilinykh
The problem was that I calculated 3 tokens for the random partitioner but
used them with BOP, so the nodes were not supposed to be loaded evenly.
That's OK, I got it.
But what I don't understand is why nodetool ring shows equal ownership.
This is an example:
I created a small cluster with BOP and three tokens:
00



then I put some random data which is nicely distributed:

Address DC  RackStatus State   Load
Effective-Ownership Token

Token(bytes[])
127.0.0.1   datacenter1 rack1   Up Normal  1.92 MB
33.33%  Token(bytes[00])
127.0.0.2   datacenter1 rack1   Up Normal  1.93 MB
33.33%  Token(bytes[])
127.0.0.3   datacenter1 rack1   Up Normal  1.99 MB
33.33%  Token(bytes[])

Then I moved node 2 to 0100 and node 3 to 0200, which
means node 1 owns almost everything.

Address DC  RackStatus State   Load
Effective-Ownership Token

Token(bytes[0200])
127.0.0.1   datacenter1 rack1   Up Normal  5.76 MB
33.33%  Token(bytes[00])
127.0.0.2   datacenter1 rack1   Up Normal  30.37 KB
33.33%  Token(bytes[0100])
127.0.0.3   datacenter1 rack1   Up Normal  25.78 KB
33.33%  Token(bytes[0200])


As you can see, all data is located on node 1. But nodetool ring still
shows 33.33% for each node. No matter how I move nodes, it always
gives me 33.33%.

It looks like a bug to me.

Thank you,
  Andrey


Re: Query over secondary indexes

2012-10-08 Thread Vivek Mishra
It was on 1 node, and there are no errors in the server logs.

-Vivek

On Tue, Oct 9, 2012 at 1:21 AM, aaron morton wrote:

> get User where user_name = 'Vivek', it is taking ages to retrieve that
>> data. Is there anything i am doing wrong?
>>
> How long is ages and how many nodes do you have?
> Are there any errors in server logs ?
>
> When you do a get by secondary index at a CL higher than ONE ever RFth
> node is involved.
>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/10/2012, at 10:20 PM, Vivek Mishra  wrote:
>
> Thanks Rishabh. But I want to search over duplicate columns only.
> [...]


Re: Query over secondary indexes

2012-10-08 Thread Vivek Mishra
I did wait for at least 5 minutes before terminating it. It also sometimes
results in a server crash, though the data volume is not very large.

-Vivek

On Tue, Oct 9, 2012 at 7:05 AM, Vivek Mishra  wrote:

> It was on 1 node and there is no error in server logs.
>
> -Vivek
>
>
> On Tue, Oct 9, 2012 at 1:21 AM, aaron morton wrote:
>
>> [...]


Nodetool repair, exit code/status?

2012-10-08 Thread David Daeschler
Hello.

In the process of trying to streamline and provide better reporting
for various data storage systems, I've realized that although we're
verifying that nodetool repair runs, we're not verifying that it is
successful.

I found a bug relating to the exit code for nodetool repair, where, in
some situations, there is no way to verify the repair has completed
successfully: https://issues.apache.org/jira/browse/CASSANDRA-2666

Is this still a problem? What is the best way to monitor the final
status of the repair command to make sure all is well?


Thank you ahead of time for any info.
- David


Using Composite columns

2012-10-08 Thread Vivek Mishra
Hi,
I am trying to use a compound primary key with Cassandra, and I am
referring to:
http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

I have created a column family as:

CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator, started_at)
   );


then tried:

cqlsh:testcomp> select * from altercations;

(gives me no results, which looks fine).

Then i tried insert statement as:

 INSERT INTO altercations (instigator, started_at, ships_destroyed,
 energy_used, alliance_involvement)
 VALUES ('Jayne Cobb', '7943-07-23', 2, 4.6, 'false');

(Success with this)

Then again i tried:

cqlsh:testcomp> select * from altercations;

it is giving me an error:

[timestamp out of range for platform time_t]


I am able to get that to work by changing '7943-07-23' to '2012-07-23'.

I just wanted to know why Cassandra does not complain about {timestamp
out of range for platform time_t} at the time of persisting it.
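
As far as I can tell, the error comes from cqlsh's client-side date
formatting, not from the server: Cassandra stores the timestamp as a
64-bit millisecond value, so the INSERT succeeds, but a 32-bit time_t
tops out in January 2038 and cannot represent the year 7943. A quick
check of the arithmetic (plain Java, just for illustration):

import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TimeTCheck {
    public static void main(String[] args) throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));

        long millis = fmt.parse("7943-07-23").getTime(); // what gets stored
        long seconds = millis / 1000L;

        // A 32-bit signed time_t tops out at 2^31 - 1 seconds (2038-01-19).
        System.out.println("epoch seconds for 7943-07-23: " + seconds);
        System.out.println("32-bit time_t max:            " + Integer.MAX_VALUE);
        System.out.println("fits in 32-bit time_t?        " + (seconds <= Integer.MAX_VALUE));
    }
}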


-Vivek


RE: Using compound primary key

2012-10-08 Thread Arindam Barua

Did you use the "--cql3" option with the cqlsh command?

From: Vivek Mishra [mailto:mishra.v...@gmail.com]
Sent: Monday, October 08, 2012 7:22 PM
To: user@cassandra.apache.org
Subject: Using compound primary key

Hi,

I am trying to use a compound primary key, and I am referring to:
http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

Following the example there, I tried to create a column family containing a 
compound primary key (one or more columns) as:

 CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at,ships_destroyed)
   );

And I am getting:

**
TSocket read 0 bytes
cqlsh:testcomp>
**


Then the insert and select statements that follow give me these errors:



cqlsh:testcomp>INSERT INTO altercations (instigator, started_at, 
ships_destroyed,
...  energy_used, alliance_involvement)
...  VALUES ('Jayne Cobb', '2012-07-23', 2, 
4.6, 'false');
TSocket read 0 bytes

cqlsh:testcomp> select * from altercations;
Traceback (most recent call last):
  File "bin/cqlsh", line 1008, in perform_statement
self.cursor.execute(statement, decoder=decoder)
  File "bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cursor.py", line 
117, in execute
response = self.handle_cql_execution_errors(doquery, prepared_q, compress)
  File "bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cursor.py", line 
132, in handle_cql_execution_errors
return executor(*args, **kwargs)
  File 
"bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cassandra/Cassandra.py",
 line 1583, in execute_cql_query
self.send_execute_cql_query(query, compression)
  File 
"bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cassandra/Cassandra.py",
 line 1593, in send_execute_cql_query
self._oprot.trans.flush()
  File 
"bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TTransport.py",
 line 293, in flush
self.__trans.write(buf)
  File 
"bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TSocket.py", 
line 117, in write
plus = self.handle.send(buff)
error: [Errno 32] Broken pipe

cqlsh:testcomp>





Any idea? Is it a problem with CQL3 or with Cassandra?

P.S.: I posted the same query on the dev group as well, to get a quick response.


-Vivek


Re: Using compound primary key

2012-10-08 Thread Vivek Mishra
Certainly, as these are available with CQL3 only!
The example mentioned on the DataStax website works fine; the only difference
is that I tried a compound primary key with 3 composite columns in place of 2.

-Vivek

On Tue, Oct 9, 2012 at 7:57 AM, Arindam Barua  wrote:

> Did you use the "--cql3" option with the cqlsh command?
> [...]


Re: Using compound primary key

2012-10-08 Thread Brian O'Neill
Hey Vivek,

The same thing happened to me the other day.  You may be missing a component in 
your compound key.

See this thread:
http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3ccajhhpg20rrcajqjdnf8sf7wnhblo6j+aofksgbxyxwcoocg...@mail.gmail.com%3E

I also wrote a couple blogs on it:
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html
http://brianoneill.blogspot.com/2012/10/cql-astyanax-and-compoundcomposite-keys.html

They've fixed this in the 1.2 beta, whereby it checks (at the thrift layer) to 
ensure you have the requisite number of components in the compound/composite 
key.

-brian


On Oct 8, 2012, at 10:32 PM, Vivek Mishra wrote:

> Certainly, as these are available with CQL3 only!
> The example mentioned on the DataStax website works fine; the only 
> difference is that I tried a compound primary key with 3 composite columns 
> in place of 2.
> [...]

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/



Re: Using compound primary key

2012-10-08 Thread Vivek Mishra
Hi Brian,
Thanks for these references. They will surely help, as I am on my way to
integrating this within Kundera.

Surprisingly, the column family itself was not created with the example I
was trying.
Thanks again,
-Vivek

On Tue, Oct 9, 2012 at 8:33 AM, Brian O'Neill  wrote:

> Hey Vivek,
>
> The same thing happened to me the other day.  You may be missing a
> component in your compound key.
> [...]

Re: Using compound primary key

2012-10-08 Thread Vivek Mishra
Ok, I am able to understand the problem now. The issue is:

If I create a column family altercations as:
CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at,ships_destroyed)
   );

   INSERT INTO altercations (instigator, started_at, ships_destroyed,
 energy_used, alliance_involvement)
 VALUES ('Jayne Cobb', '2012-07-23', 2, 4.6, 'false');

it works!

But if I create a column family with a compound primary key with 2 composite
columns as:

CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at)
   );

and then drop this column family:

drop columnfamily altercations;

and then try to create the same one with a compound primary key with 3
composite columns:

CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at,ships_destroyed)
   );

it gives me error: "TSocket read 0 bytes"

After that, since no column family is created, nothing further works.

Is this an issue?

-Vivek

On Tue, Oct 9, 2012 at 8:42 AM, Vivek Mishra  wrote:

> Hi Brian,
> Thanks for these references. They will surely help, as I am on my way to
> integrating this within Kundera.
> [...]