not enough bytes to read value of component 0

2013-10-03 Thread Anseh Danesh
Hi all. I have a column family in Cassandra and I want to use it in a MapReduce program. My column family contains: stationinfo(stationid int, cityid int, name varchar, location varchar, supervisor varchar, provinceid int, code int, date varchar, time varchar, temprature float, humidity float, pre

Re: What is the best way to install & upgrade Cassandra on Ubuntu ?

2013-10-03 Thread Romain HARDOUIN
> Don't know why it doesn't include just the preferred Oracle JRE
Probably due to licence concerns.
Romain

Re: What is the best way to install & upgrade Cassandra on Ubuntu ?

2013-10-03 Thread Ertio Lew
Thanks for the clarifications! By the way, DSC installs OpenJDK when Java is not present on your system. I don't know why it doesn't just include the preferred Oracle JRE installation and take care of later updates to it as well, so that could be a reason to choose DSC over the official Apache Debian (as that would b

Re: DataStax driver with Scala/Akka

2013-10-03 Thread Richard Rodseth
Thanks very much. Your pom works for me too, so that gives me a good reference point.
On Thu, Oct 3, 2013 at 11:30 AM, Giancarlo Silvestrin wrote:
> I created a sample pom.xml that successfully compiles cassandra.scala
> using maven 3, it might be useful to compare with your own:
> https://gist.g

Re: Minimum row size / minimum data point size

2013-10-03 Thread Andrey Ilinykh
It may help.
https://docs.google.com/spreadsheet/ccc?key=0Atatq_AL3AJwdElwYVhTRk9KZF9WVmtDTDVhY0xPSmc#gid=0
On Thu, Oct 3, 2013 at 1:31 PM, Robert Važan wrote:
> I need to store one trillion data points. The data is highly compressible
> down to 1 byte per data point using simple custom compres

Re: What is the best way to install & upgrade Cassandra on Ubuntu ?

2013-10-03 Thread Daniel Chia
Opscenter is a separate package:
http://www.datastax.com/documentation/opscenter/3.2/webhelp/index.html?pagename=docs&version=opscenter&file=index#opsc/install/opscInstallDeb_t.html
Thanks,
Daniel
On Tue, Oct 1, 2013 at 8:11 PM, Aaron Morton wrote:
> Does DSC include other things like Opscenter

Minimum row size / minimum data point size

2013-10-03 Thread Robert Važan
I need to store one trillion data points. The data is highly compressible down to 1 byte per data point using simple custom compression combined with standard dictionary compression. What's the most space-efficient way to store the data in Cassandra? How much per-row overhead is there if I store on
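A rough back-of-envelope sketch of the trade-off this question asks about, in Python. It only shows how a fixed per-row cost scales with the number of points packed into each row. The function name `estimate_storage_bytes` and the 50-byte `row_overhead_bytes` value are illustrative assumptions, not measured Cassandra figures; the real overhead depends on the version, schema, and compression settings and would need to be measured.

```python
def estimate_storage_bytes(total_points, bytes_per_point, points_per_row, row_overhead_bytes):
    """Estimate total size when data points are packed into rows.

    row_overhead_bytes is a hypothetical placeholder, not a Cassandra constant.
    """
    if points_per_row <= 0:
        raise ValueError("points_per_row must be positive")
    # Ceiling division: a partially filled last row still pays the row overhead.
    rows = -(-total_points // points_per_row)
    payload = total_points * bytes_per_point
    overhead = rows * row_overhead_bytes
    return payload + overhead


if __name__ == "__main__":
    total = 10**12  # one trillion data points, 1 byte each after compression
    for points_per_row in (1, 1_000, 1_000_000):
        size = estimate_storage_bytes(total, 1, points_per_row, row_overhead_bytes=50)
        print(f"{points_per_row:>9} points/row: {size / 1e12:.3f} TB")
```

With any fixed per-row cost, storing one point per row lets the overhead dominate the 1-byte payload, while packing many points per row drives the total back toward the payload size.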

Re: DataStax driver with Scala/Akka

2013-10-03 Thread Giancarlo Silvestrin
I created a sample pom.xml that successfully compiles cassandra.scala using maven 3, it might be useful to compare with your own:
https://gist.github.com/gsilvestrin/6814624
On Thu, Oct 3, 2013 at 2:16 PM, Richard Rodseth wrote:
> Thanks for the offer. I wouldn't be able to share the whole pom,

Re: DataStax driver with Scala/Akka

2013-10-03 Thread Richard Rodseth
Thanks for the offer. I wouldn't be able to share the whole pom, and this task has been de-prioritized, but if I can find the time I will try to create a simpler test case. I just tried adding the exclusion to the pom dependency, but it didn't make a difference. sbt: "com.datastax.cassandra"

Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-10-03 Thread Paulo Motta
This is the log after enabling TRACE on org.apache.cassandra.net.OutboundTcpConnection:
DEBUG [WRITE-/54.215.70.YY] 2013-10-03 18:01:50,237 OutboundTcpConnection.java (line 338) Target max version is -2147483648; no version information yet, will retry
TRACE [HANDSHAKE-/10.177.14.XX] 2013-10-03 18:

Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-10-03 Thread Paulo Motta
Hello,
During a rolling upgrade between 1.1.10 and 1.2.10, the newly upgraded nodes keep showing the following log message:
INFO [HANDSHAKE-/10.176.249.XX] 2013-10-03 17:36:16,948 OutboundTcpConnection.java (line 399) Handshaking version with /10.176.249.XX
INFO [HANDSHAKE-/10.176.182.YY] 2013-1

Re: DataStax driver with Scala/Akka

2013-10-03 Thread Giancarlo Silvestrin
Richard,
I'm using akka + cassandra as well. I copied cassandra.scala to my local project and it compiled fine using SBT. If you can share your pom.xml I can try to help you.
-- Giancarlo
On Thu, Oct 3, 2013 at 12:14 PM, Richard Rodseth wrote:
> I wanted to try the async Cassandra driver from

Re: PendingTasks: What does it mean inside Cassandra?

2013-10-03 Thread Robert Coli
On Thu, Oct 3, 2013 at 8:28 AM, Tyler Hobbs wrote:
> On Thu, Oct 3, 2013 at 8:37 AM, Girish Kumar wrote:
>> Additional capacity meaning add more writers and reader threads to handle
>> the requests ?
> I think he means adding more nodes.
The Doc I am quoting almost certainly means add

cassandra mapreduce column is nul

2013-10-03 Thread Anseh Danesh
Hi all. I am pretty new to Cassandra. I wrote a MapReduce program that reads data from my Cassandra column family. My column value is a date, but I imported it into Cassandra as varchar. When I specify my date column (or any other column) as the source column to read the dates as map function and assign i

DataStax driver with Scala/Akka

2013-10-03 Thread Richard Rodseth
I wanted to try the async Cassandra driver from DataStax, in a Scala/Akka app, so I took a look at the Akka Cassandra Activator template.
https://github.com/eigengo/activator-akka-cassandra
I copied cassandra.scala from the template (it contains a conversion from ResultSetFuture to a Scala future

Re: PendingTasks: What does it mean inside Cassandra?

2013-10-03 Thread Tyler Hobbs
On Thu, Oct 3, 2013 at 8:37 AM, Girish Kumar wrote:
> Additional capacity meaning add more writers and reader threads to handle
> the requests ?
I think he means adding more nodes.
-- Tyler Hobbs DataStax

Re: Unable to bootstrap new node

2013-10-03 Thread Keith Wright
Thanks for the response. We are still having issues bootstrapping a node. Quick background on where we are at (1.2.8 with Vnodes):
* We had a node start to complain about corrupted SSTables which we tried to delete one by one but it quickly became a whack-a-mole problem so we decided we w

Re: PendingTasks: What does it mean inside Cassandra?

2013-10-03 Thread Girish Kumar
>> Watching trends on these pools for increases in the pending tasks
>> column is an excellent indicator of the need to add additional capacity.
Additional capacity meaning add more writers and reader threads to handle the requests ?
On Tue, Oct 1, 2013 at 12:22 PM, Robert Coli wrote:
> On Wed,

Re: Cassandra Heap Size for data more than 1 TB

2013-10-03 Thread Michał Michalski
Currently we have 480-520 GB of data per node, so it's not even close to 1TB, but I'd bet that reaching 700-800GB shouldn't be a problem in terms of "everyday performance" - heap space is quite low, no GC issues etc. (to give you a comparison: when working on 1.1 and having ~300-400GB per node

Re: Cassandra Heap Size for data more than 1 TB

2013-10-03 Thread srmore
Thanks Mohit and Michael, That's what I thought. I have tried all the avenues, will give ParNew a try. With the 1.0.xx I have issues when data sizes go up, hopefully that will not be the case with 1.2. Just curious, has anyone tried 1.2 with large data set, around 1 TB ? Thanks ! On Thu, Oct 3

Re: Cassandra Heap Size for data more than 1 TB

2013-10-03 Thread Michał Michalski
I was experimenting with 128 vs. 512 some time ago and I was unable to see any difference in terms of performance. I'd probably check 1024 too, but we migrated to 1.2 and heap space was not an issue anymore.
M.
On 02.10.2013 16:32, srmore wrote:
I changed my index_interval from 128 to ind
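For readers comparing the index_interval values mentioned in this thread (128, 512, 1024), here is a small Python sketch of why a larger interval reduces heap use. It assumes that index_interval means one row key is sampled into memory per that many keys, as the setting is described for the 1.x line. The function name `sampled_index_entries` and the one-billion-key figure are hypothetical, and the sketch says nothing about the bytes used per sampled entry.

```python
def sampled_index_entries(row_keys, index_interval):
    """Approximate number of row keys held in the in-memory index sample.

    Assumes one key is sampled per `index_interval` keys (an assumption
    about the setting's meaning, labeled as such).
    """
    if index_interval <= 0:
        raise ValueError("index_interval must be positive")
    # Ceiling division using integers only.
    return -(-row_keys // index_interval)


if __name__ == "__main__":
    row_keys = 1_000_000_000  # hypothetical number of row keys on one node
    baseline = sampled_index_entries(row_keys, 128)
    for interval in (128, 512, 1024):
        entries = sampled_index_entries(row_keys, interval)
        print(f"index_interval={interval}: {entries} sampled keys "
              f"({entries / baseline:.3f} of the 128 baseline)")
```

Under that assumption, moving from 128 to 512 keeps about a quarter as many sampled keys in memory, at the cost of scanning more index entries per lookup.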