Re: Strange delay in query

2012-11-11 Thread André Cruz
On Nov 11, 2012, at 12:01 AM, Binh Nguyen  wrote:

> FYI: Repair does not remove tombstones. To remove tombstones you need to run 
> compaction.
> If you have a lot of data then make sure you run compaction on all nodes 
> before running repair. We had big trouble with our system regarding 
> tombstones and it took us a long time to figure out the reason. It turned 
> out that the repair process also transfers TTLed data (when compaction has 
> not been triggered yet) to the other nodes, even if that data was already 
> removed from the other nodes by an earlier compaction.
> 

Aren't compactions triggered automatically? At least minor compactions. Also, I 
read this in 
http://www.datastax.com/docs/1.1/operations/tuning#tuning-compaction :

"After running a major compaction, automatic minor compactions are no longer 
triggered, frequently requiring you to manually run major compactions on a 
routine basis."
"DataStax does not recommend major compaction."

So I'm unsure whether to start triggering these compactions manually… I guess 
I'll have to experiment with it.

Thanks!

André

Re: [BETA RELEASE] Apache Cassandra 1.2.0-beta2 released

2012-11-11 Thread Sylvain Lebresne
Actually, if we're going to be precise, it's -2^63 + 1 to 2^63 - 1.
Long.MIN_VALUE is not a valid token for technical reasons.

Do note that you still cannot change the partitioner of an existing cluster,
so the new partitioner is for new clusters only. That does mean that for
existing clusters you'll have to make sure you keep RandomPartitioner in the
yaml when you "merge" the old and new yaml. It is mentioned in the NEWS file,
however (which I do encourage you to read, at least for new major versions).
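
For anyone regenerating initial tokens in scripts, here is a minimal sketch in
Python of evenly spaced tokens for both partitioners. It assumes the
Murmur3Partitioner bounds discussed in this thread, skipping Long.MIN_VALUE,
and a RandomPartitioner range of 0 to 2^127, which is not stated in this
thread. The function names are made up for illustration.

```python
# Sketch: evenly spaced initial tokens for an N-node ring.
# The function names are hypothetical, not part of any Cassandra tool.

MURMUR3_MIN = -(2 ** 63) + 1   # Long.MIN_VALUE itself is skipped (see thread)
MURMUR3_MAX = 2 ** 63 - 1      # Long.MAX_VALUE
RANDOM_MAX = 2 ** 127          # assumed upper bound for RandomPartitioner


def murmur3_tokens(node_count):
    """Spread node_count tokens evenly over the signed 64-bit range."""
    span = MURMUR3_MAX - MURMUR3_MIN + 1
    return [MURMUR3_MIN + (i * span) // node_count for i in range(node_count)]


def random_partitioner_tokens(node_count):
    """Spread node_count tokens evenly over 0 .. 2^127 (old default)."""
    return [(i * RANDOM_MAX) // node_count for i in range(node_count)]


if __name__ == "__main__":
    for token in murmur3_tokens(4):
        print(token)
```

The point of the sketch is only that the starting offset differs: the old
scheme starts at 0, the new one starts near the bottom of the negative range.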

--
Sylvain


On Sun, Nov 11, 2012 at 4:51 AM, Tyler Hobbs  wrote:

> The Murmur3Partitioner range is from -2^63 to 2^63 - 1 (Java's
> Long.MIN_VALUE to Long.MAX_VALUE).
>
>
> On Sat, Nov 10, 2012 at 8:59 PM, Brian O'Neill wrote:
>
>>
>> Wow...good catch.
>>
>> We had puppet scripts which automatically assigned the proper tokens
>> given the cluster size.
>> What is the range now?  Got a link?
>>
>> -brian
>>
>> On Nov 10, 2012, at 9:27 PM, Edward Capriolo wrote:
>>
>> Just a note for all: the default partitioner is no longer
>> RandomPartitioner. It is now Murmur3Partitioner, and the token range starts
>> in negative numbers. So you don't choose tokens like your father taught you
>> anymore.
>>
>> On Friday, November 9, 2012, Sylvain Lebresne 
>> wrote:
>> > The Cassandra team is pleased to announce the release of the second beta
>> > for the future Apache Cassandra 1.2.0.
>> > Let me first stress that this is beta software and as such is *not* ready
>> > for production use.
>> > This release is still beta so is likely not bug free. However, lots have
>> > been fixed since beta1 and if everything goes right, we are hopeful that a
>> > first release candidate may follow shortly. Please do help testing this
>> > beta to help make that happen. If you encounter any problem during your
>> > testing, please report[3,4] them. And be sure to take a look at the change
>> > log[1] and the release notes[2] to see where Cassandra 1.2 differs from
>> > the previous series.
>> > Apache Cassandra 1.2.0-beta2[5] is available as usual from the cassandra
>> > website (http://cassandra.apache.org/download/) and a debian package is
>> > available using the 12x branch (see
>> > http://wiki.apache.org/cassandra/DebianPackaging).
>> > Thank you for your help in testing and have fun with it.
>> > [1]: http://goo.gl/wnDAV (CHANGES.txt)
>> > [2]: http://goo.gl/CBsqs (NEWS.txt)
>> > [3]: https://issues.apache.org/jira/browse/CASSANDRA
>> > [4]: user@cassandra.apache.org
>> > [5]: http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/cassandra-1.2.0-beta2
>> >
>>
>>
>> --
>> Brian ONeill
>> Lead Architect, Health Market Science (http://healthmarketscience.com)
>> mobile:215.588.6024
>> blog: http://weblogs.java.net/blog/boneill42/
>> blog: http://brianoneill.blogspot.com/
>>
>>
>
>
> --
> Tyler Hobbs
> DataStax 
>
>


Re: leveled compaction and tombstoned data

2012-11-11 Thread Sylvain Lebresne
On Sat, Nov 10, 2012 at 7:17 PM, Edward Capriolo wrote:

> No it does not exist. Rob and I might start a donation page and give
> the money to whoever is willing to code it. If someone would write a
> tool that would split an sstable into 4 smaller sstables (even an
> offline command line tool)


Something like that:
https://github.com/pcmanus/cassandra/commits/sstable_split (adds an
sstablesplit offline tool)


> I would paypal them a hundo.
>

Just tell me how you want to proceed :)

--
Sylvain


>
> On Sat, Nov 10, 2012 at 1:10 PM, Aaron Turner 
> wrote:
> > Nope.  I think at least once a week I hear someone suggest one way to
> solve
> > their problem is to "write an sstablesplit tool".
> >
> > I'm pretty sure that:
> >
> > Step 1. Write sstablesplit
> > Step 2. ???
> > Step 3. Profit!
> >
> >
> >
> > On Sat, Nov 10, 2012 at 9:40 AM, Alain RODRIGUEZ 
> wrote:
> >>
> >> @Rob Coli
> >>
> >> Does the "sstablesplit" function exist somewhere?
> >>
> >>
> >>
> >> 2012/11/10 Jim Cistaro 
> >>>
> >>> For some of our clusters, we have taken the periodic major compaction
> >>> route.
> >>>
> >>> There are a few things to consider:
> >>> 1) Once you start major compacting, depending on data size, you may be
> >>> committed to doing it periodically because you create one big file that
> >>> will take forever to naturally compact against 3 like-sized files.
> >>> 2) If you rely heavily on file cache (rather than large row caches),
> >>> each major compaction effectively invalidates the entire file cache
> >>> because everything is written to one new large file.
> >>>
> >>> --
> >>> Jim Cistaro
> >>>
> >>> On 11/9/12 11:27 AM, "Rob Coli"  wrote:
> >>>
> >>> >On Thu, Nov 8, 2012 at 10:12 AM, B. Todd Burruss 
> >>> > wrote:
> >>> >> my question is would leveled compaction help to get rid of the
> >>> >>tombstoned
> >>> >> data faster than size tiered, and therefore reduce the disk space
> >>> >> usage?
> >>> >
> >>> >You could also...
> >>> >
> >>> >1) run a major compaction
> >>> >2) code up sstablesplit
> >>> >3) profit!
> >>> >
> >>> >This method incurs a management penalty if not automated, but is
> >>> >otherwise the most efficient way to deal with tombstones and obsolete
> >>> >data.. :D
> >>> >
> >>> >=Rob
> >>> >
> >>> >--
> >>> >=Robert Coli
> >>> >AIM>ALK - rc...@palominodb.com
> >>> >YAHOO - rcoli.palominob
> >>> >SKYPE - rcoli_palominodb
> >>> >
> >>>
> >>
> >
> >
> >
> > --
> > Aaron Turner
> > http://synfin.net/ Twitter: @synfinatic
> > http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
> > Windows
> > Those who would give up essential Liberty, to purchase a little temporary
> > Safety, deserve neither Liberty nor Safety.
> > -- Benjamin Franklin
> > "carpe diem quam minimum credula postero"
> >
>
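
Jim Cistaro's first point in the quoted thread above (one big file waiting for
like-sized files) can be illustrated with a rough sketch of size-tiered
bucketing. This is a simplified model, not Cassandra's code: the 0.5x/1.5x
bucket bounds and the minimum of 4 files per compaction are assumptions here,
and the function names are hypothetical.

```python
# Simplified model of size-tiered bucketing (not the real implementation).
# Assumed parameters: a file joins a bucket if its size is within 0.5x-1.5x
# of the bucket average; a bucket compacts once it holds 4 files.


def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group sstable sizes into buckets of similar size."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if low * average <= size <= high * average:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compactable(buckets, min_threshold=4):
    """Return only the buckets that hold enough files to be compacted."""
    return [bucket for bucket in buckets if len(bucket) >= min_threshold]


if __name__ == "__main__":
    # Four small flushed files plus one huge file left by a major compaction.
    buckets = bucket_by_size([10, 11, 9, 10, 4000])
    print(buckets)
    print(compactable(buckets))
```

In this model the huge file sits alone in its bucket, so the tombstones inside
it are not rewritten until enough files of comparable size accumulate, which
is why splitting it back into smaller files keeps coming up in this thread.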


Re: CREATE COLUMNFAMILY

2012-11-11 Thread Edward Capriolo
If you supply metadata, Cassandra can use it for several things.

1) It validates data on insertion
2) Helps display the information in human readable formats in tools
like the CLI and sstable2json
3) If you add a built-in secondary index the type information is
needed, since strings sort differently than integers
4) Columns in rows are sorted by the column name, and strings sort
differently than integers
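
The sorting point in items 3 and 4 is easy to see outside Cassandra. A small
Python sketch, with made-up column names, comparing text order and integer
order for the same values:

```python
# The same column names sort differently as text and as integers.
# The sample names are invented for illustration.


def sort_as_text(names):
    """Order names the way a text comparator would (character by character)."""
    return sorted(names)


def sort_as_integers(names):
    """Order names by their numeric value."""
    return sorted(names, key=int)


if __name__ == "__main__":
    names = ["9", "10", "100"]
    print(sort_as_text(names))
    print(sort_as_integers(names))
```

As text, "10" and "100" come before "9" because comparison goes character by
character; as integers the order is 9, 10, 100.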

On Sat, Nov 10, 2012 at 11:55 PM, Kevin Burton  wrote:
> I am sure this has been asked before but what is the purpose of entering
> key/value or more correctly key name/data type values on the CREATE
> COLUMNFAMILY command.
>
>
>
>


failed to create keyspace 1.2-beta2 bin protocol

2012-11-11 Thread Pierre Chalamet
Hi all,

 

I'm trying to create a keyspace through the CQL Binary Protocol but lamely
failed at doing this simple task with 1.2b2/cql3.

 

Various commands tried:

 

DEBUG 18:08:23,240 Received: OPTIONS

DEBUG 18:08:23,240 Responding: SUPPORTED {CQL_VERSION=[3.0.0],
COMPRESSION=[snappy]}

DEBUG 18:08:23,254 Received: STARTUP {CQL_VERSION=3.0.0}

DEBUG 18:08:23,254 Responding: READY

DEBUG 18:08:23,269 Received: QUERY CREATE KEYSPACE Excelsior WITH
strategy_class='SimpleStrategy' AND strategy_options:replication_factor = 1

DEBUG 18:08:23,269 request complete

DEBUG 18:08:23,270 Responding: ERROR SYNTAX_ERROR: line 1:83 mismatched
input ':' expecting '='

DEBUG 18:08:23,303 Received: QUERY CREATE KEYSPACE Excelsior WITH
strategy_class='SimpleStrategy' AND strategy_options={replication_factor:1}

DEBUG 18:08:23,303 request complete

DEBUG 18:08:23,303 Responding: ERROR SYNTAX_ERROR: line 1:85 mismatched
input 'replication_factor' expecting set null

DEBUG 18:08:23,330 Received: QUERY CREATE KEYSPACE Excelsior WITH
placement_strategy='SimpleStrategy' AND strategy_options:replication_factor
= 1

DEBUG 18:08:23,330 request complete

DEBUG 18:08:23,330 Responding: ERROR SYNTAX_ERROR: line 1:87 mismatched
input ':' expecting '='

DEBUG 18:08:23,357 Received: QUERY CREATE KEYSPACE Excelsior WITH
placement_strategy='SimpleStrategy' AND
strategy_options={replication_factor:1}

DEBUG 18:08:23,357 request complete

DEBUG 18:08:23,357 Responding: ERROR SYNTAX_ERROR: line 1:89 mismatched
input 'replication_factor' expecting set null

 

But nothing seems to work as expected.

Could someone show me how to create a keyspace under CQL 3 using the binary
protocol (I don't know if the binary protocol changes anything here)?

 

Thanks a lot,

- Pierre

 



In README.txt: CQL3 instead of CQL2?

2012-11-11 Thread Jean-Armel Luce
Hello,

I have installed 1.2 beta2 (downloaded the source and compiled it).

The CREATE SCHEMA fails if I do as explained in README.txt :

bin/cqlsh --cql3    <== --cql3 is the default in 1.2, so it is not needed
Connected to Test Cluster at localhost:9160.
[cqlsh 2.3.0 | Cassandra 1.2.0-beta2-SNAPSHOT | CQL spec 3.0.0 | Thrift
protocol 19.35.0]
Use HELP for help.
cqlsh> create keyspace jaltest WITH strategy_class = 'SimpleStrategy' AND
strategy_options:replication_factor='1';
Bad Request: line 1:82 mismatched input ':' expecting '='

If I give the CREATE SCHEMA in cql3, it works :-)
cqlsh> create keyspace jaltest with replication ={'class':
'SimpleStrategy', 'replication_factor': '1'};
cqlsh> describe keyspace jaltest;

CREATE KEYSPACE jaltest WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '1'
};

cqlsh>

It looks like the syntax of CREATE SCHEMA in the README is CQL2, while
the cqlsh connection shown there uses CQL3.
From my point of view, it would be friendlier to write the CREATE
SCHEMA command in the README.txt using the CQL3 syntax rather than the CQL2
syntax.
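
For client code that has to build this statement, a minimal Python sketch that
produces the CQL3 form shown to work above. The helper name and its parameters
are hypothetical; only the resulting statement text comes from this thread.

```python
# Builds the CQL3-style CREATE KEYSPACE statement that worked in cqlsh above.
# create_keyspace_cql is a made-up helper, not part of any driver.


def create_keyspace_cql(name, strategy="SimpleStrategy", replication_factor=1):
    """Return a CQL3 CREATE KEYSPACE statement using the replication map."""
    options = {
        "class": strategy,
        "replication_factor": str(replication_factor),
    }
    body = ", ".join("'%s': '%s'" % (key, value) for key, value in options.items())
    return "CREATE KEYSPACE %s WITH replication = {%s};" % (name, body)


if __name__ == "__main__":
    print(create_keyspace_cql("jaltest"))
```

The older strategy_class / strategy_options:replication_factor form is what
the parser rejects with "mismatched input ':' expecting '='".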

Best regards.

Jean Armel


RE: failed to create keyspace 1.2-beta2 bin protocol

2012-11-11 Thread Pierre Chalamet
Forget about it, I should have read
https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile instead
of the source snapshot (not merged in the 1.2-b2 snapshot?)

 

- Pierre

 


 



Re: Read during digest mismatch

2012-11-11 Thread Jonathan Ellis
Correct.  Which is one reason there is a separate setting for
cross-datacenter read repair, by the way.

On Thu, Nov 8, 2012 at 4:43 PM, sankalp kohli  wrote:
> Hi,
> Let's say I am reading with consistency TWO and my replication factor is 3.
> The read is eligible for global read repair. It will send a request to get
> data from one node and a digest request to the other two.
> If there is a digest mismatch, from my reading of the code it looks like it
> will get the data from all three nodes and resolve the data before
> returning to the client.
>
> Is that correct, or am I reading the code wrong?
>
> Also, if this is correct, it looks like if the third node is in another DC,
> the read will slow down even though the consistency was TWO?
>
> Thanks,
> Sankalp
>
>
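
The read path described in the quoted question can be sketched as a toy
simulation in Python. This models only what the question states (one data
request, digest requests to the rest, full reads and a resolve on mismatch).
The replica representation, the MD5 digest and the function names are
assumptions for illustration, not Cassandra's actual code.

```python
import hashlib

# Toy model: each replica is a dict mapping key -> (timestamp, value).


def digest(column):
    """Hash a column so replicas can be compared without shipping the data."""
    return hashlib.md5(repr(column).encode("utf-8")).hexdigest()


def coordinator_read(replicas, key):
    """Return (column, full_read_needed) for a read across all replicas."""
    data_replica, digest_replicas = replicas[0], replicas[1:]
    data = data_replica.get(key)
    expected = digest(data)
    if all(digest(replica.get(key)) == expected for replica in digest_replicas):
        return data, False
    # Digest mismatch: fetch the full data from every replica and keep the
    # version with the highest timestamp.
    versions = [replica[key] for replica in replicas if key in replica]
    newest = max(versions, key=lambda column: column[0]) if versions else None
    return newest, True


if __name__ == "__main__":
    stale = {"k": (1, "old")}
    fresh = {"k": (2, "new")}
    print(coordinator_read([stale, fresh, stale], "k"))
```

In this model the second round has to hear from every replica, including a
remote one, which is the slowdown the question asks about.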



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


RE: CREATE COLUMNFAMILY

2012-11-11 Thread Kevin Burton
Thank you, this helps with my understanding.

So the goal here is to supply as many name/type pairs as can reasonably
be foreseen when the column family is created? Can the metadata be applied
after the fact? If so, how?




Re: Hinted Handoff runs every ten minutes

2012-11-11 Thread Jonathan Ellis
How many hint sstables are there?  What does sstable2json show?

On Thu, Nov 8, 2012 at 3:23 PM, Mike Heffner  wrote:
> Is there a ticket open for this for 1.1.6?
>
> We also noticed this after upgrading from 1.1.3 to 1.1.6. Every node runs a
> 0 row hinted handoff every 10 minutes. N-1 nodes hint to the same node,
> while that node hints to another node.
>
>
> On Tue, Oct 30, 2012 at 1:35 PM, Vegard Berget  wrote:
>>
>> Hi,
>>
>> I have the exact same problem with 1.1.6.  HintsColumnFamily consists of
>> one row (Rowkey 00, nothing more).   The "problem" started after upgrading
>> from 1.1.4 to 1.1.6.  Every ten minutes HintedHandoffManager starts and
>> finishes  after sending "0 rows".
>>
>> .vegard,
>>
>>
>>
>> - Original Message -
>> From:
>> user@cassandra.apache.org
>>
>> To:
>> 
>> Cc:
>>
>> Sent:
>> Mon, 29 Oct 2012 23:45:30 +0100
>>
>> Subject:
>> Re: Hinted Handoff runs every ten minutes
>>
>>
>> Dne 29.10.2012 23:24, Stephen Pierce napsal(a):
>> > I'm running 1.1.5; the bug says it's fixed in 1.0.9/1.1.0.
>> >
>> > How can I check to see why it keeps running HintedHandoff?
>> You have a tombstone in system.HintsColumnFamily. Use the list command in
>> cassandra-cli to check.
>>
>
>
>
> --
>
>   Mike Heffner 
>   Librato, Inc.
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: leveled compaction and tombstoned data

2012-11-11 Thread Radim Kolar


> I would be careful with the patch that was referred to above, it 
> hasn't been reviewed, and from a glance it appears that it will cause 
> an infinite compaction loop if you get more than 4 SSTables at max size.

It will; you need to set the max sstable size correctly.


RE: Connecting to cassandra.

2012-11-11 Thread Wz1975
For your testing, I think putting your machine's IP there should work. 


Thanks.
-Wei

Sent from my Samsung smartphone on AT&T

 Original message 
Subject: RE: Connecting to cassandra. 
From: Kevin Burton  
To: user@cassandra.apache.org 
CC:  

Thank you. In the output.log I see the line:

 

INFO 13:36:59,110 This node will not auto bootstrap because it is configured to 
be a seed node.

 

Apparently I changed too much in the cassandra.yaml file. What should the 
‘seed’ entry be? From the comments it is a comma separated list of IP 
addresses.  Should I just comment this entry out? The comment is made that 
0.0.0.0 is never correct.

 

# any class that implements the SeedProvider interface and has a
# constructor that takes a Map of parameters will do.
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: ",,"
          - seeds: "172.16.35.108"

 

What should I put? Do you think this is why it is not starting up?

 

 

From: Wz1975 [mailto:wz1...@yahoo.com] 
Sent: Saturday, November 10, 2012 9:07 PM
To: user@cassandra.apache.org
Subject: RE: Connecting to cassandra.
Importance: Low

 

The first thing to check is the log files under /var/log/cassandra; they 
should give you some hint. 


Thanks.
-Wei

Sent from my Samsung smartphone on AT&T 


 Original message 
Subject: Connecting to cassandra. 
From: Kevin Burton  
To: user@cassandra.apache.org 
CC: 


I have installed Cassandra on a Ubuntu Server but I fail to see it with either:

 

ps ax

 

or

 

netstat -an | grep 9160

 

I see a file /etc/init.d/cassandra so I am assuming that it should start up. 
What else do I need to do? I have edited cassandra.yaml in all the places that 
specify localhost or 127.0.0.1 and changed them to the IP address of 
the machine/server where it is running. I am assuming that I have hit all the 
right configuration points. Ideas?
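
The netstat check above can also be done from a script. A minimal Python
sketch that tries to open a TCP connection to the Thrift port mentioned in
this message (9160); the helper name is made up, and the host is whatever
address the node was configured to use.

```python
import socket

# Hypothetical helper: True if something accepts TCP connections on host:port.


def is_port_open(host, port, timeout=1.0):
    """Try to connect; a refused or timed-out connection counts as closed."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9160 is the port grepped for above; replace 127.0.0.1 with the
    # address the node is configured to use.
    print(is_port_open("127.0.0.1", 9160))
```

If this prints False while the process is running, the node is probably
listening on a different address than the one being tested.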

 

Thank you.

 

Kevin

RE: Connecting to cassandra.

2012-11-11 Thread Kevin Burton
I finally got it to work by putting “127.0.0.1” in the list of seed 
IPs. Any other address triggered the warning that Cassandra would not be 
auto-booted since it was listed as a seed machine and the init process never 
happened. I am not sure of the impact of setting it to this loopback address.

 




Re: Connecting to cassandra.

2012-11-11 Thread aaron morton
> warning that Cassandra would not be auto-booted since it was listed as a seed 
> machine and the init process never happened.
The warning is that it won't join the ring and try to request data from other 
servers in the cluster. 

If you are testing a single node it's not an issue. 

>  I am not sure of the impact of setting it to this loopback address.
Fine for testing purposes with one node. 
If you have more than one node you will want to make it the same as the 
listen_address.
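
As a rough illustration of that advice, here is a small Python sketch that
flags seed lists that only make sense on a single test node. The function
name and the exact rules are assumptions drawn from this thread (loopback is
fine for one node, 0.0.0.0 is never correct); it only handles IP literals,
not hostnames.

```python
import ipaddress

# Hypothetical sanity check for the "seeds" string from cassandra.yaml.


def check_seeds(seeds, node_count):
    """Return a list of warnings for a comma-delimited seed list."""
    warnings = []
    for seed in [part.strip() for part in seeds.split(",") if part.strip()]:
        address = ipaddress.ip_address(seed)  # raises ValueError on hostnames
        if address.is_unspecified:
            warnings.append("%s is never a usable seed address" % seed)
        elif address.is_loopback and node_count > 1:
            warnings.append("%s only works for a single-node test" % seed)
    return warnings


if __name__ == "__main__":
    print(check_seeds("127.0.0.1", node_count=1))
    print(check_seeds("127.0.0.1", node_count=3))
```

With one node the loopback seed passes; with three nodes it is flagged,
because the other machines cannot reach 127.0.0.1 on the seed host.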

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/11/2012, at 10:29 AM, Kevin Burton  wrote:

> I finally got it to work by putting the putting “127.0.0.1” in the list of 
> seed IPs. Any other address triggered the warning that Cassandra would not be 
> auto-booted since it was listed as a seed machine and the init process never 
> happened. I am not sure of the impact of setting it to this loopback address.



Re: CREATE COLUMNFAMILY

2012-11-11 Thread aaron morton
Also, most idiomatic clients use the information so they can return the 
appropriate type to you. 

>  Can the metadata be applied
> after the fact? If so how?
UPDATE COLUMN FAMILY in the CLI will let you change it. 
Note that we do not update the existing data. This can be a problem if you do 
something like change a variable length integer to a fixed length one. 
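
To see why that change is a problem, here is a small Python sketch. It
assumes a variable-length integer is stored as the shortest big-endian two's
complement byte string and a fixed-length one as exactly 8 bytes; the helper
names are made up for illustration.

```python
import struct

# Assumed encodings: variable-length integer = minimal big-endian two's
# complement bytes; fixed-length integer = exactly 8 bytes.


def encode_varint(value):
    """Encode an integer in as few two's complement bytes as possible."""
    length = (value + (value < 0)).bit_length() // 8 + 1
    return value.to_bytes(length, "big", signed=True)


def decode_long(raw):
    """Decode exactly 8 bytes as a signed big-endian integer."""
    return struct.unpack(">q", raw)[0]


if __name__ == "__main__":
    stored = encode_varint(1)      # written before the metadata change
    try:
        decode_long(stored)        # read back after switching to fixed length
    except struct.error as error:
        print("cannot decode old value:", error)
```

The bytes already on disk stay one byte long, so a reader that now expects
8 bytes fails on the old rows even though new inserts validate fine.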

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com




Re: Strange delay in query

2012-11-11 Thread aaron morton
If you have a long-lived row with a lot of tombstones or overwrites, it's often 
more efficient to select a known list of columns. There are short circuits in 
the read path that can avoid reading older, tombstone-filled fragments of the 
row. (Obviously this is hard to do if you don't know the names of the columns.)

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com




Re: HugeTLB (Hugepage) Support on a Cassandra Cluster

2012-11-11 Thread aaron morton
Past discussion here 
http://www.mail-archive.com/user@cassandra.apache.org/msg08786.html

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/11/2012, at 4:51 PM, Jason Wee  wrote:

> option -XX:+UseLargePages ?
> 
> 
> On Sat, Nov 10, 2012 at 2:34 AM, Morantus, James (PCLN-NW) 
>  wrote:
> Hi,
> 
>  
> 
> Does anyone know if DataStax/Cassandra recommends using HugeTLB on a cluster?
> 
>  
> 
> Thank you
> 
>  
> 
> James Morantus
> 
> Sr. Database Administrator
> 
> 203-299-8733
> 
> Priceline.com
> 
>  
> 
> 



RE: CREATE COLUMNFAMILY

2012-11-11 Thread Kevin Burton
What happens when you are mainly concerned about the human readable formats?
Say initially you don't supply metadata for a key like foo in the column
family, but you get tired of seeing binary data displayed for the values so
you update the column family to get a more human readable format by adding
metadata for foo. Will this work?

 






 

 



Re: CREATE COLUMNFAMILY

2012-11-11 Thread Jeremiah Jordan
That is fine.  You just have to be careful that you haven't already inserted 
data which would be rejected by the type you update to, as a client will have 
issues reading that data back.

-Jeremiah

On Nov 11, 2012, at 4:09 PM, Kevin Burton  wrote:

> What happens when you are mainly concerned about the human readable formats? 
> Say initially you don’t supply metadata for a key like foo in the column 
> family, but you get tired of seeing binary data displayed for the values so 
> you update the column family to get a more human readable format by adding 
> metadata for foo. Will this work?
>  
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: Sunday, November 11, 2012 3:39 PM
> To: user@cassandra.apache.org
> Subject: Re: CREATE COLUMNFAMILY
>  
> Also most idiomatic clients use the information so they can return the 
> appropriate type to you. 
>  
>  Can the metadata be applied
> after the fact? If so how?
> UPDATE COLUMN FAMILY in the CLI will let you change it. 
> Note that we do not update the existing data. This can be a problem if you do 
> something like change a variable length integer to a fixed length one. 
>  
> Cheers
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 12/11/2012, at 8:06 AM, Kevin Burton  wrote:
> 
> 
> Thank you this helps with my understanding. 
> 
> So the goal here is to supply as many name/type pairs as can reasonably
> be foreseen when the column family is created? Can the metadata be applied
> after the fact? If so, how?
> 
> -Original Message-
> From: Edward Capriolo [mailto:edlinuxg...@gmail.com] 
> Sent: Sunday, November 11, 2012 9:37 AM
> To: user@cassandra.apache.org
> Subject: Re: CREATE COLUMNFAMILY
> 
> If you supply metadata, Cassandra can use it for several things.
> 
> 1) It validates data on insertion
> 2) It helps display the information in human-readable formats in tools like the
> CLI and sstable2json
> 3) If you add a built-in secondary index, the type information is needed:
> strings sort differently than integers
> 4) Columns in rows are sorted by the column name, and strings sort differently
> than integers
> 
> On Sat, Nov 10, 2012 at 11:55 PM, Kevin Burton 
> wrote:
> 
> I am sure this has been asked before but what is the purpose of 
> entering key/value or more correctly key name/data type values on the 
> CREATE COLUMNFAMILY command.
> 
> 
> 
> 
>  



Re: Multiple Clusters Keyspaces to one core cluster

2012-11-11 Thread B. Todd Burruss
with NetworkTopologyStrategy it theoretically should work

http://www.datastax.com/docs/1.0/cluster_architecture/replication


On Thu, Nov 8, 2012 at 5:11 PM, ws  wrote:
> If I have multiple clusters, can I replicate a keyspace from each of those
> clusters to a separate cluster?
>
>


Re: removing SSTABLEs

2012-11-11 Thread Edward Capriolo
If you shut down C* and remove an SSTable (and its associated data,
index, bloom filter, etc. files), it is safe. I would delete any
saved caches as well.

It is safe in the sense that Cassandra will start up with no issues,
but you could be missing some data.
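
A minimal sketch of collecting the files that belong to one SSTable, so they can be reviewed together before anything is removed. It assumes that all components of an SSTable share the file-name prefix up to the last "-" (for example a Data, Index and Filter file with the same generation number). The file names below are hypothetical, and the helper only lists files; it deletes nothing.

```python
def sstable_components(filenames, data_file):
    """Return every file sharing data_file's prefix up to the last '-'."""
    prefix = data_file.rsplit("-", 1)[0] + "-"
    return sorted(f for f in filenames if f.startswith(prefix))


# Hypothetical directory listing for a column family.
listing = [
    "test-clicks_c-hd-12-Data.db",
    "test-clicks_c-hd-12-Index.db",
    "test-clicks_c-hd-12-Filter.db",
    "test-clicks_c-hd-12-Statistics.db",
    "test-clicks_c-hd-123-Data.db",
    "test-clicks_c-hd-123-Index.db",
]

components = sstable_components(listing, "test-clicks_c-hd-12-Data.db")
print(components)
```

Keeping the trailing "-" in the prefix prevents generation 12 from also matching generation 123.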

On Sun, Nov 11, 2012 at 11:09 PM, B. Todd Burruss  wrote:
> if i stop a node and remove an SSTABLE, let's call it X, is that safe?
>
> ok, more info.  i know that the data in SSTABLE X has been tombstoned
> but the tombstones are in SSTABLE Y.  i want to simply delete X and get
> rid of the data.
>
> how do i know this .. i did a major compaction a while back and the
> SSTABLE is so large it has not yet been compacted.  we "delete" data
> daily and only keep 7 days of data.  the SSTABLE is almost 30 days
> old.
>
> whattayathink?


Re: In README.txt : CQL3 instead of CQL2 ?

2012-11-11 Thread Sylvain Lebresne
I've updated the readme, thanks


On Sun, Nov 11, 2012 at 6:49 PM, Jean-Armel Luce  wrote:

> Hello,
>
> I have installed the 1.2 beta2 (download source + compile)
>
> The CREATE SCHEMA fails if I do as explained in README.txt :
>
> bin/cqlsh --cql3    <== --cql3 is the default in 1.2 so it is not needed
> Connected to Test Cluster at localhost:9160.
> [cqlsh 2.3.0 | Cassandra 1.2.0-beta2-SNAPSHOT | CQL spec 3.0.0 | Thrift
> protocol 19.35.0]
> Use HELP for help.
> cqlsh> create keyspace jaltest WITH strategy_class = 'SimpleStrategy' AND
> strategy_options:replication_factor='1';
> Bad Request: line 1:82 mismatched input ':' expecting '='
>
> If I give the CREATE SCHEMA in cql3, it works :-)
> cqlsh> create keyspace jaltest with replication ={'class':
> 'SimpleStrategy', 'replication_factor': '1'};
> cqlsh> describe keyspace jaltest;
>
> CREATE KEYSPACE jaltest WITH replication = {
>   'class': 'SimpleStrategy',
>   'replication_factor': '1'
> };
>
> cqlsh>
>
> It looks like the syntax of CREATE SCHEMA in the README is CQL2, while
> the syntax for connecting to cqlsh is for CQL3.
> From my point of view, it would be friendlier to write the CREATE
> SCHEMA command using the CQL3 syntax rather than the CQL2 syntax in the
> README.txt.
>
> Best regards.
>
> Jean Armel
>


CF metadata syntax for an array

2012-11-11 Thread Kevin Burton
I am sorry if this is an FAQ, but I was wondering what the syntax is for
describing an array. I have gotten as far as feeling a need to understand a
'super-column', but I fail after that. Once I have the metadata in place to
describe an array, how do I insert data into the array? Get data from the
array? Thank you.

 



java.io.IOException: InvalidRequestException(why:Expected 8 or 0 byte long for date (4)) when inserting data to CF with compound key from pig

2012-11-11 Thread Шамим
Hello All,
  We are using Pig (pig-0.10.0) to store some data in a CF with a compound key. 
Cassandra version is 1.1.15. Here is the script for creating the CF:
CREATE TABLE clicks_c (
  user_id varchar,
  time timestamp,
  url varchar,
  PRIMARY KEY (user_id, time)
) WITH COMPACT STORAGE;

Here is description of the keyspace with CF
Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [p00smevDC:1, p00skimDC:1]
  Column Families:
    ColumnFamily: clicks_c
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.DateType
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.1
      DC Local Read repair chance: 0.0
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: default
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

We use the following Pig script to store data in the CF named clicks_c:

REGISTER /oracle/smev-pig-scripts/lib/piggybank.jar;
REGISTER /oracle/smev-pig-scripts/lib/joda-time-2.1.jar;

DEFINE ISOToUnix 
org.apache.pig.piggybank.evaluation.datetime.convert.ISOToUnix();

rows = LOAD 'cassandra://test/clicks' USING CassandraStorage();
--dump rows;
clup = foreach rows generate TOTUPLE('user_id', key), TOTUPLE('time', 
ISOToUnix('2009-01-07T01:07:01.000Z')), url;
store clup into 'cassandra://test/clicks_c' using CassandraStorage();

Manually, I can insert data into the CF through CQL 3.0. What am I doing wrong 
here? Or does CassandraStorage.java not yet support storing to a CF with a 
compound key? 
Thanks in advance,
Shamim
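
The "(4)" in the error message reads as if a 4-byte value reached a validator that accepts only 0 or 8 bytes. The thread does not confirm where those 4 bytes came from. The sketch below is Python, not Cassandra or Pig code, and `is_valid_date_bytes` is a hypothetical helper. It shows which encodings would pass such a length check and which would not.

```python
import struct
from datetime import datetime, timezone


def is_valid_date_bytes(raw: bytes) -> bool:
    """Mimic a length check that accepts only empty or 8-byte values."""
    return len(raw) in (0, 8)


ts = datetime(2009, 1, 7, 1, 7, 1, tzinfo=timezone.utc)
millis = int(ts.timestamp() * 1000)

as_long = struct.pack(">q", millis)              # 8-byte big-endian long
as_int = struct.pack(">i", int(ts.timestamp()))  # 4-byte int (seconds)
as_name = "time".encode("utf-8")                 # the literal text "time"

for label, raw in [("long", as_long), ("int", as_int), ("name", as_name)]:
    print(label, len(raw), is_valid_date_bytes(raw))
```

Both a 32-bit integer and the four-character string "time" are 4 bytes long, so either would be a candidate to check in the tuples the Pig script generates.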