CQL and reserved keywords

2014-03-04 Thread Andy
Hi,

while adding support for Cassandra into DataNucleus I came across the following

CREATE TABLE schema1.mapfkvalueitem (desc varchar,key varchar,name varchar,
    mapfkvalueitem_id bigint, PRIMARY KEY (mapfkvalueitem_id));

which fails with the delightfully informative message
Bad Request: line 1:37 no viable alternative at input 'desc'


Simple trial and error leads me to the conclusion that this is due to "desc" 
being a reserved keyword that cannot be used for column identifiers.

1. Is there a list somewhere of reserved keywords that I can access so that I
can prohibit a user from entering such a column name? A method on the DataStax
Java driver would be great, but otherwise a web page somewhere would do.

2. Wouldn't it be better to have that error message updated to be a bit more
descriptive, such as "'desc' is a reserved keyword, so cannot be used as a
table/column identifier" or even just "'desc' is a reserved keyword"?


TIA
-- 
Andy
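
For reference, a minimal sketch of the kind of client-side check asked about
in question 1. It assumes a hand-maintained keyword set: the RESERVED set
below is only an illustrative subset, and the class and method names are
hypothetical (this is not a DataStax driver API). It also relies on the CQL 3
rule that a reserved word can still be used as an identifier when it is
enclosed in double quotes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class CqlIdentifiers {
    // Illustrative subset only; the authoritative list is in the CQL documentation.
    private static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
        "add", "alter", "and", "asc", "batch", "by", "create", "delete", "desc",
        "drop", "from", "in", "index", "insert", "into", "limit", "order",
        "primary", "select", "set", "table", "to", "update", "use", "using",
        "where", "with"));

    /** Returns true if the identifier matches a keyword in the (partial) set, ignoring case. */
    public static boolean isReserved(String identifier) {
        return RESERVED.contains(identifier.toLowerCase(Locale.ROOT));
    }

    /** Wraps reserved identifiers in double quotes, doubling any embedded double quote. */
    public static String quoteIfReserved(String identifier) {
        if (!isReserved(identifier)) {
            return identifier;
        }
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quoteIfReserved("desc"));
        System.out.println(quoteIfReserved("name"));
    }
}
```

Note that quoted identifiers are case-sensitive in CQL, so a caller would
either reject such names outright or quote them consistently everywhere.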


RE: Gossip intermittently marks node as DOWN

2014-03-04 Thread Romain HARDOUIN
Setting phi_convict_threshold to 12 is a good idea if your network is busy.
Are your VMs located in different datacenters?
Did you check whether the nodes are overloaded? An unresponsive node can be
seen as down even if the problem is only temporary.

Romain
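
As background on why raising phi_convict_threshold makes the failure detector
more tolerant, here is a simplified sketch of the phi accrual idea. It assumes
heartbeat intervals follow an exponential distribution with a known mean; the
class name, method names and the 1000 ms mean are made up for illustration,
and this is not Cassandra's actual implementation.

```java
public class PhiSketch {
    /**
     * phi = -log10(P(no heartbeat for t ms)). Under an exponential model
     * P = exp(-t / mean), so phi = t / (mean * ln 10).
     */
    public static double phi(double millisSinceLastHeartbeat, double meanIntervalMillis) {
        return millisSinceLastHeartbeat / (meanIntervalMillis * Math.log(10.0));
    }

    /** A node is marked down once phi exceeds the configured threshold. */
    public static boolean convict(double millisSinceLastHeartbeat,
                                  double meanIntervalMillis,
                                  double threshold) {
        return phi(millisSinceLastHeartbeat, meanIntervalMillis) > threshold;
    }

    public static void main(String[] args) {
        double mean = 1000.0; // assumed mean heartbeat interval in ms
        for (double threshold : new double[] {8.0, 12.0}) {
            // Silence (in ms) at which phi reaches the threshold.
            double silence = threshold * mean * Math.log(10.0);
            System.out.printf("threshold %.0f tolerates about %.1f s of silence%n",
                              threshold, silence / 1000.0);
        }
    }
}
```

Under these assumptions a higher threshold simply means a longer gap between
heartbeats is accepted before the node is reported DOWN.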

Phil Luckhurst wrote on 03/03/2014 15:16:25:

> From: Phil Luckhurst
> To: cassandra-u...@incubator.apache.org
> Date: 03/03/2014 15:17
> Subject: Gossip intermittently marks node as DOWN
> 
> We have a 2 node Cassandra 2.0.5 cluster running on a couple of VMWare hosted
> virtual machines using Ubuntu 12.04 for testing. As you can see from the log
> entries below the gossip connection between the nodes regularly goes DOWN
> and UP. We saw on another post that increasing the phi_convict_threshold may
> help with this so we increased that to '12' but we still get the same
> problem.
>
>  INFO [GossipTasks:1] 2014-02-28 07:51:10,937 Gossiper.java (line 863) InetAddress /10.150.100.20 is now DOWN
>  INFO [HANDSHAKE-/10.150.100.20] 2014-02-28 07:51:10,951 OutboundTcpConnection.java (line 386) Handshaking version with /10.150.100.20
>  INFO [RequestResponseStage:898] 2014-02-28 07:51:21,411 Gossiper.java (line 849) InetAddress /10.150.100.20 is now UP
>  INFO [HANDSHAKE-/10.150.100.20] 2014-02-28 07:53:52,100 OutboundTcpConnection.java (line 386) Handshaking version with /10.150.100.20
>  INFO [GossipTasks:1] 2014-02-28 08:06:52,956 Gossiper.java (line 863) InetAddress /10.150.100.20 is now DOWN
>  INFO [HANDSHAKE-/10.150.100.20] 2014-02-28 08:06:52,963 OutboundTcpConnection.java (line 386) Handshaking version with /10.150.100.20
>  INFO [RequestResponseStage:915] 2014-02-28 08:07:21,447 Gossiper.java (line 849) InetAddress /10.150.100.20 is now UP
>  INFO [HANDSHAKE-/10.150.100.20] 2014-02-28 08:14:09,613 OutboundTcpConnection.java (line 386) Handshaking version with /10.150.100.20
> 
>  Has anyone got any suggestions for fixing this? 
> 
>  Thanks 
>  Phil


Re: CQL and reserved keywords

2014-03-04 Thread DuyHai Doan
Hello Andy

1. "Is there a list somewhere of reserved keywords that I can access so
that I
prohibit a user entering such a column name?" ->
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/keywords_r.html

2. You could propose a pull request?




On Tue, Mar 4, 2014 at 10:52 AM, Andy  wrote:

> Hi,
>
> while adding support for Cassandra into DataNucleus I came across the
> following
> CREATE TABLE schema1.mapfkvalueitem (desc varchar,key varchar,name
> varchar,mapfkvalueitem_id bigint, PRIMARY KEY (mapfkvalueitem_id));
>
> which fails with the delightfully informative message
> Bad Request: line 1:37 no viable alternative at input 'desc'
>
>
> Simple trial and error leads me to the conclusion that this is due to
> "desc"
> being a reserved keyword that cannot be used for column identifiers.
>
> 1. Is there a list somewhere of reserved keywords that I can access so
> that I
> prohibit a user entering such a column name? a method on the DataStax Java
> driver would be great, but otherwise a web page somewhere would do.
>
> 2. Wouldn't it be better to have that error message updated to be a bit
> more
> descriptive such as "'desc' is a reserved keyword, so cannot be used as a
> table/column identifier" or even just "'desc' is a reserved keyword" ?
>
>
> TIA
> --
> Andy
>


RE: Gossip intermittently marks node as DOWN

2014-03-04 Thread Phil Luckhurst
The VMs are hosted on the same ESXi server and they are just running
Cassandra. This seems to happen even when the nodes appear to be idle,
about 2 to 4 times per hour.


Phil



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Gossip-intermittently-marks-node-as-DOWN-tp7593189p7593199.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


How to paginate all columns in a row

2014-03-04 Thread Lu, Boying
Hi, All,

I need to paginate all columns in a row, but cannot use the
RowQuery.autoPaginate() method to do this due to some requirements.

All columns are composite columns and I need to get the next 'page' with a
separate query.

Here is my pseudo code (page size = 3):
ColumnList result = query.execute().getResult();  // get first 3 columns  A

// last column of this query
Column lastColumn = result.getColumnByIndex(result.size() - 1);

// now try to get the next page
RowQuery query = keyspace.prepareQuery(someCf).getKey(someRow)
    .withColumnRange(AnnotatedCompositeSerializer.buildRange()
        .greaterThan(lastColumn.getName().toString())
        .limit(3));  // page size = 3

result = query.execute().getResult();  // B


I created 10 columns in the CF, I can get the first 3 columns after A
But get zero columns at B, which is unexpected.

I've verified that there are 10 columns in the CF.  So there must be something 
wrong with my second query.

Can anyone tell me what's wrong with the second query?

Thanks a lot

Boying
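
For reference, the manual paging pattern described above (remember the last
column seen, then ask for columns strictly greater than it) can be sketched
without Astyanax. The example below uses a plain sorted map as a stand-in for
a row; the class and method names are hypothetical and it does not touch the
Astyanax API at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ColumnPager {
    /**
     * Returns up to pageSize column names strictly after lastSeen,
     * or from the start of the row when lastSeen is null.
     */
    public static List<String> nextPage(NavigableMap<String, String> row,
                                        String lastSeen, int pageSize) {
        NavigableMap<String, String> tail =
            (lastSeen == null) ? row : row.tailMap(lastSeen, false); // exclusive start
        List<String> page = new ArrayList<>();
        for (String name : tail.keySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(name);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            row.put("col" + i, "v" + i);
        }
        String last = null;
        List<String> page;
        while (!(page = nextPage(row, last, 3)).isEmpty()) {
            System.out.println(page);
            last = page.get(page.size() - 1); // becomes the exclusive start of the next page
        }
    }
}
```

The sketch only works because the start bound has the same type and ordering
as the stored names; in a real composite-column query the bound would likewise
need to be expressed in the type of the stored column-name component.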




Re: Issue with cassandra-pig-thrift.transport and java.net.SocketException: Connection reset

2014-03-04 Thread Miguel Angel Martin junquera
Hi all,

After trying several things, such as:

   -  tuning Hadoop,
   -  testing with Cassandra 2.0.5
   -  testing with Cassandra 2.1 beta
   -  increasing memory in Cassandra and Hadoop,
   -  upgrading and changing instances in EC2
   -  increasing the number of threads
   -  changing the rpc server type in Cassandra from sync to hsha (but with
   this server we cannot work with Astyanax, and Hadoop jobs run too
   slowly)
   - etc ...

we can run this Pig script successfully with Cassandra 1.2.15 and it works
fine, so we have downgraded Cassandra to this version for now.


Regards




2014-02-27 17:29 GMT+01:00 Miguel Angel Martin junquera <
mianmarjun.mailingl...@gmail.com>:

> Hi all,
>
> I am trying to do a cogroup with five relations that I previously load
> from Cassandra.
>
> In a single-node, local Cassandra testing environment the script works
> fine, but when I try to execute it in a cluster over AWS instances, with
> only one slave in the Hadoop cluster and one seed Cassandra node, I get a
> timeout with a Thrift socket.
>
> Is there a param to increase this time, or how can I fix this issue?
>
>
> Thanks in advance
>
>
> This is the log:
>
> 2014-02-27 16:17:13,653 [Thread-9] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:ec2-user cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
> 2014-02-27 16:17:13,654 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
> 2014-02-27 16:17:13,654 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
> 2014-02-27 16:17:13,658 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2014-02-27 16:17:13,668 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Could not get input splits
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)
>     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
>     at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
>     at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
>     at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>     at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>     at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>     at java.lang.Thread.run(Thread.java:744)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
> Caused by: java.io.IOException: Could not get input splits
>     at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:197)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)
>     ... 15 more
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:193)
>     ... 16 more
> Caused by: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
>     at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:308)
>     at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
>     at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:230)
>     at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:215)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.ut

Re: Gossip intermittently marks node as DOWN

2014-03-04 Thread Fabrice Facorat
From what I understand, this can happen when having many nodes and
vnodes per node. How many vnodes did you configure on your nodes?

2014-03-04 11:37 GMT+01:00 Phil Luckhurst :
> The VMs are hosted on the same ESXi server and they are just running
> Cassandra. We seem to get this happen even if the nodes appear to be idle;
> about 2 to 4 times per hour.
>
> 
> Phil
>
>
>
> --
> View this message in context: 
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Gossip-intermittently-marks-node-as-DOWN-tp7593189p7593199.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
> Nabble.com.



-- 
Close the World, Open the Net
http://www.linux-wizard.net


Cassandra cpp driver call to local cassandra colo

2014-03-04 Thread Check Peck
I have a couple of questions about the DataStax C++ driver.

We have a 36-node Cassandra cluster: 12 nodes in DC1, 12 nodes in DC2 and
12 nodes in DC3.

Our application code also runs in three datacenters: 11 nodes in DC1, 11
nodes in DC2 and 11 nodes in DC3.

So my question is: if an application call comes from the DC1 datacenter,
will it go to the DC1 Cassandra nodes automatically when using the cpp
driver? And the same for DC2 and DC3?

Or do we need to add some config changes in our C++ code when making the
connection to Cassandra, to make sure that a call coming from the DC1
datacenter goes to the DC1 Cassandra nodes?

If there is any config change which we need to add in our C++ code,
can you please point me to it?


Re: Gossip intermittently marks node as DOWN

2014-03-04 Thread Phil Luckhurst
It was created with the default settings so we have 256 per node.


Fabrice Facorat wrote
> From what I understand, this can happen when having many nodes and
> vnodes by node. How many vnodes did you configure on your nodes ?







Re: Gossip intermittently marks node as DOWN

2014-03-04 Thread Johnny Miller
What is nodetool tpstats telling you?

On 4 Mar 2014, at 15:10, Phil Luckhurst  wrote:

> It was created with the default settings so we have 256 per node.



Re: Gossip intermittently marks node as DOWN

2014-03-04 Thread Phil Luckhurst
Here's the tpstats output from both nodes.






Johnny Miller wrote
> What is nodetool tpstats telling you?







Re: Gossip intermittently marks node as DOWN

2014-03-04 Thread Johnny Miller
That looks healthy - nothing blocked or dropped.



On 4 Mar 2014, at 16:12, Phil Luckhurst  wrote:

> Here's the tpstats output from both nodes.



Re: Cassandra cpp driver call to local cassandra colo

2014-03-04 Thread Manoj Khangaonkar
Hi ,

Your client/application will connect to one of the nodes that you tell it
to connect to. In the Java driver this is done by calling
Cluster.builder().addContactPoint(...). I suppose the C++ driver will have
a similar class method. For the app in DC1, provide only nodes in DC1 as
contact points.

regards


On Tue, Mar 4, 2014 at 6:47 AM, Check Peck  wrote:

> I have couple of question on Datastax C++ driver.
>
> We have 36 nodes Cassandra cluster. 12 nodes in DC1, 12 nodes in DC2, 12
> nodes in DC3 datacenters.
>
> And our application code is also in three datacenters- 11 node in DC1, 11
> node in DC2, 11 node in DC3 datacenter.
>
> So my question is if the application call is coming from DC1 datacenter,
> then will it go to DC1 Cassandra nodes automatically with the use of cpp
> driver? And same with DC2 and DC3?
>
> Or we need to add some config changes in our C++ code while making
> connection to cassandra which will then make sure if the call is coming
> from DC1 datacenter then it will go to DC1 Cassandra nodes?
>
> If there is any config change which we need to add in our C++ code, then
> can you please point me to that?
>
>


-- 
http://khangaonkar.blogspot.com/


Question regarding java DowngradingConsistencyRetryPolicy

2014-03-04 Thread HAITHEM JARRAYA
Hi All,

I might be missing something and I would like some clarification on this. We
are using the Java driver with the downgrading retry policy, and we see in our
logs that only the reads are retried.

The code and the docs say that the write method will trigger at most one
retry, when the WriteType is UNLOGGED_BATCH or BATCH_LOG.
My question is: when is a write considered SIMPLE?

Thanks,

Haithem

/**
 * Defines whether to retry and at which consistency level on a write
 * timeout.
 *
 * This method triggers a maximum of one retry. If {@code writeType ==
 * WriteType.BATCH_LOG}, the write is retried with the initial
 * consistency level. If {@code writeType == WriteType.UNLOGGED_BATCH}
 * and at least one replica acknowledged, the write is retried with a
 * lower consistency level (with unlogged batch, a write timeout can
 * always mean that part of the batch haven't been persisted at
 * all, even if {@code receivedAcks > 0}). For other {@code writeType},
 * if we know the write has been persisted on at least one replica, we
 * ignore the exception. Otherwise, an exception is thrown.
 *
 * @param statement the original query that timed out.
 * @param cl the original consistency level of the write that timed out.
 * @param writeType the type of the write that timed out.
 * @param requiredAcks the number of acknowledgments that were required to
 * achieve the requested consistency level.
 * @param receivedAcks the number of acknowledgments that had been received
 * by the time the timeout exception was raised.
 * @param nbRetry the number of retry already performed for this operation.
 * @return a RetryDecision as defined above.
 */
@Override
public RetryDecision onWriteTimeout(Statement statement, ConsistencyLevel cl, WriteType writeType,
                                    int requiredAcks, int receivedAcks, int nbRetry) {
    if (nbRetry != 0)
        return RetryDecision.rethrow();

    switch (writeType) {
        case SIMPLE:
        case BATCH:
            // Since we provide atomicity there is no point in retrying
            return RetryDecision.ignore();
        case UNLOGGED_BATCH:
            // Since only part of the batch could have been persisted,
            // retry with whatever consistency should allow to persist all
            return maxLikelyToWorkCL(receivedAcks);
        case BATCH_LOG:
            return RetryDecision.retry(cl);
    }
    // We want to rethrow on COUNTER and CAS, because in those case "we don't know" and don't want to guess
    return RetryDecision.rethrow();
}




Re: Cassandra cpp driver call to local cassandra colo

2014-03-04 Thread Check Peck
I don't think that's right. Cluster.builder().addContactPoint(...) will add
nodes to the connection pool, and the driver will then discover all the other
nodes and add them to the pool automatically. To restrict traffic to the
nodes in the local colo we need to use different settings in the Java driver.

There should be something similar in the cpp driver as well.
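
To illustrate what "only use the local colo" means in practice, here is a
small stand-alone sketch of datacenter-aware round-robin host selection. It is
not driver code: the class, the host addresses and the host-to-DC map are all
hypothetical. In the Java driver the corresponding setting is, as far as I
know, a DCAwareRoundRobinPolicy passed to withLoadBalancingPolicy(...) on the
cluster builder; whether the C++ driver exposes an equivalent is not confirmed
in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class LocalDcRoundRobin {
    private final List<String> localHosts = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** Keeps only the hosts whose datacenter matches localDc, in map iteration order. */
    public LocalDcRoundRobin(String localDc, Map<String, String> hostToDc) {
        for (Map.Entry<String, String> e : hostToDc.entrySet()) {
            if (e.getValue().equals(localDc)) {
                localHosts.add(e.getKey());
            }
        }
        if (localHosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts in " + localDc);
        }
    }

    /** Returns the next local host, cycling through them in order. */
    public String nextHost() {
        int i = Math.floorMod(next.getAndIncrement(), localHosts.size());
        return localHosts.get(i);
    }

    public static void main(String[] args) {
        Map<String, String> hostToDc = new LinkedHashMap<>();
        hostToDc.put("10.0.1.1", "DC1");
        hostToDc.put("10.0.2.1", "DC2");
        hostToDc.put("10.0.1.2", "DC1");
        LocalDcRoundRobin policy = new LocalDcRoundRobin("DC1", hostToDc);
        for (int i = 0; i < 4; i++) {
            System.out.println(policy.nextHost());
        }
    }
}
```

The point of the sketch is that discovery may still learn about every node in
the cluster, while the selection step restricts requests to the local
datacenter.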


On Tue, Mar 4, 2014 at 8:20 AM, Manoj Khangaonkar wrote:

> Hi ,
>
> Your client/application will connect to one of the nodes from the nodes
> you tell it to connect. In the java driver this is done by calling
> Cluster.builder.addContactPoint(...). I suppose the C++ driver will have
> similar class method. For the app in DC1 provide only nodes in DC1 as
> contact points.
>
> regards
>
>
> On Tue, Mar 4, 2014 at 6:47 AM, Check Peck wrote:
>
>> I have couple of question on Datastax C++ driver.
>>
>> We have 36 nodes Cassandra cluster. 12 nodes in DC1, 12 nodes in DC2, 12
>> nodes in DC3 datacenters.
>>
>> And our application code is also in three datacenters- 11 node in DC1, 11
>> node in DC2, 11 node in DC3 datacenter.
>>
>> So my question is if the application call is coming from DC1 datacenter,
>> then will it go to DC1 Cassandra nodes automatically with the use of cpp
>> driver? And same with DC2 and DC3?
>>
>> Or we need to add some config changes in our C++ code while making
>> connection to cassandra which will then make sure if the call is coming
>> from DC1 datacenter then it will go to DC1 Cassandra nodes?
>>
>> If there is any config change which we need to add in our C++ code, then
>> can you please point me to that?
>>
>>
>
>
> --
> http://khangaonkar.blogspot.com/
>


Re: CQL and reserved keywords

2014-03-04 Thread Michael Shuler

On 03/04/2014 03:56 AM, DuyHai Doan wrote:
> Hello Andy
>
> 1. "Is there a list somewhere of reserved keywords that I can access so
> that I prohibit a user entering such a column name?" ->
> http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/keywords_r.html
>
> 2. you may propose a pull request ?


s/propose a pull request/submit a patch/

at:
https://issues.apache.org/jira/browse/CASSANDRA

Although there is a github mirror for cassandra, it is maintained by the 
ASF and the Cassandra project has no administrative rights to do 
anything with issues, pull requests, etc. on the github mirror.  Please, 
don't send a pull request, but submit a patch to JIRA, if you would like 
to contribute!


--
Kind regards,
Michael



Very Slow Node Startup

2014-03-04 Thread Charlie Mason
Hi All,

I have a single node cluster I use for development on my local machine. After
apt package upgrades and hard reboots the node takes a very long time to
restart.

The node will always eventually come back up, however it sometimes takes
ages. It seems to be CPU bound, as all 4 cores are maxed out by
Cassandra. The disk IO is relatively tiny (less than 1 MB/s) considering
it's running on an SSD.

According to the logs, start-up once took over 6 hours. From a development
point of view it's not the end of the world, but should I suffer a Data Centre
outage in production this could massively delay the time to come back
on-line.

I suspect the workload might be causing it. There's 16 gig of data actually
stored in it. However, one of the tables holds a message queue, which may
well have a few hundred thousand tombstones and up to 500Kb per record. Is
this likely to have an impact on start-up time? Is there anything I can do
to mitigate it? The queries on this table are fast because they know where to
start, so using the table is not an issue.

Any other suggestions to look at?

Thanks,

Charlie M


Datastax C++ driver on Windows x64

2014-03-04 Thread Green, John M (HP Education)
Has anyone successfully built the Datastax C++ driver for a Windows 64-bit 
platform?

While I've made some progress I'm still not there and wondering if I should
give-up and use a local socket to another process (e.g., JVM or .NET runtime)
instead.  I'd prefer to use C++ because that's what the rest of the
application is using.  However, my C++ and makefile experience is very dated
and I've never used cmake before.  Still I'd be very interested to know if
anyone had success using the C++ driver on Windows x64.

John


RE: Datastax C++ driver on Windows x64

2014-03-04 Thread Dwight Smith
Second that question

From: Green, John M (HP Education) [mailto:john.gr...@hp.com]
Sent: Tuesday, March 04, 2014 2:03 PM
To: user@cassandra.apache.org
Subject: Datastax C++ driver on Windows x64

Has anyone successfully built the Datastax C++ driver for a Windows 64-bit 
platform?

While I've made some progress I'm still not there and wondering if I should
give-up and use a local socket to another process (e.g., JVM or .NET runtime)
instead.  I'd prefer to use C++ because that's what the rest of the
application is using.  However, my C++ and makefile experience is very dated
and I've never used cmake before.  Still I'd be very interested to know if
anyone had success using the C++ driver on Windows x64.

John

Re: Very Slow Node Startup

2014-03-04 Thread Nate McCall
The commit log replay is single threaded, so if you have a ton of
overwrites spread across a large amount of commit log (like you would with a
queue pattern) it might be backing up.

The only real workaround for this right now would be to turn off durable
writes to the queue schema.

The following has some details in the context of changes to make commit log
replay multi-threaded for the 2.1 release:
https://issues.apache.org/jira/browse/CASSANDRA-3578

I also recommend poking around the process a bit via jstack and jvmtop when
this is happening just to make sure commitlog is what is holding it up.


On Tue, Mar 4, 2014 at 2:34 PM, Charlie Mason  wrote:

> Hi All,
>
> I have single node cluster I use for development on my local machine.
> After apt package upgrades and hard reboots the node takes a very long time
> to restart.
>
> The node will always eventually come back up however it takes ages
> sometimes. It seems to be CPU bound as all 4 cores are maxed out by
> Cassandra. The disk IO is relativity tiny (less than 1 MB/s) considering
> its running on an SSD.
>
> At the logs start-up once took over 6 hours once. From a development point
> of view its not the end of the world but should I suffer a Data Centre
> outage in production this could massively delay the time to come back
> on-line.
>
> I suspect the workload might be causing it. There's 16 gig of data
> actually stored in it. However one of the tables holds a message queue.
> Which may well have a few hundred thousand tombstones and up to 500Kb per
> record.  Is this likely to have an impact on start up time? Is there
> anything I can do to mitigate it. The queries on this are fast because it
> knows where to start so using the table is not an issue.
>
> Any other suggestions to look at?
>
> Thanks,
>
> Charlie M
>



-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Datastax C++ driver on Windows x64

2014-03-04 Thread Michael Shuler

On 03/04/2014 04:22 PM, Michael Shuler wrote:
> On 03/04/2014 04:12 PM, Dwight Smith wrote:
>> Second that question
>>
>> *From:* Green, John M (HP Education) [mailto:john.gr...@hp.com]
>> *Sent:* Tuesday, March 04, 2014 2:03 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Datastax C++ driver on Windows x64
>>
>> Has anyone successfully built the Datastax C++ driver for a Windows
>> 64-bit platform?
>>
>> While I’ve made some progress I’m still not there and wondering if I
>> should give-up and use a local socket to another process (e.g., JVM or
>> .NET runtime) instead.  I’d prefer to use C++ because that’s what the
>> rest of the application is using.  However, my C++ and makefile
>> experience is very dated and I’ve never used cmake before.  Still I’d
>> be very interested to know if anyone had success using the C++ driver on
>> Windows x64.
>
> http://cassci.datastax.com/job/y_cpp_driver_win32/lastBuild/consoleFull
>
> Please, let me know, and I'll dig for some further details, if this
> doesn't fully help.  I did not set this particular job up, but jenkins
> runs the following batch script after git pull:
>
> @echo off
> cd C:\jenkins\workspace
> mkdir y_cpp_driver_win32\bin
> copy CMakeCache.txt y_cpp_driver_win32\bin
> cd y_cpp_driver_win32\bin
> cmake .
> msbuild ALL_BUILD.vcxproj
> msbuild UNINSTALL.vcxproj
> msbuild INSTALL.vcxproj


I may have replied a bit too quickly - it does look like this is using 
all 32-bit libs in the includes, even though it's built on a 64-bit machine.


You might be able to touch base with the developers on the freenode 
#datastax-drivers channel.


--
Kind regards,
Michael


Re: Datastax C++ driver on Windows x64

2014-03-04 Thread Michael Shuler

On 03/04/2014 04:12 PM, Dwight Smith wrote:
> Second that question
>
> *From:* Green, John M (HP Education) [mailto:john.gr...@hp.com]
> *Sent:* Tuesday, March 04, 2014 2:03 PM
> *To:* user@cassandra.apache.org
> *Subject:* Datastax C++ driver on Windows x64
>
> Has anyone successfully built the Datastax C++ driver for a Windows
> 64-bit platform?
>
> While I’ve made some progress I’m still not there and wondering if I
> should give-up and use a local socket to another process (e.g., JVM or
> .NET runtime) instead.  I’d prefer to use C++ because that’s what the
> rest of the application is using.  However, my C++ and makefile
> experience is very dated and I’ve never used cmake before.  Still I’d
> be very interested to know if anyone had success using the C++ driver on
> Windows x64.


http://cassci.datastax.com/job/y_cpp_driver_win32/lastBuild/consoleFull

Please, let me know, and I'll dig for some further details, if this 
doesn't fully help.  I did not set this particular job up, but jenkins 
runs the following batch script after git pull:



@echo off
cd C:\jenkins\workspace
mkdir y_cpp_driver_win32\bin
copy CMakeCache.txt y_cpp_driver_win32\bin
cd y_cpp_driver_win32\bin
cmake .
msbuild ALL_BUILD.vcxproj
msbuild UNINSTALL.vcxproj
msbuild INSTALL.vcxproj


That's it  :)

--
Kind regards,
Michael


Re: Datastax C++ driver on Windows x64

2014-03-04 Thread Michael Shuler

On 03/04/2014 04:30 PM, Michael Shuler wrote:
> On 03/04/2014 04:22 PM, Michael Shuler wrote:
>> On 03/04/2014 04:12 PM, Dwight Smith wrote:
>>> Second that question
>>>
>>> *From:* Green, John M (HP Education) [mailto:john.gr...@hp.com]
>>> *Sent:* Tuesday, March 04, 2014 2:03 PM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Datastax C++ driver on Windows x64
>>>
>>> Has anyone successfully built the Datastax C++ driver for a Windows
>>> 64-bit platform?
>>>
>>> While I’ve made some progress I’m still not there and wondering if I
>>> should give-up and use a local socket to another process (e.g., JVM or
>>> .NET runtime) instead.  I’d prefer to use C++ because that’s what the
>>> rest of the application is using.  However, my C++ and makefile
>>> experience is very dated and I’ve never used cmake before.  Still I’d
>>> be very interested to know if anyone had success using the C++ driver on
>>> Windows x64.
>>
>> http://cassci.datastax.com/job/y_cpp_driver_win32/lastBuild/consoleFull
>>
>> Please, let me know, and I'll dig for some further details, if this
>> doesn't fully help.  I did not set this particular job up, but jenkins
>> runs the following batch script after git pull:
>>
>> @echo off
>> cd C:\jenkins\workspace
>> mkdir y_cpp_driver_win32\bin
>> copy CMakeCache.txt y_cpp_driver_win32\bin
>> cd y_cpp_driver_win32\bin
>> cmake .
>> msbuild ALL_BUILD.vcxproj
>> msbuild UNINSTALL.vcxproj
>> msbuild INSTALL.vcxproj
>
> I may have replied a bit too quickly - it does look like this is using
> all 32-bit libs in the includes, even though it's built on a 64-bit
> machine.
>
> You might be able to touch base with the developers on the freenode
> #datastax-drivers channel.



I uploaded the CMakeCache.txt that is being copied over so you could 
peek at it, too.


http://cassci.datastax.com/userContent/y_cpp_driver_win32-config/

--
Michael


RE: Datastax C++ driver on Windows x64

2014-03-04 Thread Green, John M (HP Education)
Thanks Michael. This is the "ray of hope" I desperately needed.  I'll let 
you know how it goes.
 


Re: using cssandra cql with php

2014-03-04 Thread Bryan Talbot
I think the options for using CQL from PHP pretty much don't exist. Those
that do are very old, haven't been updated in months, and don't support
newer CQL features. Also I don't think any of them use the binary protocol
but use thrift instead.

From what I can tell, you'll be stuck using old CQL features from
unmaintained client drivers -- probably better to not be using CQL and PHP
together since mixing them seems pretty bad right now.


-Bryan



On Sun, Jan 12, 2014 at 11:27 PM, Jason Wee  wrote:

> Hi,
>
> The operating system should not matter, right? You just need to download a
> Cassandra client and use it to access the Cassandra node. For PHP, see
> http://wiki.apache.org/cassandra/ClientOptions. Perhaps you can package
> the Cassandra PDO driver into an RPM?
>
> Jason
>
>
> On Mon, Jan 13, 2014 at 3:02 PM, Tim Dunphy  wrote:
>
>> Hey all,
>>
>>  I'd like to be able to make calls to the cassandra database using PHP.
>> I've taken a look around but I've only found solutions out there for Ubuntu
>> and other distros. But my environment is CentOS.  Are there any packages
>> out there I can install that would allow me to use CQL in my PHP code?
>>
>> Thanks
>> Tim
>>
>> --
>> GPG me!!
>>
>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>
>>
>


Re: using cssandra cql with php

2014-03-04 Thread Robert McFrazier
I have started a php library that uses the cql binary protocol.

check out:
https://github.com/rmcfrazier/phpbinarycql

thanks,
Robert


Re: Datastax C++ driver on Windows x64

2014-03-04 Thread Check Peck
Hi Guys,

I have a couple of questions about the Datastax C++ driver. They are not
related to this particular post, but nobody is replying to my original email
thread, and in this thread I saw people discussing the Datastax C++ driver.

I'm not sure whether you will be able to help me, but I'm trying my luck.

We have a 36-node Cassandra cluster: 12 nodes in each of the DC1, DC2, and
DC3 datacenters.

Our application code also runs in all three datacenters: 11 nodes each in
DC1, DC2, and DC3.

My question is: if an application call originates in the DC1 datacenter,
will the cpp driver automatically route it to the DC1 Cassandra nodes? And
likewise for DC2 and DC3?

Or do we need to add some configuration in our C++ code when connecting to
Cassandra, to make sure that a call coming from the DC1 datacenter goes to
the DC1 Cassandra nodes?

If there is a config change that we need to add in our C++ code, can you
please point me to it?




C++ build under Ubuntu 12.04

2014-03-04 Thread Michael Dykman
I am getting errors running cmake on a *very* recent download of the
C++ driver's source tree.  It seems to be failing to find either the
boost::asio or the openssl libraries.  I definitely have both
installed, having developed against them recently (and rechecked
with dpkg today).

While I have brushed up against cmake before, I have never had to
modify CMakeLists.txt.  Could someone please advise me how to
adjust that file so it can find the external dependencies?
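
If the driver's CMakeLists.txt locates these libraries through CMake's standard FindBoost and FindOpenSSL modules (an assumption; the thread does not show the file), the search can often be steered from the command line rather than by editing CMakeLists.txt. The Python sketch below only assembles such a command. BOOST_ROOT and OPENSSL_ROOT_DIR are the hint variables those modules read, while the function name and the /usr paths are hypothetical.

```python
def cmake_command(source_dir=".", boost_root=None, openssl_root=None):
    """Build a cmake invocation with optional library location hints."""
    cmd = ["cmake"]
    if boost_root:
        # Hint read by CMake's FindBoost module.
        cmd.append("-DBOOST_ROOT=" + boost_root)
    if openssl_root:
        # Hint read by CMake's FindOpenSSL module.
        cmd.append("-DOPENSSL_ROOT_DIR=" + openssl_root)
    cmd.append(source_dir)
    return cmd


if __name__ == "__main__":
    # Hypothetical install prefixes; adjust to where dpkg placed the files.
    print(" ".join(cmake_command(boost_root="/usr", openssl_root="/usr")))
```

Deleting any stale CMakeCache.txt before re-running is usually worthwhile, since failed lookups from an earlier run can be cached.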

-- 
 - michael dykman
 - mdyk...@gmail.com

 May the Source be with you.


Re: Very Slow Node Startup

2014-03-04 Thread Robert Coli
On Tue, Mar 4, 2014 at 2:28 PM, Nate McCall  wrote:

> The commit log replay is single threaded, so if you have a ton of
> overwrites in a whole lot of commit log (like you would with a queue
> pattern) it might be backing up.
>
> The only real work around to this right now would be to turn off durable
> writes to the queue schema.
>

Or you can call "nodetool drain" every time before you shut down? This will
mark the entire commit log clean and there will be nothing to replay on
startup...
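
A minimal Python sketch of that "drain, then stop" order, assuming nodetool is on the PATH and that the node is stopped with "service cassandra stop" (the stop command and the function name are assumptions, not from this thread):

```python
import subprocess


def drain_then_stop(run=subprocess.check_call,
                    stop_cmd=("service", "cassandra", "stop")):
    """Flush memtables with 'nodetool drain', then stop the service.

    check_call raises CalledProcessError on a non-zero exit status, so the
    stop command is not issued if the drain fails.
    """
    run(["nodetool", "drain"])
    run(list(stop_cmd))
```

The runner is a parameter so the ordering can be exercised without a live node, for example by passing a list's append method and inspecting the recorded commands.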

=Rob


Re: Very Slow Node Startup

2014-03-04 Thread Robert Coli
On Tue, Mar 4, 2014 at 12:34 PM, Charlie Mason wrote:

> I have single node cluster I use for development on my local machine.
> After apt package upgrades and hard reboots the node takes a very long time
> to restart.
>

Regarding "after apt package upgrades", this is yet another case for:

https://issues.apache.org/jira/browse/CASSANDRA-2356

Where your input is welcome! :D

If apt automatically restarts your Cassandra server, you don't have a
chance to run "nodetool drain" to avoid commit log replay...

=Rob


mixed nodes, some SSD some HD

2014-03-04 Thread Elliot Finley
Using Cassandra 2.0.x

If I have a 3 node cluster and 2 of the nodes use spinning drives and 1 of
them uses SSD,  will the majority of the reads be routed to the SSD node
automatically because it has faster responses?

TIA,
Elliot


Noticing really high read latency

2014-03-04 Thread Eric Plowe
Background info:

6-node cluster
24 GB of RAM per machine
8 GB of RAM dedicated to C*
4 x 4-core CPUs
2 x 250 GB SSDs in RAID 0
Running C* 1.2.6

The CF is configured as follows:

CREATE TABLE behaviors (
  uid text,
  buid int,
  name text,
  expires text,
  value text,
  PRIMARY KEY (uid, buid, name)
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'sstable_size_in_mb': '160', 'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

I am noticing that the read latency is very high, judging by the output of
nodetool cfstats.

This is example output from one of the nodes:

  Column Family: behaviors
SSTable count: 2
SSTables in each level: [1, 1, 0, 0, 0, 0, 0, 0, 0]
Space used (live): 171496198
Space used (total): 171496591
Number of Keys (estimate): 1153664
Memtable Columns Count: 14445
Memtable Data Size: 1048576
Memtable Switch Count: 1
Read Count: 1894
Read Latency: 0.497 ms.
Write Count: 7169
Write Latency: 0.041 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4
Bloom Filter False Ratio: 0.00862
Bloom Filter Space Used: 3533152
Compacted row minimum size: 125
Compacted row maximum size: 9887
Compacted row mean size: 365

The write latency is awesome, but the read latency, not so much. The output
of iostat doesn't show anything out of the ordinary. The CPU utilization is
between 1% and 5%.

All read queries are issued with a CL of ONE. We always include "WHERE uid
= ''" for the queries.

If there is any more info I can provide, please let me know. At this point
in time, I am a bit stumped.

Regards,

Eric Plowe