Why can't I do a "count(*) ... allow filtering" without hitting an operation timeout?

2015-09-04 Thread shahab
Hi, This is probably a silly problem, but it is really serious for me. I have a cluster of 3 nodes, with replication factor 2. But I still cannot do a simple "select count(*) from ...", using either DevCenter or "cqlsh". Any idea how this can be done? best, /Shahab
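
One common workaround for this kind of timeout is to count in slices of the token ring and add the partial counts up on the client. The sketch below only computes the slices and formats the queries; it assumes the Murmur3Partitioner (token range -2^63 to 2^63-1), and the table name `my_ks.my_table` and partition key `id` are hypothetical placeholders, not names from the thread.

```python
# Sketch only: assumes Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def token_subranges(splits):
    """Split the full token range into `splits` inclusive, non-overlapping slices."""
    if splits < 1:
        raise ValueError("splits must be >= 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // splits for i in range(splits + 1)]
    starts = bounds[:-1]
    # Each slice ends one token before the next slice starts; the last ends at MAX_TOKEN.
    ends = [b - 1 for b in bounds[1:-1]] + [MAX_TOKEN]
    return list(zip(starts, ends))


def count_query(table, key, start, end):
    """Format one per-slice count query (table and key names are hypothetical)."""
    return (
        "SELECT count(*) FROM %s WHERE token(%s) >= %d AND token(%s) <= %d"
        % (table, key, start, key, end)
    )


if __name__ == "__main__":
    for start, end in token_subranges(4):
        print(count_query("my_ks.my_table", "id", start, end))
```

Each query touches only part of the data, so it is less likely to hit the timeout; if a slice still times out, a larger `splits` value gives smaller slices. The total is the sum of the per-slice counts.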

How to export query results (millions of rows) in CSV format?

2015-07-07 Thread shahab
?!!! Does anyone have any idea how to do this? best, /Shahab
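
A minimal sketch of one way to approach this: stream the rows to a CSV file one at a time, so that millions of rows never have to sit in memory together. The function name `export_rows_to_csv` is made up for illustration, and `rows` can be any iterable of tuples, for example a result set from a driver that fetches pages lazily (an assumption about the client in use, not something stated in the thread).

```python
import csv
import io


def export_rows_to_csv(rows, header, out):
    """Write `header` and then every row from `rows` to the file-like `out`.

    Rows are consumed one by one, so `rows` may be a lazy iterator.
    Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(header)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in data; in practice `rows` would come from the query.
    sample = iter([(1, "sensor-a", 20.5), (2, "sensor-b", 21.0)])
    buf = io.StringIO()
    written = export_rows_to_csv(sample, ["id", "name", "value"], buf)
    print(written)
    print(buf.getvalue())
```

To write to disk, open the target with `open(path, "w", newline="")` and pass the file object as `out`.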

How to measure disk space used by a keyspace?

2015-06-29 Thread shahab
Hi, Probably this question has been already asked in the mailing list, but I couldn't find it. The question is how to measure disk-space used by a keyspace, column family wise, excluding snapshots? best, /Shahab
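
One rough way to get this number is to walk the keyspace's data directory and sum file sizes per column family while skipping the `snapshots` subdirectories. The sketch below assumes the usual on-disk layout `<data_dir>/<keyspace>/<column_family>/...`; the path `/var/lib/cassandra/data/my_keyspace` in the usage line is a hypothetical example, and the function name is made up for illustration.

```python
import os


def keyspace_disk_usage(keyspace_dir, skip_dirs=("snapshots",)):
    """Return {column_family_dir_name: bytes_on_disk} for one keyspace directory.

    Any directory whose name is in `skip_dirs` is left out of the totals.
    """
    usage = {}
    for table in sorted(os.listdir(keyspace_dir)):
        table_dir = os.path.join(keyspace_dir, table)
        if not os.path.isdir(table_dir):
            continue
        total = 0
        for root, dirs, files in os.walk(table_dir):
            # Prune skipped directories in place so os.walk never descends into them.
            dirs[:] = [d for d in dirs if d not in skip_dirs]
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        usage[table] = total
    return usage


if __name__ == "__main__":
    # Hypothetical path; adjust to the data directory configured on the node.
    for cf, size in keyspace_disk_usage("/var/lib/cassandra/data/my_keyspace").items():
        print(cf, size)
```

This measures a single node only, so the figures have to be collected per node. Passing `skip_dirs=("snapshots", "backups")` would also leave out incremental backups, if that is wanted.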

Re: How to store denormalized data

2015-06-03 Thread Shahab Yunus
e solution? What harm in it? Also, you can slightly change it, (if applicable) and not populate as a separate batch process but in fact make part of your analysis job? Kind of a pre-process/prep step? Regards, Shahab On Wed, Jun 3, 2015 at 10:48 AM, Matthew Johnson wrote: > Hi all, > >

Re: Data model suggestions

2015-04-26 Thread Shahab Yunus
Interesting approach Oded. Is this similar to what has been described here: http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html Regards, Shahab On Sun, Apr 26, 2015 at 4:29 AM, Peer, Oded wrote: > I would maintain two tables. > > An “archive” table that

Getting "ParNew GC in ... CMS Old Gen ..." in logs

2015-04-20 Thread shahab
312; Par Eden Space: 167712624 -> 0; Par Survivor Space: 0 -> 20970080 Is the above line an indication of something that needs to be fixed in the system? How can I resolve this? best, /Shahab

Re: best supported spark connector for Cassandra

2015-02-11 Thread shahab
I am using the Calliope cassandra-spark connector (http://tuplejump.github.io/calliope/), which is quite handy and easy to use! The only problem is that it is a bit outdated: it works with Spark 1.1.0. Hopefully a new version comes soon. best, /Shahab On Wed, Feb 11, 2015 at 2:51 PM, Marcelo Valle

Why RDD is not cached?

2014-10-27 Thread shahab
n results. Any idea what I am missing in my settings, or... ? thanks, /Shahab

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread shahab
Thanks Tyler for sharing this. It is exactly what I was looking for. best, /Shahab On Thu, Oct 23, 2014 at 5:37 PM, Tyler Hobbs wrote: > CASSANDRA-8091 (Stress tool creates too large batches) is relevant: > https://issues.apache.org/jira/browse/CASSANDRA-8091 > > On Thu,

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread shahab
OK, Thanks again Jens. best, /Shahab On Thu, Oct 23, 2014 at 1:22 PM, Jens Rantil wrote: > Hi again Shabab, > > Yes, it seems that way. I have no experience with the “cassandra stress > tool”, but wouldn’t be surprised if the batch size could be tweaked. > > Cheers, > Jen

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread shahab
Thanks Jens for the comments. As I am trying the "cassandra stress tool", does it mean that the tool is executing batches of "Insert" statements (probably hundreds or thousands) against Cassandra, for the sake of stressing Cassandra? best, /Shahab On Wed, Oct 22, 2014 at 8:14 PM

Re: Increasing size of "Batch of prepared statements"

2014-10-06 Thread shahab
with large size? best, /Shahab On Sun, Oct 5, 2014 at 6:03 PM, Jens Rantil wrote: > Shabab, > If you are hitting this limit because you are inserting a lot of (CQL) > rows in a single batch I suggest you split the statement up in multiple > smaller batches. Generally, large inserts l
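
The advice quoted above, splitting one large batch into several smaller ones, can be sketched as a small chunking helper. The function name and the batch size of 2 are arbitrary choices for illustration; sending each chunk as its own batch is left to whatever client is in use.

```python
def split_into_batches(statements, batch_size):
    """Yield consecutive lists of at most `batch_size` statements."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    batch = []
    for statement in statements:
        batch.append(statement)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Final, possibly shorter, batch.
        yield batch


if __name__ == "__main__":
    inserts = ["INSERT 1", "INSERT 2", "INSERT 3", "INSERT 4", "INSERT 5"]
    for chunk in split_into_batches(inserts, 2):
        print(chunk)
```

Because the helper is a generator, it also works when the statements themselves are produced lazily.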

Why are the results of the Cassandra Stress Tool much worse than normal reading/writing from Cassandra?

2014-10-05 Thread shahab
ague documentation there. I would appreciate it if anyone could help me understand the output of the Stress-Tool. BTW, I have already seen this one (but still the documentation is quite poor): http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStressOutput_c.html best, /Shahab

Re: Increasing size of "Batch of prepared statements"

2014-10-05 Thread shahab
Thanks Shane. best, /Shahab On Fri, Oct 3, 2014 at 6:51 PM, Shane Hansen wrote: > It appears to be configurable in cassandra.yaml > using batch_size_warn_threshold > > https://issues.apache.org/jira/browse/CASSANDRA-6487 > > > On Fri, Oct 3, 2014 at 10:47 AM, shahab wrote

Increasing size of "Batch of prepared statements"

2014-10-03 Thread shahab
there any way to change the default value? thanks /Shahab

cassandra stress tools

2014-10-01 Thread shahab
s outdated, there is an output parameter "partition_rate" which is not explained in documentation? best, /Shahab

Regarding Cassandra-Stress tool

2014-10-01 Thread shahab
s outdated, there is an output parameter "partition_rate" which is not explained in documentation? best, /Shahab

Re: using dynamic cell names in CQL 3

2014-09-25 Thread shahab
QL 3 best, /Shahab On Wed, Sep 24, 2014 at 1:49 PM, DuyHai Doan wrote: > Dynamic thing in Thrift ≈ clustering columns in CQL > > Can you give more details about your data model ? > > On Wed, Sep 24, 2014 at 1:11 PM, shahab wrote: > >> Hi, >> >> I would like to de

using dynamic cell names in CQL 3

2014-09-24 Thread shahab
ource/example that I can look at ? best, /Shahab

Re: Machine Learning With Cassandra

2014-08-30 Thread Shahab Yunus
and thus you can run complex ML algorithms relatively faster. I think we just discussed this a short while ago when a similar question (storm vs. spark, I think) was raised by you earlier. Here is the link for that discussion: http://markmail.org/message/lc4icuw4hobul6oh Regards, Shahab On Sat

Re: Why "select count(*) from .." hangs?

2014-03-26 Thread shahab
Thanks for the hints. I got a better picture of how to deal with "count" queries. On Tue, Mar 25, 2014 at 7:01 PM, Robert Coli wrote: > On Tue, Mar 25, 2014 at 8:36 AM, shahab wrote: > >> But after iteration 8, (i.e. inserting 150 sensor data), the >> "

Re: Why "select count(*) from .." hangs?

2014-03-25 Thread shahab
AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; On Tue, Mar 25, 2014 at 4:58 PM, Michael Shuler wrote: > On 03/25/2014 10:36 AM, shahab wrote: > >> In our applicatio

Why "select count(*) from .." hangs?

2014-03-25 Thread shahab
)..." using Datastax DevCenter GUI, but I got the same result. I am sure that I have missed something or misunderstood how Cassandra works, but I don't really know what. I would appreciate any hints. best, /Shahab

Re: Modeling multi-tenanted Cassandra schema

2013-11-13 Thread Shahab Yunus
Nate, (slightly OT), what client API/library is recommended now that Hector is sunsetting? Thanks. Regards, Shahab On Wed, Nov 13, 2013 at 9:28 AM, Nate McCall wrote: > You basically want option (c). Option (d) might work, but you would be > bending the paradigm a bit, IMO. Certainly

Re: Deleting data using timestamp

2013-10-09 Thread Shahab Yunus
Ahh, yes, 'compaction'. I blanked out while mentioning repair and cleanup. That is in fact what needs to be done first and what I meant. Thanks Robert. Regards, Shahab On Wed, Oct 9, 2013 at 1:50 PM, Robert Coli wrote: > On Wed, Oct 9, 2013 at 7:35 AM, Ravikumar

Re: Deleting data using timestamp

2013-10-09 Thread Shahab Yunus
I might be missing something obvious here but can't you afford (time-wise) to run cleanup or repair after the deletion so that the deleted data is gone? Assuming that your columns are time-based data? Regards, Shahab On Wed, Oct 9, 2013 at 10:35 AM, Ravikumar Govindarajan < ravikumar.g

Re: Deleting Row Key

2013-10-05 Thread Shahab Yunus
tool Regards, Shahab On Sat, Oct 5, 2013 at 7:06 PM, Shahab Yunus wrote: > Yes you can: > > http://hbase.apache.org/book/regions.arch.html#compaction > http://hbase.apache.org/book/important_configurations.html (Managed > Compaction section) > > Regards, > Shahab

Re: Deleting Row Key

2013-10-05 Thread Shahab Yunus
Yes you can: http://hbase.apache.org/book/regions.arch.html#compaction http://hbase.apache.org/book/important_configurations.html (Managed Compaction section) Regards, Shahab On Sat, Oct 5, 2013 at 6:02 PM, Sebastian Schmidt wrote: > Am 06.10.2013 00:00, schrieb Cem Cayiroglu: > > I

Re: get float column in cassandra mapreduce

2013-10-05 Thread Shahab Yunus
temperature' or 'temprature'? You are using the latter in your code and if it is not what is in the data then you might be trying to parse empty or malformed string. Regards, Shahab On Sat, Oct 5, 2013 at 5:16 AM, Anseh Danesh wrote: > Hi all... I have a question. in the cassandr

Re: questions related to the SSTable file

2013-09-17 Thread Shahab Yunus
Thanks Robert for the answer. It makes sense. If that happens then it means that your design or use case needs some rework ;) Regards, Shahab On Tue, Sep 17, 2013 at 2:37 PM, java8964 java8964 wrote: > Another question related to the SSTable files generated in the incremental > backup

Re: questions related to the SSTable file

2013-09-17 Thread Shahab Yunus
SSTable? I am also interested in knowing the answer. Regards, Shahab On Tue, Sep 17, 2013 at 9:50 AM, java8964 java8964 wrote: > Thanks Dean for clarification. > > But if I put hundreds of megabyte data of one row through one put, what > you mean is Cassandra will put all of them into

Re: Cassandra nodetool could not resolve '127.0.0.1': unknown host

2013-09-17 Thread Shahab Yunus
Have you tried specifying your hostname (not localhost) in cassandra.yaml and starting it? Regards, Shahab On Tue, Sep 17, 2013 at 8:39 AM, pradeep kumar wrote: > I am very new to cassandra. Just started exploring. > > I am running a single node cassandra server & facing a prob

Re: VMs versus Physical machines

2013-09-12 Thread Shahab Yunus
ore columns at a time. Regards, Shahab On Thu, Sep 12, 2013 at 1:51 AM, Aaron Turner wrote: > > > > > On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus wrote: > >> Thanks Aaron for the reply. Yes, VMs or the nodes will be in cloud if we >> don't go the phys

Re: VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
s no difference whether we use physical or VMs (in cloud)? Regards, Shahab On Wed, Sep 11, 2013 at 7:34 PM, Aaron Turner wrote: > Physical machines unless you're running your cluster in the cloud > (AWS/etc). > > Reason is simple: Look how Cassandra scales and provides red

VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
. Data size? Writing speed (whether write-heavy use cases or not)? Random read use cases? Column family design/how we store data? Any pointers, documents, guidance, advice would be appreciated. Thanks a lot. Regards, Shahab

Re: Help on Cassandra Limitations

2013-09-06 Thread Shahab Yunus
Also, Sylvain, you have a couple of great posts about relationships between CQL3/Thrift entities and naming issues: http://www.datastax.com/dev/blog/cql3-for-cassandra-experts http://www.datastax.com/dev/blog/thrift-to-cql3 I always refer to them when I get confused :) Regards, Shahab On Fri

Re: Cassandra Reads

2013-09-06 Thread Shahab Yunus
://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html?pagename=docs&version=1.2&file=#cassandra/dml/dml_about_reads_c.html http://www.roman10.net/how-apache-cassandra-read-works/ http://wiki.apache.org/cassandra/ArchitectureInternals Regards, Shahab On Fri, Sep 6, 2013 at 6:28 AM,

Re: Secondary Indexes On Partitioned Time Series Data Question

2013-08-01 Thread Shahab Yunus
Thanks a lot. Regards, Shahab On Thu, Aug 1, 2013 at 8:32 PM, Robert Coli wrote: > On Thu, Aug 1, 2013 at 2:34 PM, Shahab Yunus wrote: > >> Can you shed some more light (or point towards some other resource) that >> why you think built-in Secondary Indexes should not

Re: Secondary Indexes On Partitioned Time Series Data Question

2013-08-01 Thread Shahab Yunus
Hi Robert, Can you shed some more light (or point towards some other resource) on why you think built-in Secondary Indexes should not be used easily or without much consideration? Thanks. Regards, Shahab On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli wrote: > On Thu, Aug 1, 2013 at 12:49

Re: VM dimensions for running Cassandra and Hadoop

2013-07-31 Thread Shahab Yunus
Hi Jan, One question...you say "- I must make sure the disks are directly attached, to prevent problems when multiple nodes flush the commit log at the same time" What do you mean by that? Thanks, Shahab On Wed, Jul 31, 2013 at 3:10 AM, Jan Algermissen wrote: > Jon, >

Re: MapReduce response time and speed

2013-07-24 Thread Shahab Yunus
tune parameters. Regards, Shahab On Wed, Jul 24, 2013 at 10:33 AM, Jan Algermissen < jan.algermis...@nordsc.com> wrote: > Hi, > > I am Jan Algermissen (REST-head, freelance programmer/consultant) and > Cassandra-newbie. > > I am looking at Cassandra for an application I a

Re: Representation of dynamically added columns in table (column family) schema using cqlsh

2013-07-23 Thread Shahab Yunus
See this as this was discussed earlier: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Representation-of-dynamically-added-columns-in-table-column-family-schema-using-cqlsh-td7588997.html Regards, Shahab On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus wrote: > A basic quest

Re: Unable to describe table in CQL 3

2013-07-23 Thread Shahab Yunus
Rahul, See this as it was discussed earlier: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Representation-of-dynamically-added-columns-in-table-column-family-schema-using-cqlsh-td7588997.html Regards, Shahab On Tue, Jul 23, 2013 at 2:51 PM, Rahul Gupta wrote: > I am us

Re: Auto Discovery of Hosts by Clients

2013-07-22 Thread Shahab Yunus
Thanks for your replies. Regards, Shahab On Sun, Jul 21, 2013 at 4:49 PM, aaron morton wrote: > Give the app the same nodes you have in the seed lists. > > Cheers > > - > Aaron Morton > Cassandra Consultant > New Zealand > > @aaronmorton > http:

Re: Socket buffer size

2013-07-20 Thread Shahab Yunus
I think the former is for client communication to the nodes and the latter for communication between the nodes themselves, as is evident from the name of the property. Please feel free to correct me if I am wrong. Regards, Shahab On Saturday, July 20, 2013, Mohammad Hajjat wrote: > Hi, > > W

Auto Discovery of Hosts by Clients

2013-07-19 Thread Shahab Yunus
aries with the client API that I am using? Thanks a lot. Regards, Shahab

Re: IllegalArgumentException on query with AbstractCompositeType

2013-07-13 Thread Shahab Yunus
Aaron Morton can confirm, but I think one problem could be that creating an index on a field with a small number of possible values is not a good idea. Regards, Shahab On Sat, Jul 13, 2013 at 9:14 AM, Tristan Seligmann wrote: > On Fri, Jul 12, 2013 at 10:38 AM, aaron morton wrote: > >> CRE

Re: Representation of dynamically added columns in table (column family) schema using cqlsh

2013-07-12 Thread Shahab Yunus
Thanks Eric for the explanation. Regards, Shahab On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus wrote: > A basic question and it seems that I have a gap in my understanding. > > I have a simple table in Cassandra with multiple column families. I add > new columns to each of

Representation of dynamically added columns in table (column family) schema using cqlsh

2013-07-12 Thread Shahab Yunus
n this one column1 or value using the 'SELECT' statement. The OpsCenter on the other hand, displays multiple columns as expected. Basically the demarcation of multiple columns is clearer. Thanks a lot. Regards, Shahab

Re: what happens if coordinator node fails during write

2013-06-29 Thread Shahab Yunus
hanks. Regards, Shahab On Friday, June 28, 2013, aaron morton wrote: > As far as I know in 1.2 coordinator logs request before it updates > replicas. > > You may be thinking about atomic batches, which are enabled by default for > 1.2 via CQL but must be supported by Thrift clients.

Re: block size

2013-06-20 Thread Shahab Yunus
hile developing with) Cassandra unlike Hadoop. Regards, Shahab On Thu, Jun 20, 2013 at 3:38 PM, Kanwar Sangha wrote: > Yes. Is that not specific to hadoop with CFS ? I want to know that If I > have a data in column of size 500KB, how many IOPS are needed to read that > ? (assum

Re: block size

2013-06-20 Thread Shahab Yunus
Have you seen this? http://www.datastax.com/dev/blog/cassandra-file-system-design Regards, Shahab On Thu, Jun 20, 2013 at 3:17 PM, Kanwar Sangha wrote: > Hi – What is the block size for Cassandra ? is it taken from the OS > defaults ? >

Re: Unit Testing Cassandra

2013-06-19 Thread Shahab Yunus
Thanks Edward, Ben and Dean for the pointers. Yes, I am using Java and these sound promising for unit testing, at least. Regards, Shahab On Wed, Jun 19, 2013 at 9:58 AM, Edward Capriolo wrote: > You really do not need much in java you can use the embedded server. > Hector wrap a simple

Re: Unit Testing Cassandra

2013-06-19 Thread Shahab Yunus
e for now. I do see some stuff out there but wanted to know recommendations from the community given their experience. Regards, Shahab On Wed, Jun 19, 2013 at 3:15 AM, Stephen Connolly < stephen.alan.conno...@gmail.com> wrote: > Unit testing means testing in isolation the smallest

Re: Dropped mutation messages

2013-06-19 Thread Shahab Yunus
Hello Arthur, What do you mean by "The queries need to be lightened"? Thanks, Shahab On Tue, Jun 18, 2013 at 8:47 PM, Arthur Zubarev wrote: > Cem hi, > > as per http://wiki.apache.org/cassandra/FAQ#dropped_messages > > > Internode messages which are received by a node, but do not get not to b

Unit Testing Cassandra

2013-06-18 Thread Shahab Yunus
Hello, Can anyone suggest good/popular unit test tools/frameworks/utilities out there for unit testing Cassandra stores? I am looking for testing from a performance/load and monitoring perspective. I am using 1.2. Thanks a lot. Regards, Shahab

Re: Dynamic Columns Question Cassandra 1.2.5, Datastax Java Driver 1.0

2013-06-06 Thread Shahab Yunus
Dynamic columns are not supported in CQL3. We just had a discussion a day or two ago about this where Eric Stevens explained it. Please see this: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/CQL-3-returning-duplicate-keys-td7588181.html Regards, Shahab On Thu, Jun 6, 2013 at

Re: Multiple JBOD data directory

2013-06-05 Thread Shahab Yunus
Though I am a newbie, I just had a thought regarding your question 'How will it handle requests for data which unavailable?': wouldn't the data be served in that case from other nodes where it has been replicated? Regards, Shahab On Wed, Jun 5, 2013 at 5:32 AM, Christo

Re: CQL 3 returning duplicate keys

2013-06-05 Thread Shahab Yunus
Thanks Eric. Yeah, I was asking about the second limitation (about dynamic columns) and you have explained it well along with pointers to read further. Regards, Shahab On Wed, Jun 5, 2013 at 8:18 AM, Eric Stevens wrote: > I mentioned a few limitations, so I'm not sure which you

Re: CQL 3 returning duplicate keys

2013-06-04 Thread Shahab Yunus
Thanks Eric for the detailed explanation, but can you point to a source or document for this restriction in CQL3 tables? Doesn't it take away the main feature of the NoSQL store? Or am I missing something obvious here? Regards, Shahab On Tue, Jun 4, 2013 at 2:12 PM, Eric Stevens wrote: