About Cassandra-Hadoop(Pig) Integration issue

2013-12-10 Thread pradeep kumar
Hello Cassandra users, For one of our new Big data BI projects, we are using Apache Cassandra 1.2.10 as our primary data store with the support of Hadoop for analytics. For prototyping purposes we have 1 node each for Apache Cassandra/Hadoop. Pig is our choice to process the data from/to C*. B

Re: RPC timeout error while exporting data from CQL

2013-09-18 Thread pradeep kumar
Experts, any help? On Wed, Sep 18, 2013 at 2:55 PM, pradeep kumar wrote: > Hello all, > > I am trying to export data from Cassandra using the CQL client. A column > family has about 10 rows in it. When I copy data into a CSV file > using the COPY TO command I get the following rpc

RPC timeout error while exporting data from CQL

2013-09-18 Thread pradeep kumar
Hello all, I am trying to export data from Cassandra using the CQL client. A column family has about 10 rows in it. When I copy data into a CSV file using the COPY TO command I get the following rpc_timeout error. copy mycolfamily to '/root/mycolfamily.csv' Request did not complete within rpc_timeout

Cassandra nodetool could not resolve '127.0.0.1': unknown host

2013-09-17 Thread pradeep kumar
I am very new to Cassandra and have just started exploring. I am running a single-node Cassandra server and cannot see the status of Cassandra using the nodetool command. I have the hostname configured on my VM as myMachineIP cass1 in /etc/hosts and I configured my cassandra_instal_path/conf/cass

Re: Pycassa KEY read error.

2013-02-05 Thread Pradeep Kumar Mantha
qPU > > You may want to check your subscription to the pycassa mailing list; it > seems like you're not getting my responses for some reason. > > > On Tue, Feb 5, 2013 at 12:20 PM, Pradeep Kumar Mantha < > pradeep...@gmail.com> wrote: > >> Hi, >> >

Pycassa KEY read error.

2013-02-05 Thread Pradeep Kumar Mantha
Hi, I am trying to read fields using the pycassa API, but it seems I am missing something and am not getting the expected results. >>> pool = pycassa.ConnectionPool('usertable', server_list=['1.1.1.1']) >>> cf = pycassa.ColumnFamily(pool, 'data') >>> cf.get('7573657232323132333035343936323937363138343433'

Re: Pycassa vs YCSB results.

2013-02-05 Thread Pradeep Kumar Mantha
python. > > > I'd also closely compare the IO going on in both versions (the .write > calls). For example this may be significantly faster: > > et=time_fn() > f.write(str(colfam.get(key))+"\nTime taken for a single query is " > + str(round(1000*(et-st),
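
The advice quoted in this snippet is to take the end timestamp before formatting and writing the result, so that file IO is not counted in the measured query time. A minimal sketch of that idea follows, under stated assumptions: `timed_get` is a hypothetical helper, a plain dict stands in for the pycassa column family, and `time.perf_counter` is used in place of whatever `time_fn` was in the original script.

```python
import io
import time


def timed_get(get_fn, key, out):
    """Time a single lookup, excluding the cost of writing the result."""
    st = time.perf_counter()
    result = get_fn(key)
    et = time.perf_counter()  # stop the clock before any formatting or IO
    elapsed_ms = round(1000 * (et - st), 3)
    out.write(str(result) + "\nTime taken for a single query is "
              + str(elapsed_ms) + " ms\n")
    return elapsed_ms


# Stand-in for colfam.get (assumption); a real run would pass the
# column family's get method and an open file instead of StringIO.
fake_store = {"user1": {"field0": "value0"}}
buf = io.StringIO()
elapsed = timed_get(fake_store.get, "user1", buf)
```

Because the write happens after `et` is captured, a slow disk or a large result string does not inflate the per-query figure.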

Re: Pycassa vs YCSB results.

2013-02-04 Thread Pradeep Kumar Mantha
g the pycassa performance. Please have a look at the simple python script attached and let me know your suggestions. thanks pradeep On Thu, Jan 31, 2013 at 4:53 PM, Pradeep Kumar Mantha wrote: > > > On Thu, Jan 31, 2013 at 4:49 PM, Pradeep Kumar Mantha < > pradeep...@gmail.com> wrot

Re: Pycassa vs YCSB results.

2013-01-31 Thread Pradeep Kumar Mantha
PM, Tyler Hobbs wrote: > Can you provide the python script that you're using? > > (I'm moving this thread to the pycassa mailing list ( > pycassa-disc...@googlegroups.com), which is a better place for this > discussion.) > > > On Thu, Jan 31, 2013 at 6:25 PM

Re: Cassandra Performance Benchmarking.

2013-01-21 Thread Pradeep Kumar Mantha
nds is very suspicious. I can't > debug your script over the mailing list, but do some sanity checks to make > sure there's not a bottleneck somewhere you don't expect. > > > On Fri, Jan 18, 2013 at 12:44 PM, Pradeep Kumar Mantha > wrote: >> >> Hi, >>

Re: Cassandra Performance Benchmarking.

2013-01-18 Thread Pradeep Kumar Mantha
tionPool size to handle the number of > threads you have using it concurrently. Set the pool_size kwarg to at least > the number of threads you're using. > > > On Thu, Jan 17, 2013 at 6:46 PM, Pradeep Kumar Mantha > wrote: >> >> Thanks Tyler. >> >> I ju

Re: Cassandra Performance Benchmarking.

2013-01-17 Thread Pradeep Kumar Mantha
etwork latency, you'll top out on python performance with > a fairly low number of threads due to the GIL. It's best to use multiple > processes if you really want to benchmark something. > > > On Thu, Jan 17, 2013 at 6:05 PM, Pradeep Kumar Mantha > wrote: >>
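
The reply quoted in this snippet recommends multiple processes rather than threads, since the GIL caps Python thread throughput. A rough sketch of that approach with the standard-library `multiprocessing` module is below. `fake_get`, `run_queries`, and `run_benchmark` are hypothetical names, and `fake_get` is only a stand-in for a real read; in an actual benchmark each worker process would presumably open its own connection.

```python
import multiprocessing
import time


def fake_get(key):
    """Hypothetical stand-in for a real read such as a column family get."""
    return {"key": key}


def run_queries(n_queries):
    """Run n_queries reads in this process; return (count, elapsed seconds)."""
    st = time.perf_counter()
    for i in range(n_queries):
        fake_get("user%d" % i)
    return n_queries, time.perf_counter() - st


def run_benchmark(n_processes, queries_per_process):
    """Fan the workload out over separate processes, each with its own GIL."""
    with multiprocessing.Pool(n_processes) as pool:
        results = pool.map(run_queries, [queries_per_process] * n_processes)
    total = sum(count for count, _ in results)
    slowest = max(elapsed for _, elapsed in results)
    return total, slowest
```

To try it, call something like `run_benchmark(4, 1000)` from under an `if __name__ == "__main__":` guard; dividing the total query count by the slowest worker's elapsed time gives a conservative queries-per-second estimate.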

Re: Cassandra Performance Benchmarking.

2013-01-17 Thread Pradeep Kumar Mantha
adge for that. > > You should use the built in stress tool or YCSB. > > The CLI has to do much more string conversion than a normal client would and > it is not built for performance. You will definitely get better numbers > through other means. > > On Thu, Jan 17, 2013 at

Cassandra Performance Benchmarking.

2013-01-17 Thread Pradeep Kumar Mantha
Hi, I am trying to maximize the number of read queries executed per second. Here is my cluster configuration. Replication - Default 12 Data Nodes. 16 Client Nodes - used for querying. Each client node executes 32 threads - each thread executes 76896 read queries using cassandra-cli tool.

Re: Loading sstables to Cassandra using sstableloader and JMX client

2012-12-20 Thread Pradeep Kumar Mantha
Hi, The directory information should contain the entire path to the sstables location. 'C:\Anand\Workspace\H2C_POC\Customer\. I assume customer is the keyspace. Hope it helps. thanks pradeep On Thu, Dec 20, 2012 at 6:15 AM, wrote: > Hi > > > > I am working on options to load my sstables to loa

Re: Loading SSTables failing via Cassandra SSTableLoader on mulit-node cluster.

2012-12-05 Thread Pradeep Kumar Mantha
ion for the sstableloader ? Background > configuration section here http://www.datastax.com/dev/blog/bulk-loading > > > Hope that helps. > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com >

Fwd: Loading SSTables failing via Cassandra SSTableLoader on mulit-node cluster.

2012-12-04 Thread Pradeep Kumar Mantha
Hi! I am trying to load generated sstables onto a running multi-node Cassandra cluster. I see problems only with the multi-node cluster; a single node works fine. Cassandra version used is 1.1.2. The cassandra cluster seems to be active. -bash-3.2$ nodetool -host 129.56.57.45 -p 7199 ring Address