Just as an example:
cassandra.thrift.address = 10.12.34.56
cassandra.thrift.port = 9160
cassandra.partitioner.class = org.apache.cassandra.dht.RandomPartitioner
On Apr 19, 2011, at 10:28 PM, Jeremy Hanna wrote:
> oh yeah - that's what's going on. what I do i
Oh yeah - that's what's going on. What I do is this: on the machine that I run the
pig script from, I set the PIG_CONF variable to my HADOOP_HOME/conf directory,
and in the mapred-site.xml file found there I set the three variables.
I don't use environment variables when I run against a cluster.
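For reference, a minimal sketch of what those three variables look like in
mapred-site.xml, using the example values quoted earlier in this thread (adjust
the address, port and partitioner to your cluster):

    <property>
      <name>cassandra.thrift.address</name>
      <value>10.12.34.56</value>
    </property>
    <property>
      <name>cassandra.thrift.port</name>
      <value>9160</value>
    </property>
    <property>
      <name>cassandra.partitioner.class</name>
      <value>org.apache.cassandra.dht.RandomPartitioner</value>
    </property>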
On A
Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a
while before I added that.
-Jeffrey
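For example, a one-line sketch of that setting in $HADOOP_HOME/conf/hadoop-env.sh
(9160 is the Thrift port used as the example elsewhere in this thread):

    export PIG_RPC_PORT=9160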
From: pob [mailto:peterob...@gmail.com]
Sent: Tuesday, April 19, 2011 6:42 PM
To: user@cassandra.apache.org
Subject: Re: pig + hadoop
Hey Aaron,
I read it, and all of 3 env variabl
Makes it clear! Thank you Jonathan.
On Tue, Apr 19, 2011 at 7:02 PM, Jonathan Ellis wrote:
> It doesn't make a lot of sense in general to allow those w/ non-NTS,
> but it should be possible (e.g. if you've manually interleaved nodes
> with ONTS so you know how many replicas are in each DC).
>
> P
and one more thing...
2011-04-20 04:09:23,412 INFO org.apache.hadoop.mapred.TaskTracker:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_201104200406_0001/attempt_201104200406_0001_m_02_0/output/file.out
in any of the configured local directories
It doesn't make a lot of sense in general to allow those w/ non-NTS,
but it should be possible (e.g. if you've manually interleaved nodes
with ONTS so you know how many replicas are in each DC).
Patch attached to https://issues.apache.org/jira/browse/CASSANDRA-2516
On Tue, Apr 19, 2011 at 8:39 PM
That's from the jobtracker:
2011-04-20 03:36:39,519 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing rack-local task task_201104200331_0002_m_00
2011-04-20 03:36:42,521 INFO org.apache.hadoop.mapred.TaskInProgress: Error
from attempt_201104200331_0002_m_00_3: java.lang.NumberFormatExcepti
If you have RF=3 in both datacenters, it is debatable whether there is any
point in using the built-in replication in Cassandra at all vs. feeding the
data to both datacenters and getting two 100% isolated Cassandra instances that
cannot replicate sstable corruption between each other.
My point is rea
Ah, OK. Thank you Aaron, I'll try that.
On Tue, Apr 19, 2011 at 6:39 PM, aaron morton wrote:
> You need to be using NTS.
>
> When NetworkTopologyStrategy is used it overrides the
> AbstractReplicationStrategy.getWriteResponseHandler() function in your stack
> and returns either a DataCentreWri
Ad 2: it works with -x local, so there can't be an issue with
pig->DB (Cassandra).
I'm using pig-0.8 from the official site + hadoop-0.20.2 from the official site.
thx
2011/4/20 aaron morton
> Am guessing but here goes. Looks like the cassandra RPC port is not set,
> did you follow these steps in contrib/pi
Hey Aaron,
I read it, and all 3 env variables were exported. The results are the same.
Best,
P
2011/4/20 aaron morton
> Am guessing but here goes. Looks like the cassandra RPC port is not set,
> did you follow these steps in contrib/pig/README.txt
>
> Finally, set the following as environment va
You need to be using NTS.
When NetworkTopologyStrategy is used it overrides the
AbstractReplicationStrategy.getWriteResponseHandler() function in your stack
and returns either a DataCentreWriteResponseHandler for LOCAL_QUORUM or a
DatacenterSyncWriteResponseHandler for EACH_QUORUM. They are DC
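If NTS is not configured yet, a rough cassandra-cli sketch of creating a
keyspace with it (the keyspace name, data centre names and replica counts are
placeholders, and the exact CLI syntax varies between versions):

    create keyspace MyKeyspace
        with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
        and strategy_options = [{DC1:3, DC2:3}];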
Am guessing but here goes. Looks like the Cassandra RPC port is not set; did
you follow these steps in contrib/pig/README.txt?
Finally, set the following as environment variables (uppercase,
underscored), or as Hadoop configuration variables (lowercase, dotted):
* PIG_RPC_PORT or cassandra.thrift.
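For anyone following along, a sketch of those settings as environment variables.
The values are this thread's examples, and the variable names are the ones the
contrib/pig README uses; double-check them against your copy of the README:

    export PIG_INITIAL_ADDRESS=10.12.34.56                                # or cassandra.thrift.address
    export PIG_RPC_PORT=9160                                              # or cassandra.thrift.port
    export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner     # or cassandra.partitioner.class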
Good point, should have read your message (and the code) more closely!
Sent from my iPhone
On Apr 19, 2011, at 9:16 PM, Oleg Tsvinev wrote:
> I'm puzzled because code does not even check for LOCAL_QUORUM before
> throwing exception.
> Indeed I did not configure NetworkTopologyStrategy. Are you
That's what I was looking for, thanks.
At first glance the behaviour looks inconsistent: we count the number of
columns in the delete mutation, but when deleting a row the column count is
zero. I'll try to take a look later.
In the meantime you can force a memtable flush via JConsole, navigate down
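If JConsole is not handy, the same flush can also be forced with nodetool; a
rough sketch, where the host, keyspace and column family names are placeholders:

    # flush the memtables of one column family on one node
    nodetool -h <host> flush MyKeyspace MyColumnFamily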
I'm puzzled because the code does not even check for LOCAL_QUORUM before
throwing the exception.
Indeed, I did not configure NetworkTopologyStrategy. Are you saying
that it works after configuring it?
On Tue, Apr 19, 2011 at 6:04 PM, William Oberman
wrote:
> I had a similar error today when I tried using
I had a similar error today when I tried using LOCAL_QUORUM without having a
properly configured NetworkTopologyStrategy. QUORUM worked fine however.
will
On Tue, Apr 19, 2011 at 8:52 PM, Oleg Tsvinev wrote:
> Earlier I've posted the same message to a hector-users list.
>
> Guys,
>
> I'm a bit
Earlier I posted the same message to the hector-users list.
Guys,
I'm a bit puzzled today. I'm using the just-released Hector 0.7.0-29
(thank you, Nate!) and Cassandra 0.7.4 and getting the exception
below, marked as (1) Exception. When I dig into the Cassandra source code
below, marked as (2) Cassandra s
Hello,
yeah, the bug was in my code because I use CL.ONE (so sometimes I
got incomplete data).
Thanks.
2011/4/14 aaron morton
> This is going to be a bug in your code, so it's a bit tricky to know but...
>
> How / when is the email added to the DB?
> What does the rawEmail function do ?
> Set a
Hello,
I did the cluster configuration following
http://wiki.apache.org/cassandra/HadoopSupport. When I run
pig example-script.pig -x local, everything is fine and I get correct results.
The problem occurs with -x mapreduce.
I'm getting these errors:
2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.
On Wed, 20-04-2011 at 09:08 +1200, aaron morton wrote:
> Yes, I saw that.
>
> Wanted to know what "issue deletes through pelops" means so I can work out
> what command it's sending to cassandra and hopefully I don't waste my time
> looking in the wrong place.
>
> Aaron
>
Oh, sorry. Di
Yes, I saw that.
Wanted to know what "issue deletes through pelops" means so I can work out what
command it's sending to cassandra and hopefully I don't waste my time looking
in the wrong place.
Aaron
On 20 Apr 2011, at 09:04, Héctor Izquierdo Seliva wrote:
> I posted it a couple of messages
Can you show what comes back from calling Column.getName()
Aaron
On 20 Apr 2011, at 09:00, aaron morton wrote:
> Can you provide a little more info on what I'm seeing here. When name is
> shown for the column, are you showing me the entire byte buffer for the name
> or just up to limit ?
>
>
I posted it a couple of messages back, but here it is again:
I'm using 0.7.4. I have a file with all the row keys I have to delete
(around 100 million) and I just go through the file and issue deletes
through pelops. Should I manually issue flushes with a cron every x
time?
How do you do the deletes ?
Aaron
On 20 Apr 2011, at 08:39, Héctor Izquierdo Seliva wrote:
> On Tue, 19-04-2011 at 23:33 +0300, shimi wrote:
>> You can use memtable_flush_after_mins instead of the cron
>>
>>
>> Shimi
>>
>
> Good point! I'll try that.
>
> Wouldn't it be better to count
Can you provide a little more info on what I'm seeing here? When the name is
shown for the column, are you showing me the entire byte buffer for the name or
just up to the limit?
Aaron
On 20 Apr 2011, at 05:49, Abraham Sanderson wrote:
> Ok, set up a unit test for the supercolumns which seem to have
On Tue, 19-04-2011 at 23:33 +0300, shimi wrote:
> You can use memtable_flush_after_mins instead of the cron
>
>
> Shimi
>
Good point! I'll try that.
Wouldn't it be better to count a delete as a one-column operation so that it
contributes to flushing by operations?
> 2011/4/19 Héctor Izquierdo S
You can use memtable_flush_after_mins instead of the cron
Shimi
2011/4/19 Héctor Izquierdo Seliva
>
> On Wed, 20-04-2011 at 08:16 +1200, aaron morton wrote:
> > I think there may be an issue here: we are counting the number of columns
> > in the operation. When deleting an entire row we do
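A rough cassandra-cli sketch of setting the attribute shimi suggests, per
column family (the column family name and the 60 minutes are placeholders, and
the attribute keyword here simply mirrors the name used in this thread; check
the CLI help in your version for the exact spelling):

    update column family MyColumnFamily with memtable_flush_after_mins = 60;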
On Wed, 20-04-2011 at 08:16 +1200, aaron morton wrote:
> I think there may be an issue here: we are counting the number of columns in
> the operation. When deleting an entire row we do not have a column count.
>
> Can you let us know what version you are using and how you are doing the
>
On Wed, 20-04-2011 at 07:59 +1200, aaron morton wrote:
> The dynamic snitch only reduces the chance that a node is used in a read
> operation; it depends on the RF, the CL for the operation, the partitioner
> and possibly the network topology. Dropping read messages is ok, so long as
> your o
I think there may be an issue here: we are counting the number of columns in
the operation. When deleting an entire row we do not have a column count.
Can you let us know what version you are using and how you are doing the
delete?
Thanks
Aaron
On 20 Apr 2011, at 04:21, Héctor Izquierdo Sel
The dynamic snitch only reduces the chance that a node is used in a read
operation; it depends on the RF, the CL for the operation, the partitioner and
possibly the network topology. Dropping read messages is ok, so long as your
operation completes at the requested CL.
Are you using either a key
OK, I set up a unit test for the supercolumns which seem to have problems; I
posted a few examples below. As I mentioned, the retrieved bytes for the
name and value appear to have additional data; in previous tests the
buffer's position, mark, and limit have been verified, and when I call
column.get
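For what it's worth, the usual cause of "additional data" in thrift-returned
names and values is reading the ByteBuffer's whole backing array instead of
only the bytes between position and limit. A generic Java sketch of the safe
extraction (this is not Abraham's test code; the buffer below is just a
stand-in for what Column.getName() might return):

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class ByteBufferExtract {
        // Copies exactly the remaining bytes (position..limit) without
        // disturbing the caller's buffer.
        static byte[] toBytes(ByteBuffer buf) {
            ByteBuffer dup = buf.duplicate();        // leave the original position alone
            byte[] out = new byte[dup.remaining()];  // limit - position, not array().length
            dup.get(out);
            return out;
        }

        public static void main(String[] args) {
            // Hypothetical buffer with spare capacity beyond the logical value.
            ByteBuffer name = ByteBuffer.allocate(32);
            name.put("email".getBytes(StandardCharsets.UTF_8));
            name.flip();                             // position = 0, limit = 5
            // Prints "email"; name.array() would instead expose all 32 backing
            // bytes, which looks exactly like extra data after the value.
            System.out.println(new String(toBytes(name), StandardCharsets.UTF_8));
        }
    }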
What would be the procedure in this case? Run drain on the node that is
disagreeing? But is it enough to run just drain, or do you suggest drain + rm
system files?
Ok, I've read about gc grace seconds, but I'm not sure I understand it
fully. Until gc grace seconds have passed and there is a compaction,
the tombstones live in memory? I have to delete 100 million rows and my
insert rate is very low, so I don't have a lot of compactions. What
should I do in th
Hi everyone. I've configured memtable_operations = 0.02 in one of my column
families and started deleting keys. I have already
deleted 54k, but there hasn't been any flush of the memtable. Memory
keeps piling up and eventually nodes start to do stop-the-world GCs. Is
this the way this is supposed
Yeah, it happens from time to time that schema changes don't work correctly
even if everything seems to be fine. But it's always repairable with the
described procedure, so having the operator available is a must-have, I
think.
Drain is a nodetool command. The node flushes data and stops ac
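Roughly, the procedure being described, as a sketch (the host is a placeholder
and the paths assume a default data directory; treat this as an outline, not a
canned fix):

    nodetool -h <node-with-the-divergent-schema> drain   # flush and stop accepting writes
    # stop the Cassandra process on that node, then remove the system sstables
    # that hold the schema definitions:
    rm /var/lib/cassandra/data/system/Schema*
    rm /var/lib/cassandra/data/system/Migrations*
    # restart the node; it pulls the schema back from the rest of the cluster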
On Tue, 19 Apr 2011 00:21:44 +0100 Courtney Robinson wrote:
CR> Cool... Okay, the plan is to eventually not use thrift underneath,
CR> for the CQL stuff right? Once this is done and the new transport is
CR> in place, or even while designing the new transport, is this not
CR> something that's
Shouldn't the dynamic snitch take into account response times and send a
slow node fewer requests? It seems that at node startup, only a
handful of requests arrive at the node and it keeps up well, but
there's a moment where there's more than it can handle with a cold cache
and it starts dropping mess
If you want to use local quorum for a distributed setup, it doesn't
make sense to have less than RF=3 local and remote. Three copies at
both ends will give you high availability. Only one copy of the data
is sent over the wide area link (with recent versions).
There is no need to use mirrored or R