Just as an example:
cassandra.thrift.address = 10.12.34.56
cassandra.thrift.port = 9160
cassandra.partitioner.class = org.apache.cassandra.dht.RandomPartitioner
On Apr 19, 2011, at 10:28 PM, Jeremy Hanna wrote:
> oh yeah - that's what's going on. what I do i
Oh yeah - that's what's going on. What I do is this: on the machine that I run the
pig script from, I set the PIG_CONF variable to my HADOOP_HOME/conf directory,
and in the mapred-site.xml file found there I set the three variables.
I don't use environment variables when I run against a cluster.
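For reference, a minimal sketch of what those three variables look like in
mapred-site.xml, using the example values quoted earlier in this thread (adjust
the address, port and partitioner to your cluster):

    <property>
      <name>cassandra.thrift.address</name>
      <value>10.12.34.56</value>
    </property>
    <property>
      <name>cassandra.thrift.port</name>
      <value>9160</value>
    </property>
    <property>
      <name>cassandra.partitioner.class</name>
      <value>org.apache.cassandra.dht.RandomPartitioner</value>
    </property>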
On A
Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a
while before I added that.
-Jeffrey
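For example, a one-line sketch of that setting in $HADOOP_HOME/conf/hadoop-env.sh
(9160 is the Thrift port used as the example elsewhere in this thread):

    export PIG_RPC_PORT=9160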
From: pob [mailto:peterob...@gmail.com]
Sent: Tuesday, April 19, 2011 6:42 PM
To: user@cassandra.apache.org
Subject: Re: pig + hadoop
Hey Aaron,
I read it, and all of 3 env variabl
Makes it clear! Thank you Jonathan.
On Tue, Apr 19, 2011 at 7:02 PM, Jonathan Ellis wrote:
> It doesn't make a lot of sense in general to allow those w/ non-NTS,
> but it should be possible (e.g. if you've manually interleaved nodes
> with ONTS so you know how many replicas are in each DC).
>
> P
and one more thing...
2011-04-20 04:09:23,412 INFO org.apache.hadoop.mapred.TaskTracker:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_201104200406_0001/attempt_201104200406_0001_m_02_0/output/file.out
in any of the configured local directories
It doesn't make a lot of sense in general to allow those w/ non-NTS,
but it should be possible (e.g. if you've manually interleaved nodes
with ONTS so you know how many replicas are in each DC).
Patch attached to https://issues.apache.org/jira/browse/CASSANDRA-2516
On Tue, Apr 19, 2011 at 8:39 PM
That's from the jobtracker:
2011-04-20 03:36:39,519 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing rack-local task task_201104200331_0002_m_00
2011-04-20 03:36:42,521 INFO org.apache.hadoop.mapred.TaskInProgress: Error
from attempt_201104200331_0002_m_00_3: java.lang.NumberFormatExcepti
If you have RF=3 in both datacenters, it is debatable whether there is any
point in using the built-in replication in Cassandra at all vs. feeding the
data to both datacenters and getting two 100% isolated Cassandra instances that
cannot replicate sstable corruption between each other.
My point is rea
Ah, OK. Thank you Aaron, I'll try that.
On Tue, Apr 19, 2011 at 6:39 PM, aaron morton wrote:
> You need to be using NTS.
>
> When NetworkTopologyStrategy is used it overrides the
> AbstractReplicationStrategy.getWriteResponseHandler() function in your stack
> and returns either a DataCentreWri
Ad 2: it works with -x local, so there can't be an issue with
pig->DB (Cassandra).
I'm using pig-0.8 from the official site + hadoop-0.20.2 from the official site.
thx
2011/4/20 aaron morton
> Am guessing but here goes. Looks like the cassandra RPC port is not set,
> did you follow these steps in contrib/pi
Hey Aaron,
I read it, and all 3 env variables were exported. The results are the same.
Best,
P
2011/4/20 aaron morton
> Am guessing but here goes. Looks like the cassandra RPC port is not set,
> did you follow these steps in contrib/pig/README.txt
>
> Finally, set the following as environment va
You need to be using NTS.
When NetworkTopologyStrategy is used it overrides the
AbstractReplicationStrategy.getWriteResponseHandler() function in your stack
and returns either a DataCentreWriteResponseHandler for LOCAL_QUORUM or a
DatacenterSyncWriteResponseHandler for EACH_QUORUM. They are DC
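If NTS is not configured yet, a rough cassandra-cli sketch of creating a
keyspace with it (the keyspace name, data centre names and replica counts are
placeholders, and the exact CLI syntax varies between versions):

    create keyspace MyKeyspace
        with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
        and strategy_options = [{DC1:3, DC2:3}];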
Am guessing but here goes. Looks like the Cassandra RPC port is not set; did
you follow these steps in contrib/pig/README.txt?
Finally, set the following as environment variables (uppercase,
underscored), or as Hadoop configuration variables (lowercase, dotted):
* PIG_RPC_PORT or cassandra.thrift.
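For anyone following along, a sketch of those settings as environment variables.
The values are this thread's examples, and the variable names are the ones the
contrib/pig README uses; double-check them against your copy of the README:

    export PIG_INITIAL_ADDRESS=10.12.34.56                                # or cassandra.thrift.address
    export PIG_RPC_PORT=9160                                              # or cassandra.thrift.port
    export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner     # or cassandra.partitioner.class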
Good point, should have read your message (and the code) more closely!
Sent from my iPhone
On Apr 19, 2011, at 9:16 PM, Oleg Tsvinev wrote:
> I'm puzzled because code does not even check for LOCAL_QUORUM before
> throwing exception.
> Indeed I did not configure NetworkTopologyStrategy. Are you
That's what I was looking for, thanks.
At first glance the behaviour looks inconsistent: we count the number of
columns in the delete mutation, but when deleting a row the column count is
zero. I'll try to take a look later.
In the meantime you can force a memtable flush via JConsole, navigate down
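If JConsole is not handy, the same flush can also be forced with nodetool; a
rough sketch, where the host, keyspace and column family names are placeholders:

    # flush the memtables of one column family on one node
    nodetool -h <host> flush MyKeyspace MyColumnFamily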
I'm puzzled because the code does not even check for LOCAL_QUORUM before
throwing the exception.
Indeed, I did not configure NetworkTopologyStrategy. Are you saying
that it works after configuring it?
On Tue, Apr 19, 2011 at 6:04 PM, William Oberman
wrote:
> I had a similar error today when I tried using
I had a similar error today when I tried using LOCAL_QUORUM without having a
properly configured NetworkTopologyStrategy. QUORUM worked fine however.
will
On Tue, Apr 19, 2011 at 8:52 PM, Oleg Tsvinev wrote:
> Earlier I've posted the same message to a hector-users list.
>
> Guys,
>
> I'm a bit
Earlier I posted the same message to the hector-users list.
Guys,
I'm a bit puzzled today. I'm using the just-released Hector 0.7.0-29
(thank you, Nate!) and Cassandra 0.7.4 and getting the exception
below, marked as (1) Exception. When I dig into the Cassandra source code
below, marked as (2) Cassandra s
Hello,
yeah, the bug was in my code because I use CL.ONE (so sometimes I
got incomplete data).
Thanks.
2011/4/14 aaron morton
> This is going to be a bug in your code, so it's a bit tricky to know but...
>
> How / when is the email added to the DB?
> What does the rawEmail function do ?
> Set a
Hello,
I did the cluster configuration following
http://wiki.apache.org/cassandra/HadoopSupport. When I run
pig example-script.pig -x local, everything is fine and I get correct results.
The problem occurs with -x mapreduce.
I'm getting these errors:
2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.
On Wed, 20-04-2011 at 09:08 +1200, aaron morton wrote:
> Yes, I saw that.
>
> Wanted to know what "issue deletes through pelops" means so I can work out
> what command it's sending to cassandra and hopefully I don't waste my time
> looking in the wrong place.
>
> Aaron
>
Oh, sorry. Di
Yes, I saw that.
Wanted to know what "issue deletes through pelops" means so I can work out what
command it's sending to cassandra and hopefully I don't waste my time looking
in the wrong place.
Aaron
On 20 Apr 2011, at 09:04, Héctor Izquierdo Seliva wrote:
> I posted it a couple of messages
Can you show what comes back from calling Column.getName()
Aaron
On 20 Apr 2011, at 09:00, aaron morton wrote:
> Can you provide a little more info on what I'm seeing here. When name is
> shown for the column, are you showing me the entire byte buffer for the name
> or just up to limit ?
>
>
I posted it a couple of messages back, but here it is again:
I'm using 0.7.4. I have a file with all the row keys I have to delete
(around 100 million) and I just go through the file and issue deletes
through pelops. Should I manually issue flushes with a cron every x
time?
How do you do the deletes ?
Aaron
On 20 Apr 2011, at 08:39, Héctor Izquierdo Seliva wrote:
> On Tue, 19-04-2011 at 23:33 +0300, shimi wrote:
>> You can use memtable_flush_after_mins instead of the cron
>>
>>
>> Shimi
>>
>
> Good point! I'll try that.
>
> Wouldn't it be better to count
Can you provide a little more info on what I'm seeing here? When the name is
shown for the column, are you showing me the entire byte buffer for the name or
just up to the limit?
Aaron
On 20 Apr 2011, at 05:49, Abraham Sanderson wrote:
> Ok, set up a unit test for the supercolumns which seem to have
On Tue, 19-04-2011 at 23:33 +0300, shimi wrote:
> You can use memtable_flush_after_mins instead of the cron
>
>
> Shimi
>
Good point! I'll try that.
Wouldn't it be better to count a delete as a one-column operation so that it
contributes to flushing by operations?
> 2011/4/19 Héctor Izquierdo S
You can use memtable_flush_after_mins instead of the cron
Shimi
2011/4/19 Héctor Izquierdo Seliva
>
> On Wed, 20-04-2011 at 08:16 +1200, aaron morton wrote:
> > I think there may be an issue here: we are counting the number of columns
> > in the operation. When deleting an entire row we do
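A rough cassandra-cli sketch of setting the attribute shimi suggests, per
column family (the column family name and the 60 minutes are placeholders, and
the attribute keyword here simply mirrors the name used in this thread; check
the CLI help in your version for the exact spelling):

    update column family MyColumnFamily with memtable_flush_after_mins = 60;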
On Wed, 20-04-2011 at 08:16 +1200, aaron morton wrote:
> I think there may be an issue here: we are counting the number of columns in
> the operation. When deleting an entire row we do not have a column count.
>
> Can you let us know what version you are using and how you are doing the
>
On Wed, 20-04-2011 at 07:59 +1200, aaron morton wrote:
> The dynamic snitch only reduces the chance that a node is used in a read
> operation; it depends on the RF, the CL for the operation, the partitioner
> and possibly the network topology. Dropping read messages is ok, so long as
> your o
I think there may be an issue here: we are counting the number of columns in
the operation. When deleting an entire row we do not have a column count.
Can you let us know what version you are using and how you are doing the
delete?
Thanks
Aaron
On 20 Apr 2011, at 04:21, Héctor Izquierdo Sel
The dynamic snitch only reduces the chance that a node is used in a read
operation; it depends on the RF, the CL for the operation, the partitioner and
possibly the network topology. Dropping read messages is ok, so long as your
operation completes at the requested CL.
Are you using either a key
OK, I set up a unit test for the supercolumns which seem to have problems; I
posted a few examples below. As I mentioned, the retrieved bytes for the
name and value appear to have additional data; in previous tests the
buffer's position, mark, and limit have been verified, and when I call
column.get
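For what it's worth, the usual cause of "additional data" in thrift-returned
names and values is reading the ByteBuffer's whole backing array instead of
only the bytes between position and limit. A generic Java sketch of the safe
extraction (this is not Abraham's test code; the buffer below is just a
stand-in for what Column.getName() might return):

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class ByteBufferExtract {
        // Copies exactly the remaining bytes (position..limit) without
        // disturbing the caller's buffer.
        static byte[] toBytes(ByteBuffer buf) {
            ByteBuffer dup = buf.duplicate();        // leave the original position alone
            byte[] out = new byte[dup.remaining()];  // limit - position, not array().length
            dup.get(out);
            return out;
        }

        public static void main(String[] args) {
            // Hypothetical buffer with spare capacity beyond the logical value.
            ByteBuffer name = ByteBuffer.allocate(32);
            name.put("email".getBytes(StandardCharsets.UTF_8));
            name.flip();                             // position = 0, limit = 5
            // Prints "email"; name.array() would instead expose all 32 backing
            // bytes, which looks exactly like extra data after the value.
            System.out.println(new String(toBytes(name), StandardCharsets.UTF_8));
        }
    }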
What would be the procedure in this case? Run drain on the node that is
disagreeing? But is it enough to run just drain, or do you suggest drain + rm
system files?
Ok, I've read about gc grace seconds, but I'm not sure I understand it
fully. Until gc grace seconds have passed and there is a compaction,
the tombstones live in memory? I have to delete 100 million rows and my
insert rate is very low, so I don't have a lot of compactions. What
should I do in th
Hi everyone. I've configured memtable_operations = 0.02 in one of my column
families and started deleting keys. I have already
deleted 54k, but there hasn't been any flush of the memtable. Memory
keeps piling up and eventually nodes start to do stop-the-world GCs. Is
this the way this is supposed
Yeah, it happens from time to time that schema changes don't work correctly
even if everything seems to be fine. But it's always repairable with the
described procedure, so having the operator available is a must-have, I
think.
Drain is a nodetool command. The node flushes data and stops ac
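Roughly, the procedure being described, as a sketch (the host is a placeholder
and the paths assume a default data directory; treat this as an outline, not a
canned fix):

    nodetool -h <node-with-the-divergent-schema> drain   # flush and stop accepting writes
    # stop the Cassandra process on that node, then remove the system sstables
    # that hold the schema definitions:
    rm /var/lib/cassandra/data/system/Schema*
    rm /var/lib/cassandra/data/system/Migrations*
    # restart the node; it pulls the schema back from the rest of the cluster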
On Tue, 19 Apr 2011 00:21:44 +0100 Courtney Robinson wrote:
CR> Cool... Okay, the plan is to eventually not use thrift underneath,
CR> for the CQL stuff right? Once this is done and the new transport is
CR> in place, or even while designing the new transport, is this not
CR> something that's
Shouldn't the dynamic snitch take into account response times and send a
slow node fewer requests? It seems that at node startup, only a
handful of requests arrive at the node and it keeps up well, but
there's a moment where there's more than it can handle with a cold cache
and it starts dropping mess
If you want to use local quorum for a distributed setup, it doesn't
make sense to have less than RF=3 local and remote. Three copies at
both ends will give you high availability. Only one copy of the data
is sent over the wide area link (with recent versions).
There is no need to use mirrored or R