Re: Basic question on a write operation immediately followed by a read

2011-01-25 Thread Wangpei (Peter)
For your 1-node cluster, ANY is the only consistency level at which the client may
return BEFORE the node writes to the memtable.
And a read op on the node reads both the memtable and the SSTables.

It really puzzles me. :(

From: Roshan Dawrani [mailto:roshandawr...@gmail.com]
Sent: 25 January 2011 15:47
To: user@cassandra.apache.org; hector-us...@googlegroups.com
Subject: Re: Basic question on a write operation immediately followed by a read

2011/1/25 Wangpei (Peter) <peter.wang...@huawei.com>
What is the ConsistencyLevel value? Is it ConsistencyLevel.ANY?

I am using Hector 0.7.0-22 and getting the keyspace via HFactory.createKeyspace(),
which seems to default the consistency level to QUORUM for both reads and
writes.

Nowhere else, it is explicitly specified in my configuration.
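
(For readers of this thread, a minimal sketch of pinning the consistency level
explicitly in Hector instead of relying on the default policy. This assumes the
Hector 0.7 ConfigurableConsistencyLevel API, and the cluster address and
keyspace name below are placeholders:)

[code]
import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;

// Force QUORUM for both reads and writes rather than trusting the default.
ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
policy.setDefaultReadConsistencyLevel(HConsistencyLevel.QUORUM);
policy.setDefaultWriteConsistencyLevel(HConsistencyLevel.QUORUM);

Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");
Keyspace keyspace = HFactory.createKeyspace("MyKeyspace", cluster, policy);
[/code]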


Re: Basic question on a write operation immediately followed by a read

2011-01-25 Thread Roshan Dawrani
2011/1/25 Wangpei (Peter) 

>  For your 1-node cluster, ANY is the only consistency level at which the client
> may return BEFORE the node writes to the memtable.
>
> And a read op on the node reads both the memtable and the SSTables.
>
>
>
> It really puzzles me. :(
>

Please don't be puzzled just yet. :-)

As I said from the beginning, I wasn't confirming yet that reads were "in
fact" missing the writes. I have just observed that kind of behavior at my
app level, and I wanted to understand the possibility of it happening from
the Cassandra side.

If reads are sure to read what was written (with QUORUM level, let's say),
then I can look at other causes inside the app.


[mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Patrik Modesto
Hi,

I am playing with Cassandra 0.7.0 and Hadoop, developing simple MapReduce
tasks. While developing a really simple MR task, I've found that a
combination of Hadoop's object-reuse optimization and the Cassandra
ColumnFamilyRecordWriter queue creates wrong keys to send to
batch_mutate(). The problem is in the reduce part: the storage behind
the key parameter is reused. For example, when storing URLs I'll get:

http://119.cz/index.php/vypalovaci-mechaniky-a-vypalovani-disk/120-jak-zjistit-verzi-firmwaru-vypalovaky-ve-windows-vista
 (1)
http://11superstars.xf.cz/index.php?page=12y-a-vypalovani-disk/120-jak-zjistit-verzi-firmwaru-vypalovaky-ve-windows-vista
 (2)
http://12kmenu.unas.cz/18-6-2011-(Isachar).htmlvypalovani-disk/120-jak-zjistit-verzi-firmwaru-vypalovaky-ve-windows-vista
 (3)

You can see, that part of the URL (1) is repeating in the URL (2) and URL (3).

I've changed my reduce method to clone the key before calling
context.write(), but I think it should be cloned inside the Cassandra
ColumnFamilyRecordWriter, because as a user I don't care how it is
implemented inside; I just write values there. With FileOutputFormat, for
example, I don't need to clone the key when writing to it.

I'd like to know what's your opinion.

Best regards,
Patrik
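
(For reference, a minimal sketch of the workaround described above: cloning the
reused Text key into a fresh ByteBuffer before handing it to the record writer.
getMutation() is assumed to build whatever Mutation the job needs:)

[code]
// Inside reduce(): copy the bytes out of Hadoop's reused Text key before
// queuing the mutation, so ColumnFamilyRecordWriter's internal batch does
// not see the backing array change underneath it.
ByteBuffer rowKey = ByteBufferUtil.clone(
        ByteBuffer.wrap(key.getBytes(), 0, key.getLength()));
context.write(rowKey, Collections.singletonList(getMutation()));
[/code]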


Re: Schema Question

2011-01-25 Thread Andy Burgess

Aaron,

A question about one of your general points, "do not create CF's on
the fly" - what, exactly, does this mean? Do you mean named column
families, like "BlogEntries" from Sam's example, or do you mean
column family keys, like "i-got-a-new-guitar"? If it's the latter,
then could you please explain why not to do this? My application is
based around creating row keys on the fly, so I'd like to know ahead
of time if I'm creating potential trouble for myself.

To be honest, if you do mean specifically column families and not
column family keys, then I don't even understand how you would go
about creating those on-the-fly anyway. Don't they have to be
pre-configured in storage-conf.xml?

Thanks,
Andy.

On 25/01/11 00:39, Aaron Morton wrote:

Sam,
The best advice is to jump in and try any schema. If you are just starting
out, start simple; you're going to re-write it several times. Worry about
scale later, in most cases it's going to work.

Some general points:

- do not create CF's on the fly.
- work out your common read requests and denormalise to support these, the
  writes will be fast enough.
- try to get each read request to be resolved by reading from a single CF
  (not a rule, just a guideline)
- avoid big super columns.
- this may also be interesting:
  http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/

If you are happy with the one in the article, start with that and see how
it works with your app. See how it works for your read activities.

Hope that helps.
Aaron

On 25 Jan, 2011, at 12:47 PM, Sam Hodgson  wrote:

Hi all,


I'm brand new to Cassandra - I'm migrating from MySQL for a
large forum site and would be grateful if anyone could give me
some basic pointers on schema design, or any recommended
documentation.

The example used in
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
is very close if not exactly what I need for my main CF:




How well would this scale? Say you are storing 5 million posts and looking to scale that up:
would it be better to segment them into several column families, and if so, to what extent?

I could create column families to store posts for each category, however I'd end up with thousands of CF's.
That said, the data would then be stored in a very sorted manner for querying/presenting.

My db is very write-heavy and growing fast; Cassandra sounds like the best solution.
Any advice is greatly appreciated!! 

Thanks

Sam





  


-- 
Andy Burgess
Principal Development Engineer
Application Delivery
WorldPay Ltd.
270-289 Science Park, Milton Road
Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
Office: +44 (0)1223 706 779| Mobile: +44 (0)7909 534 940
andy.burg...@worldpay.com

  


Re: Upgrading from 0.6 to 0.7.0

2011-01-25 Thread Daniel Josefsson
Yes, it should be possible to try.

We have not yet quite decided which way to go; I think operations won't be
happy with upgrading both server and client at the same time.

Either we upgrade to 0.7.0 (currently does not look very likely), or we go
to 0.6.9 and patch with TTL. I'm not too sure what a possible future upgrade
would look like if we use the TTL patch, though.

/Daniel

2011/1/21 Aaron Morton 

> Yup, you can use diff ports and you can give them different cluster names
> and different seed lists.
>
> After you upgrade the second cluster partition the data should repair
> across, either via RR or the HHs that were stored while the first partition
> was down. The easiest thing would be to run nodetool repair, then a cleanup to
> remove any leftover data.
>
> AFAIK file formats are compatible. But drain the nodes before upgrading to
> clear the log.
>
> Can you test this on a non production system?
>
> Aaron
> (we really need to write some upgrade docs:))
>
> On 21/01/2011, at 10:42 PM, Dave Gardner  wrote:
>
> What about executing writes against both clusters during the changeover?
> Interested in this topic because we're currently thinking about the same
> thing - how to upgrade to 0.7 without any interruption.
>
> Dave
>
> On 21 January 2011 09:20, Daniel Josefsson < 
> jid...@gmail.com> wrote:
>
>> No, what I'm thinking of is having two clusters (0.6 and 0.7) running on
>> different ports so they can't find each other. Or isn't that configurable?
>>
>> Then, when I have the two clusters, I could upgrade all of the clients to
>> run against the new cluster, and finally upgrade the rest of the Cassandra
>> nodes.
>>
>> I don't know how the new cluster would cope with having new data in the
>> old cluster when they are upgraded though.
>>
>> /Daniel
>>
>> 2011/1/20 Aaron Morton < aa...@thelastpickle.com
>> >
>>
>> I'm not sure if your suggesting running a mixed mode cluster there, but
>>> AFAIK the changes to the internode protocol prohibit this. The nodes will
> probably see each other via gossip, but the way the messages define their
>>> purpose (their verb handler) has been changed.
>>>
>>> Out of interest which is more painful, stopping the cluster and upgrading
>>> it or upgrading your client code?
>>>
>>> Aaron
>>>
>>> On 21/01/2011, at 12:35 AM, Daniel Josefsson < 
>>> jid...@gmail.com> wrote:
>>>
>>> In our case our replication factor is more than half the number of nodes
>>> in the cluster.
>>>
>>> Would it be possible to do the following:
>>>
>>>- Upgrade half of them
>>>- Change Thrift Port and inter-server port (is this the
>>>storage_port?)
>>>- Start them up
>>>- Upgrade clients one by one
>>>- Upgrade the rest of the servers
>>>
>>> Or might we get some kind of data collision when still writing to the old
>>> cluster as the new storage is being used?
>>>
>>> /Daniel
>>>
>>>
>>
>


Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mick Semb Wever
On Tue, 2011-01-25 at 09:37 +0100, Patrik Modesto wrote:
> While developing really simple MR task, I've found that a
> combiantion of Hadoop optimalization and Cassandra
> ColumnFamilyRecordWriter queue creates wrong keys to send to
> batch_mutate(). 

I've seen similar behaviour (junk rows being written), although my keys
are always a result from
  LongSerializer.get().toByteBuffer(key)


I'm interested in looking into it - but can you provide a code example?

  From what I can see, TextOutputFormat.LineRecordWriter.write(..)
doesn't clone anything, but it does write it out immediately.
  While ColumnFamilyRecordWriter does batch the mutations up as you say,
it takes a ByteBuffer as a key, so why/how are you re-using this
client-side (aren't you creating a new ByteBuffer each call to
write(..))?

~mck

-- 
"Never let your sense of morals get in the way of doing what's right."
Isaac Asimov 
| http://semb.wever.org | http://sesat.no
| http://finn.no   | Java XSS Filter





Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Patrik Modesto
Hi Mick,

attached is the very simple MR job that deletes expired URLs from my
test Cassandra DB. The keyspace looks like this:

Keyspace: Test:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Replication Factor: 2
  Column Families:
ColumnFamily: Url2
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  Row cache size / save period: 0.0/0
  Key cache size / save period: 20.0/3600
  Memtable thresholds: 4.7015625/1003/60
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Built indexes: []

In the CF the key is URL and inside there are some data. My MR job
needs just "expire_date" which is int64 timestamp. For now I store it
as a string because I use Python and C++ to manipulate the data as
well.

For the MR Job to run you need a patch I did. You can find it here:
https://issues.apache.org/jira/browse/CASSANDRA-2014

The attached file contains the working version with the cloned key in the
reduce() method. My other approach was:
[code]
context.write(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()),
Collections.singletonList(getMutation(key)));
[/code]
This produces junk keys.

Best regards,
Patrik

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.util.*;

import org.apache.cassandra.avro.Mutation;
import org.apache.cassandra.avro.Deletion;
import org.apache.cassandra.avro.SliceRange;
import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;

import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ContextExpirator extends Configured implements Tool
{
static final String KEYSPACE = "Test";
static final String COLUMN_FAMILY = "Url2";
static final String OUTPUT_COLUMN_FAMILY = "Url2";
static final String COLUMN_VALUE = "expire_date";

public static void main(String[] args) throws Exception
{
// Let ToolRunner handle generic command-line options
ToolRunner.run(new Configuration(), new ContextExpirator(), args);
System.exit(0);
}

public static class UrlFilterMapper
extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, NullWritable>
{
private final static NullWritable nic = NullWritable.get();
private ByteBuffer sourceColumn;
private static long now;

protected void setup(Context context)
throws IOException, InterruptedException
{
sourceColumn = ByteBuffer.wrap(COLUMN_VALUE.getBytes());
now = System.currentTimeMillis() / 1000; // convert from ms
}

public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
throws IOException, InterruptedException
{
IColumn column = columns.get(sourceColumn);
if (column == null) {
return;
}

Text tKey = new Text(ByteBufferUtil.string(key));
Long value = Long.decode(ByteBufferUtil.string(column.value()));

if(now > value) {
context.write(tKey, nic);
}
}
}

public static class RemoveUrlReducer
extends Reducer<Text, NullWritable, ByteBuffer, List<Mutation>>
{
public void reduce(Text key, Iterable<NullWritable> values, Context context)
throws IOException, InterruptedException
{
ByteBuffer bbKey = ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()));
context.write(bbKey, Collections.singletonList(getMutation()));
}

private static Mutation getMutation()
{
Deletion d = new Deletion();
d.timestamp = System.currentTimeMillis();

Mutation m = new Mutation();
m.deletion = d;

return m;
}
}

public int run(String[] args) throws Exception
{
Job job = new Job(getConf(), "context_expitator");
job.setJarByClass(ContextExpirator.class);

job.setInputFormatClass(ColumnFamilyInputFormat.class);
ConfigHelper.setInputColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY);

job.setMapperClass(UrlFilterMapper.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(NullWritable.class);

job.setReducerClass(RemoveUrlReducer.class);
job.setOutputKeyClass(ByteBuffer.class);
job.setOutputValueClass(List.class);
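
(The attached source is truncated here in the archive. A plausible completion
of run(), assuming it follows the usual ColumnFamilyOutputFormat wiring; the
method names setOutputColumnFamily and setInputSlicePredicate are assumptions
based on the 0.7 Hadoop helpers plus the CASSANDRA-2014 patch, not quoted from
the original mail:)

[code]
// Assumed completion: wire up the Cassandra output format...
job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
ConfigHelper.setOutputColumnFamily(job.getConfiguration(), KEYSPACE, OUTPUT_COLUMN_FAMILY);

// ...and restrict the input to the single "expire_date" column.
SlicePredicate predicate = new SlicePredicate()
    .setColumn_names(Arrays.asList(ByteBuffer.wrap(COLUMN_VALUE.getBytes())));
ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);

job.waitForCompletion(true);
return 0;
}
}
[/code]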

Re: Schema Question

2011-01-25 Thread David McNelis
I'm fairly certain Aaron is referring to named families like BlogEntries,
not named columns (i-got-a-new-guitar).

On Tue, Jan 25, 2011 at 4:37 AM, Andy Burgess
wrote:

>  Aaron,
>
> A question about one of your general points, "do not create CF's on the
> fly" - what, exactly, does this mean? Do you mean named column families,
> like "BlogEntries" from Sam's example, or do you mean column family keys,
> like "i-got-a-new-guitar"? If it's the latter, then could you please explain
> why not to do this? My application is based around creating row keys on the
> fly, so I'd like to know ahead of time if I'm creating potential trouble for
> myself.
>
> To be honest, if you do mean specifically column families and not column
> family keys, then I don't even understand how you would go about creating
> those on-the-fly anyway. Don't they have to be pre-configured in
> storage-conf.xml?
>
> Thanks,
> Andy.
>
>
> On 25/01/11 00:39, Aaron Morton wrote:
>
> Sam,
> The best advice is to jump in and try any schema. If you are just starting
> out, start simple; you're going to re-write it several times. Worry about
> scale later, in most cases it's going to work.
>
>  Some general points:
>
>  - do not create CF's on the fly.
> - work out your common read requests and denormalise to support these, the
> writes will be fast enough.
> - try to get each read request to be resolved by reading from a single CF
> (not a rule, just a guideline)
> - avoid big super columns.
> - this may also be interesting
> http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
>
>   If you are happy with the one in the article, start with that and see how
> it works with your app. See how it works for your read activities.
>
>  Hope that helps.
> Aaron
>
>
> On 25 Jan, 2011,at 12:47 PM, Sam Hodgson 
> wrote:
>
>   Hi all,
>
> I'm brand new to Cassandra - I'm migrating from MySQL for a large forum site
> and would be grateful if anyone could give me some basic pointers on schema
> design, or any recommended documentation.
>
> The example used in
> http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model is very
> close if not exactly what I need for my main CF:
>
> 
> How well would this scale? Say you are storing 5 million posts and looking to
> scale that up:
> would it be better to segment them into several column families, and if so, to
> what extent?
>
> I could create column families to store posts for each category, however I'd
> end up with thousands of CF's.
> That said, the data would then be stored in a very sorted manner for
> querying/presenting.
>
> My db is very write-heavy and growing fast; Cassandra sounds like the best
> solution. Any advice is greatly appreciated!!
>
> Thanks
>
> Sam
>
>
>
> --
> Andy Burgess
> Principal Development Engineer
> Application Delivery
> WorldPay Ltd.
> 270-289 Science Park, Milton Road
> Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
> Office: +44 (0)1223 706 779 | Mobile: +44 (0)7909 534 940
> andy.burg...@worldpay.com
>
>

Re: client threads locked up - JIRA ISSUE 1594

2011-01-25 Thread Nate McCall
What version of the Thrift API are you using?

(In general, you should use an existing client library rather than
rolling your own - I recommend Hector:
https://github.com/rantav/hector).

On Tue, Jan 25, 2011 at 12:38 AM, Arijit Mukherjee  wrote:
> I'm using Cassandra 0.6.8. I'm not using Hector - it's just raw thrift APIs.
>
> Arijit
>
> On 21 January 2011 22:13, Nate McCall  wrote:
>> What versions of Cassandra and Hector? The versions mentioned on this
>> ticket are both several releases behind.
>>
>> On Fri, Jan 21, 2011 at 3:53 AM, Arijit Mukherjee  wrote:
>>> Hi All
>>>
>>> I'm facing the same issue as this one mentioned here -
>>> https://issues.apache.org/jira/browse/CASSANDRA-1594
>>>
>>> Is there any solution or work-around for this?
>>>
>>> Regards
>>> Arijit
>>>
>>>
>>> --
>>> "And when the night is cloudy,
>>> There is still a light that shines on me,
>>> Shine on until tomorrow, let it be."
>>>
>>
>
>
>
> --
> "And when the night is cloudy,
> There is still a light that shines on me,
> Shine on until tomorrow, let it be."
>


Re: Stress test inconsistencies

2011-01-25 Thread Tyler Hobbs
Try using something higher than -t 1, like -t 100.

- Tyler

On Mon, Jan 24, 2011 at 9:38 PM, Oleg Proudnikov wrote:

> Hi All,
>
> I am struggling to make sense of a simple stress test I ran against the
> latest
> Cassandra 0.7. My server performs very poorly compared to a desktop and
> even a
> notebook.
>
> Here is the command I execute - a single-threaded insert that runs on the
> same host as Cassandra does (I am using the new contrib/stress, but the old
> py_stress produces similar results):
>
> ./stress -t 1 -o INSERT -c 30 -n 1 -i 1
>
> On a SUSE Linux server with a 4-core Intel Xeon I get at most 30 inserts a
> second with 40ms latency. But on a Windows desktop I get an incredible 200-260
> inserts a second with 4ms latency!!! Even on the smallest MacBook Pro I get
> bursts of high throughput - 100+ inserts a second.
>
> Could you please help me figure out what is wrong with my server? I tried
> several servers actually with the same results. I would appreciate any help
> in
> tracing down the bottleneck. Configuration is the same in all tests with
> the
> server having the advantage of separate physical disks for commitlog and
> data.
>
> Could you also share with me what numbers you get or what is reasonable to
> expect from this test?
>
> Thank you very much,
> Oleg
>
>
> Here is the output for the Linux server, Windows desktop and MacBook Pro,
> one
> line per second:
>
> Linux server - Intel Xeon X3330 @ 2.66GHz, 4G RAM, 2G heap
>
> Created keyspaces. Sleeping 1s for propagation.
> total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
> 19,19,19,0.05947368421052632,1
> 46,27,27,0.04274074074074074,2
> 70,24,24,0.04733,3
> 95,25,25,0.04696,4
> 119,24,24,0.048208333,5
> 147,28,28,0.04189285714285714,7
> 177,30,30,0.03904,8
> 206,29,29,0.04006896551724138,9
> 235,29,29,0.03903448275862069,10
>
> Windows desktop: Core2 Duo CPU E6550 @ 2.33GHz, 2G RAM, 1G heap
>
> Keyspace already exists.
> total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
> 147,147,147,0.005292517006802721,1
> 351,204,204,0.0042009803921568625,2
> 527,176,176,0.006551136363636364,3
> 718,191,191,0.005617801047120419,4
> 980,262,262,0.00400763358778626,5
> 1206,226,226,0.004150442477876107,6
> 1416,210,210,0.005619047619047619,7
> 1678,262,262,0.0040038167938931295,8
>
> MacBook Pro: Core2 Duo CPU @ 2.26GHz, 2G RAM, 1G heap
>
> Created keyspaces. Sleeping 1s for propagation.
> total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
> 0,0,0,NaN,1
> 7,7,7,0.21185714285714285,2
> 47,40,40,0.026925,3
> 171,124,124,0.007967741935483871,4
> 258,87,87,0.01206896551724138,6
> 294,36,36,0.022444,7
> 303,9,9,0.14378,8
> 307,4,4,0.2455,9
> 313,6,6,0.128,10
> 508,195,195,0.007938461538461538,11
> 792,284,284,0.0035985915492957746,12
> 882,90,90,0.01219,13
>
>
>
>


Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mick Semb Wever
On Tue, 2011-01-25 at 14:16 +0100, Patrik Modesto wrote:
> The atttached file contains the working version with cloned key in
> reduce() method. My other aproache was:
> 
> > context.write(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()),
> > Collections.singletonList(getMutation(key)));
> 
> Which produce junk keys. 

In fact I have another problem (trying to write an empty byte[], or
something, as a key, which puts one whole row out of whack ((one row in
25 million...))).

But I'm debugging along the same code.

I don't quite understand how the byte[] in
ByteBuffer.wrap(key.getBytes(),...)
gets clobbered.
Well, your key is a mutable Text object, so I can see some possibility
depending on how Hadoop uses these objects.
Is there something to ByteBuffer.allocate(..) I'm missing...

btw.
 is "d.timestamp = System.currentTimeMillis();" ok?
 shouldn't this be microseconds so that each mutation has a different
timestamp? http://wiki.apache.org/cassandra/DataModel
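
(For what it's worth, a one-line sketch of the microsecond convention being
asked about; whether it matters depends on whether two mutations to the same
column can land in the same millisecond:)

[code]
// Microseconds since the epoch, the usual Cassandra client convention;
// millisecond resolution risks two mutations sharing a timestamp.
d.timestamp = System.currentTimeMillis() * 1000;
[/code]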


~mck


-- 
"As you go the way of life, you will see a great chasm. Jump. It is not
as wide as you think." Native American Initiation Rite 
| http://semb.wever.org | http://sesat.no
| http://finn.no   | Java XSS Filter

-- 
"Everything should be made as simple as possible, but not simpler."
Albert Einstein (William of Ockham) 
| http://semb.wever.org | http://sesat.no
| http://finn.no   | Java XSS Filter




Re: Forcing GC w/o jconsole

2011-01-25 Thread buddhasystem

Thanks! It doesn't seem to have any effect on GCing dropped CFs, though.

Maxim

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Forcing-GC-w-o-jconsole-tp5956747p5960100.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
Tyler Hobbs  riptano.com> writes:

> Try using something higher than -t 1, like -t 100.- Tyler
>


Thank you, Tyler!

When I run contrib/stress with a higher thread count, the server does scale to
200 inserts a second with latency of 200ms. At the same time Windows desktop
scales to 900 inserts a second and latency of 120ms. There is a huge difference
that I am trying to understand and eliminate.

In my real-life bulk load I have to stay with a single-threaded client for the
POC I am doing. The only option I have is to run several client processes... My
real-life load is heavier than what contrib/stress does. It takes several days
to bulk load 4 million batch mutations!!! It is really painful :-( Something is
just not right...

Oleg






Re: Stress test inconsistencies

2011-01-25 Thread buddhasystem

Oleg,

I'm a novice at this, but for what it's worth I can't imagine you can have a
_sustained_ 1kHz insertion rate on a single machine which also does some
reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't seem
to square with a typical seek time on a hard drive.

Maxim

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Stress-test-inconsistencies-tp5957467p5960182.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Stress test inconsistencies

2011-01-25 Thread Brandon Williams
On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov wrote:

> When I run contrib/stress with a higher thread count, the server does scale
> to
> 200 inserts a second with latency of 200ms. At the same time Windows
> desktop
> scales to 900 inserts a second and latency of 120ms. There is a huge
> difference
> that I am trying to understand and eliminate.
>

Those are really low numbers, are you still testing with 10k rows?  That's
not enough, try 1M to give both JVMs enough time to warm up.

-Brandon
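
(For reference, a run along those lines, reusing the flags from the original
command; a sketch only - check the stress script's usage output for the exact
options in your tree:)

[code]
./stress -t 100 -o INSERT -c 30 -n 1000000
[/code]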


Files not deleted after compaction and GCed

2011-01-25 Thread Ching-Cheng Chen
Using cassandra 0.7.0

The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
removes the -Data.db file, but leaves the xxx-Compacted, xxx-Filter.db,
xxx-Index.db and xxx-Statistics.db files intact.

And that's the behavior I saw. I ran a manual compact, then triggered a GC
from jconsole. The Data.db file got removed but not the others.

Is this the expected behavior?

Regards,

Chen


Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
Brandon Williams  gmail.com> writes:

> 
> On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov  cloudorange.com>
wrote:
> 
> When I run contrib/stress with a higher thread count, the server does scale to
> 200 inserts a second with latency of 200ms. At the same time Windows desktop
> scales to 900 inserts a second and latency of 120ms. There is a huge 
> difference
> that I am trying to understand and eliminate.
> 
> 
> Those are really low numbers, are you still testing with 10k rows?  That's not
enough, try 1M to give both JVMs enough time to warm up.
> 
> 
> -Brandon 
> 

I agree, Brandon, the numbers are very low! The warm-up does not seem to make
any difference though... Something is holding the server back, because CPU
usage is very low. I am trying to understand where this bottleneck is on the
Linux server. I do not think it is Cassandra's config, as I use the same
config on Windows and get much higher numbers, as I described.

Oleg




Re: Errors During Compaction

2011-01-25 Thread Aaron Morton
Dan, how did you go with this? More joy, less joy, or a continuation of the
current level of joy?

Aaron


On 24/01/2011, at 9:38 AM, Dan Hendry  wrote:

> I have run into a strange problem and was hoping for suggestions on how to 
> fix it (0.7.0). When compaction occurs on one node for what appears to be one 
> specific column family, the following error pops up in the Cassandra log. 
> Compaction apparently fails and temp files don’t get cleaned up. After a 
> while and what seems to be multiple failed compactions on the CF, the node 
> runs out of disk space and crashes. Not sure if it is a related problem or a 
> function of this being a heavily used column family but after failing to 
> compact, compaction restarts on the same CF exacerbating the issue.
> 
>  
> 
> Problems with this specific node started earlier this weekend when it crashed 
> with and OOM error. This is quite surprising since my memtable thresholds and 
> GC settings have been tuned to run with quite a bit of overhead during normal 
> operation (max heap usage usually <= 10 GB on a 12 GB heap, average usage of 
> 6-8 GB). I could not find anything abnormal in the logs which would prompt an 
> OOM.
> 
>  
> 
> I will look things over tomorrow and try to provide a bit more information on 
> the problem but as a solution, I was going to wipe out all SSTables for this 
> CF on this node and then run a repair. Far from ideal, is this a reasonable 
> solution?
> 
>  
> 
>  
> 
> ERROR [CompactionExecutor:1] 2011-01-23 14:10:29,855 
> AbstractCassandraDaemon.java (line 91) Fatal exception in thread 
> Thread[CompactionExecutor:1,1,RMI Runtime]
> 
> java.io.IOError: java.io.EOFException: attempted to skip -1983579368 bytes 
> but only skipped 0
> 
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:78)
> 
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)
> 
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)
> 
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)
> 
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
> 
> at 
> org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
> 
> at 
> org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
> 
> at 
> org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
> 
> at 
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
> 
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
> 
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
> 
> at 
> org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
> 
> at 
> org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
> 
> at 
> org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
> 
> at 
> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
> 
> at 
> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
> 
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> 
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> 
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 
> at java.lang.Thread.run(Thread.java:662)
> 
> Caused by: java.io.EOFException: attempted to skip -1983579368 bytes but only 
> skipped 0
> 
> at 
> org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:52)
> 
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
> 
> ... 20 more
> 
>  
> 
> Dan Hendry
> 
> (403) 660-2297
> 
>  


Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
buddhasystem  bnl.gov> writes:

> 
> 
> Oleg,
> 
> I'm a novice at this, but for what it's worth I can't imagine you can have a
> _sustained_ 1kHz insertion rate on a single machine which also does some
> reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't seem
> to square with a typical seek time on a hard drive.
> 
> Maxim
> 

Maxim,

As I understand during inserts Cassandra should not be constrained by random
seek time as it uses sequential writes. I do get high numbers on Windows but
there is something that is holding back my Linux server. I am trying to
understand what it is.

Oleg





Re: Files not deleted after compaction and GCed

2011-01-25 Thread Jonathan Ellis
No, that is not expected.  All the sstable components are removed in
the same method; did you check the log for exceptions?

On Tue, Jan 25, 2011 at 2:58 PM, Ching-Cheng Chen
 wrote:
> Using cassandra 0.7.0
> The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
> remove the -Data.db file, but leave the xxx-Compacted, xxx-Filter.db,
> xxx-Index.db and xxx-Statistics.db intact.
> And that's the behavior I saw.    I ran manual compact then trigger a GC
> from jconsole.   The Data.db file got removed but not the others.
> Is this the expected behavior?
> Regards,
> Chen



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Files not deleted after compaction and GCed

2011-01-25 Thread Ching-Cheng Chen
Nope, no exception at all.

But if the same class
(org.apache.cassandra.io.sstable.SSTableDeletingReference) is responsible
for deleting the other files, then that's not right.
I checked the source code for SSTableDeletingReference; it doesn't look like
it will delete the other file types.

Regards,

Chen

On Tue, Jan 25, 2011 at 4:05 PM, Jonathan Ellis  wrote:

> No, that is not expected.  All the sstable components are removed in
> the same method; did you check the log for exceptions?
>
> On Tue, Jan 25, 2011 at 2:58 PM, Ching-Cheng Chen
>  wrote:
> > Using cassandra 0.7.0
> > The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
> > remove the -Data.db file, but leave the xxx-Compacted, xxx-Filter.db,
> > xxx-Index.db and xxx-Statistics.db intact.
> > And that's the behavior I saw.I ran manual compact then trigger a GC
> > from jconsole.   The Data.db file got removed but not the others.
> > Is this the expected behavior?
> > Regards,
> > Chen
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: Does Major Compaction work on dropped CFs? Doesn't seem so.

2011-01-25 Thread Aaron Morton
You can run JConsole on your workstation and connect remotely to the nodes; it does not need to be run on the node itself. Connecting is discussed here http://wiki.apache.org/cassandra/MemtableThresholds and some help for connecting is here http://wiki.apache.org/cassandra/JmxGotchas

There is also a web front end for the JMX service http://wiki.apache.org/cassandra/Operations#Monitoring_with_MX4J

And a recent discussion on different ways to monitor a node http://www.mail-archive.com/user@cassandra.apache.org/msg08100.html If you dig through there, there is some talk about a JMX<>REST bridge.

Hope that helps.

Aaron

On 25 Jan, 2011, at 04:17 PM, buddhasystem  wrote:

Thanks Aaron. As I remarked earlier (and it seems it is not uncommon), none of
the nodes have X11 installed (I think I could arrange this, but it's a bit
of a hassle). So if I understand correctly, jconsole is an X11 app, and I'm
out of luck with that.

I would agree with you that having a proper nodetool command to zap the data
you know you don't need, would be quite ideal. The reason I'm so retentive
about it is that I plan to test scaling up to 250 million rows, and disk
space matters.
-- 
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Does-Major-Compaction-work-on-dropped-CFs-Doesn-t-seem-so-tp5946031p5957426.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
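
(A note on the remote-JConsole suggestion above: from any workstation that has
Java and a graphical display, you can point JConsole at the node's JMX
host:port directly. The address below is a placeholder, and 8080 is assumed to
be the 0.7 default JMX port - adjust to whatever your JVM options set:)

[code]
jconsole 192.168.1.10:8080
[/code]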


Re: Files not deleted after compaction and GCed

2011-01-25 Thread Jonathan Ellis
the other component types are deleted by this line:

SSTable.delete(desc, components);

On Tue, Jan 25, 2011 at 3:11 PM, Ching-Cheng Chen
 wrote:
> Nope, no exception at all.
> But if the same class
> (org.apache.cassandra.io.sstable.SSTableDeletingReference) is responsible
> for delete other files, then that's not right.
> I checked the source code for SSTableDeletingReference, doesn't looks like
> it will delete other files type.
> Regards,
> Chen
>
> On Tue, Jan 25, 2011 at 4:05 PM, Jonathan Ellis  wrote:
>>
>> No, that is not expected.  All the sstable components are removed in
>> the same method; did you check the log for exceptions?
>>
>> On Tue, Jan 25, 2011 at 2:58 PM, Ching-Cheng Chen
>>  wrote:
>> > Using cassandra 0.7.0
>> > The class org.apache.cassandra.io.sstable.SSTableDeletingReference only
>> > remove the -Data.db file, but leave the xxx-Compacted,
>> > xxx-Filter.db,
>> > xxx-Index.db and xxx-Statistics.db intact.
>> > And that's the behavior I saw.    I ran manual compact then trigger a GC
>> > from jconsole.   The Data.db file got removed but not the others.
>> > Is this the expected behavior?
>> > Regards,
>> > Chen
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Schema Question

2011-01-25 Thread Aaron Morton
Yeah, I was talking about creating a ColumnFamily definition via the API, not inserting data into an already defined column family. The recommended approach to creating your schema is via the built-in bin/cassandra-cli command line tool. It has loads of built-in help, and here is an example of how to create a keyspace: http://www.mail-archive.com/user@cassandra.apache.org/msg09146.html

Let me know how you get on.

Aaron

On 26 Jan, 2011, at 02:28 AM, David McNelis  wrote:

I'm fairly certain Aaron is referring to named families like BlogEntries, not named columns (i-got-a-new-guitar).

On Tue, Jan 25, 2011 at 4:37 AM, Andy Burgess  wrote:

Aaron,

A question about one of your general points, "do not create CF's on
the fly" - what, exactly, does this mean? Do you mean named column
families, like "BlogEntries" from Sam's example, or do you mean
column family keys, like "i-got-a-new-guitar"? If it's the latter,
then could you please explain why not to do this? My application is
based around creating row keys on the fly, so I'd like to know ahead
of time if I'm creating potential trouble for myself.

To be honest, if you do mean specifically column families and not
column family keys, then I don't even understand how you would go
about creating those on-the-fly anyway. Don't they have to be
pre-configured in storage-conf.xml?

Thanks,
Andy.

On 25/01/11 00:39, Aaron Morton wrote:

Sam,
The best advice is to jump in and try any schema. If you are just starting
out, start simple; you're going to re-write it several times. Worry about
scale later, in most cases it's going to work.

Some general points:

- do not create CF's on the fly.
- work out your common read requests and denormalise to support these, the
  writes will be fast enough.
- try to get each read request to be resolved by reading from a single CF
  (not a rule, just a guideline)
- avoid big super columns.
- this may also be interesting:
  http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/

If you are happy with the one in the article, start with that and see how
it works with your app. See how it works for your read activities.

Hope that helps.
Aaron

On 25 Jan, 2011, at 12:47 PM, Sam Hodgson  wrote:

Hi all,


I'm brand new to Cassandra - I'm migrating from MySQL for a
large forum site and would be grateful if anyone could give me
some basic pointers on schema design, or any recommended
documentation.

The example used in
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
is very close if not exactly what I need for my main CF:




How well would this scale? Say you are storing 5 million posts and looking to scale that up:
would it be better to segment them into several column families, and if so, to what extent?

I could create column families to store posts for each category, however I'd end up with thousands of CF's.
That said, the data would then be stored in a very sorted manner for querying/presenting.

My db is very write-heavy and growing fast; Cassandra sounds like the best solution.
Any advice is greatly appreciated!! 

Thanks

Sam





  


-- 
Andy Burgess
Principal Development Engineer
Application Delivery
WorldPay Ltd.
270-289 Science Park, Milton Road
Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
Office: +44 (0)1223 706 779| Mobile: +44 (0)7909 534 940
andy.burg...@worldpay.com

  

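
(Following Aaron's pointer above, a minimal 0.7-era cassandra-cli session for
creating a keyspace and a column family might look roughly like this. The
keyspace/CF names are placeholders and the exact option syntax varies between
0.7 releases, so lean on the CLI's built-in help:)

[code]
# bin/cassandra-cli -host localhost -port 9160
create keyspace Forum with replication_factor = 1;
use Forum;
create column family BlogEntries with comparator = UTF8Type;
describe keyspace Forum;
[/code]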

Re: Stress test inconsistencies

2011-01-25 Thread Anthony John
Look at iostat -x 10 10 when the active part of your test is running. There
should be something called svc_t - that should be in the 10ms range, and
await should be low.

That will tell you if IO is slow, or if IO is not being issued.

Also, ensure that you aren't swapping, with something like "swapon -s".
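
(A quick, illustrative check sequence along those lines - column names differ
between iostat versions and platforms:)

[code]
# extended device stats, 10-second samples, while the stress run is active
iostat -x 10 10

# confirm the box is not dipping into swap
swapon -s
[/code]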

On Tue, Jan 25, 2011 at 3:04 PM, Oleg Proudnikov wrote:

> buddhasystem  bnl.gov> writes:
>
> >
> >
> > Oleg,
> >
> > I'm a novice at this, but for what it's worth I can't imagine you can
> have a
> > _sustained_ 1kHz insertion rate on a single machine which also does some
> > reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't
> seem
> > to square with a typical seek time on a hard drive.
> >
> > Maxim
> >
>
> Maxim,
>
> As I understand during inserts Cassandra should not be constrained by
> random
> seek time as it uses sequential writes. I do get high numbers on Windows
> but
> there is something that is holding back my Linux server. I am trying to
> understand what it is.
>
> Oleg
>
>
>
>


Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread buddhasystem

I'm trying to re-partition my 4-node cluster to make the load exactly 25% on
each node.
As per recipes found in the documentation, I calculate:
>>> for x in xrange(4):
... print 2**127/4*x
...
0
42535295865117307932921825928971026432
85070591730234615865843651857942052864
127605887595351923798765477786913079296

And I need to move the first one to 0, then the second one to
42535295865117307932921825928971026432 etc.

Once I start the procedure, I see no progress when I look at nodetool
netstats. Nothing's happening. What am I doing wrong?

Thanks,

Maxim

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960843.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.
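
(For reference, the token moves described above are normally issued one node at
a time with nodetool; the host names are placeholders and 8080 is assumed to be
the 0.7 default JMX port:)

[code]
bin/nodetool -h node1 -p 8080 move 0
bin/nodetool -h node2 -p 8080 move 42535295865117307932921825928971026432
# ...and so on for the remaining nodes; watch progress with:
bin/nodetool -h node1 -p 8080 netstats
[/code]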


Re: Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread buddhasystem

Correction -- what I meant to say is that I do see announcements about streaming
in the output, but these are stuck at 0%.

-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


get_range_slices getting deleted rows

2011-01-25 Thread Nick Santini
Hi,
I'm trying a test scenario where I create 100 rows in a CF, then
use get_range_slices to get all the rows, and I get 100 rows - so far so good.
Then after the test I delete the rows using "remove" but without a column or
super column; this deletes the row, and I can confirm that because if I try to
get it with get_slice using the key I get nothing.

But then if I do get_range_slices again, where the range goes between new
byte[0] and new byte[0] (therefore returning everything), I still get the
100 row keys.

is that expected to be?

thanks

Nicolas Santini


Re: get_range_slices getting deleted rows

2011-01-25 Thread Narendra Sharma
Yes. See this http://wiki.apache.org/cassandra/FAQ#range_ghosts

-Naren

On Tue, Jan 25, 2011 at 2:59 PM, Nick Santini wrote:

> Hi,
> I'm trying a test scenario where I create 100 rows in a CF, then
> use get_range_slices to get all the rows, and I get 100 rows, so far so good
> then after the test I delete the rows using "remove" but without a column
> or super column, this deletes the row, I can confirm that cos if I try to
> get it with get_slice using the key I get nothing
>
> but then if I do get_range_slice again, where the range goes between new
> byte[0] and new byte[0] (therefore returning everything), I still get the
> 100 row keys
>
> is that expected to be?
>
> thanks
>
> Nicolas Santini
>


Fwd: CFP - Berlin Buzzwords 2011 - Search, Score, Scale

2011-01-25 Thread David G. Boney
This might interest the Cassandra community.
-
Sincerely,
David G. Boney
dbon...@semanticartifacts.com
http://www.semanticartifacts.com




Begin forwarded message:

> From: Isabel Drost 
> Date: January 25, 2011 2:53:28 PM CST
> To: u...@mahout.apache.org
> Cc: gene...@lucene.apache.org, gene...@hadoop.apache.org, 
> u...@hbase.apache.org, solr-u...@lucene.apache.org, 
> java-u...@lucene.apache.org, u...@nutch.apache.org
> Subject: CFP - Berlin Buzzwords 2011 - Search, Score, Scale
> Reply-To: u...@mahout.apache.org
> Reply-To: isa...@apache.org
> 
> This is to announce the Berlin Buzzwords 2011. The second edition of the 
> successful conference on scalable and open search, data processing and data 
> storage in Germany, taking place in Berlin.
> 
> Call for Presentations Berlin Buzzwords
>http://berlinbuzzwords.de
>   Berlin Buzzwords 2011 - Search, Store, Scale
> 6/7 June 2011
> The event will comprise presentations on scalable data processing. We invite 
> you to submit talks on the topics:
>* IR / Search - Lucene, Solr, katta or comparable solutions
>* NoSQL - like CouchDB, MongoDB, Jackrabbit, HBase and others
>* Hadoop - Hadoop itself, MapReduce, Cascading or Pig and relatives
>* Closely related topics not explicitly listed above are welcome. We are
>  looking for presentations on the implementation of the systems 
> themselves,
>  real world applications and case studies.
> 
> Important Dates (all dates in GMT +2)
>* Submission deadline: March 1st 2011, 23:59 MEZ
>* Notification of accepted speakers: March 22th, 2011, MEZ.
>* Publication of final schedule: April 5th, 2011.
>* Conference: June 6/7. 2011
> High quality, technical submissions are called for, ranging from principles 
> to practice. We are looking for real world use cases, background on the 
> architecture of specific projects and a deep dive into architectures built on 
> top of e.g. Hadoop clusters.
> 
> Proposals should be submitted at http://berlinbuzzwords.de/content/cfp-0 no 
> later than March 1st, 2011. Acceptance notifications will be sent out soon 
> after the submission deadline. Please include your name, bio and email, the 
> title of the talk, a brief abstract in English language. Please indicate 
> whether you want to give a lightning (10min), short (20min) or long (40min) 
> presentation and indicate the level of experience with the topic your 
> audience should have (e.g. whether your talk will be suitable for newbies or 
> is targeted for experienced users.) If you'd like to pitch your brand new 
> product in your talk, please let us know as well - there will be extra space 
> for presenting new ideas, awesome products and great new projects.
> 
> The presentation format is short. We will be enforcing the schedule 
> rigorously.
> 
> If you are interested in sponsoring the event (e.g. we would be happy to 
> provide videos after the event, free drinks for attendees as well as an 
> after-show party), please contact us.
> 
> Follow @hadoopberlin on Twitter for updates. Tickets, news on the conference, 
> and the final schedule are be published at http://berlinbuzzwords.de.
> 
> Program Chairs: Isabel Drost, Jan Lehnardt, and Simon Willnauer.
> Please re-distribute this CfP to people who might be interested.
> If you are local and wish to meet us earlier, please note that this Thursday 
> evening there will be an Apache Hadoop Get Together (videos kindly sponsored 
> by Cloudera, venue kindly provided for free by Zanox) featuring talks on 
> Apache Hadoop in production as well as news on current Apache Lucene 
> developments.
> 
> Contact us at:
> 
> newthinking communications 
> GmbH Schönhauser Allee 6/7 
> 10119 Berlin, 
> Germany 
> Julia Gemählich
> Isabel Drost 
> +49(0)30-9210 596



Re: get_range_slices getting deleted rows

2011-01-25 Thread Nick Santini
thanks,
so I need to check the returned slice for the key to verify that it is a valid
row and not a deleted one?

Nicolas Santini



On Wed, Jan 26, 2011 at 12:16 PM, Narendra Sharma  wrote:

> Yes. See this http://wiki.apache.org/cassandra/FAQ#range_ghosts
>
> -Naren
>
>
> On Tue, Jan 25, 2011 at 2:59 PM, Nick Santini wrote:
>
>> Hi,
>> I'm trying a test scenario where I create 100 rows in a CF, then
>> use get_range_slices to get all the rows, and I get 100 rows, so far so good
>> then after the test I delete the rows using "remove" but without a column
>> or super column, this deletes the row, I can confirm that cos if I try to
>> get it with get_slice using the key I get nothing
>>
>> but then if I do get_range_slice again, where the range goes between new
>> byte[0] and new byte[0] (therefore returning everything), I still get the
>> 100 row keys
>>
>> is that expected to be?
>>
>> thanks
>>
>> Nicolas Santini
>>
>
>


RE: Errors During Compaction

2011-01-25 Thread Dan Hendry
Limited joy, I would say :) No long-term damage at least.

 

I ended up deleting (moving to another disk) all the sstables, which fixed the
problem. I ran into even more problems during repair (detailed in another
recent email) but it seems to have worked regardless. Just to be safe, I am in
the process of starting a 'manual repair' (copying SSTables from other nodes
for this particular CF, then restarting and running a cleanup + major
compaction).

 

Any thoughts on what the root cause of this problem could be? It is somewhat
worrying that a CF can randomly become corrupt, bringing down the whole node.
Cassandra's handling of a corrupt CF (regardless of how rare an occurrence) is
less than elegant.

 

Dan

 

From: Aaron Morton [mailto:aa...@thelastpickle.com] 
Sent: January-25-11 16:03
To: user@cassandra.apache.org
Subject: Re: Errors During Compaction

 

Dan how did you go with this? More joy, less joy or a continuation of the 
current level of joy?

 

Aaron

 


On 24/01/2011, at 9:38 AM, Dan Hendry  wrote:

I have run into a strange problem and was hoping for suggestions on how to fix 
it (0.7.0). When compaction occurs on one node for what appears to be one 
specific column family, the following error pops up in the Cassandra log. 
Compaction apparently fails and temp files don’t get cleaned up. After a while 
and what seems to be multiple failed compactions on the CF, the node runs out 
of disk space and crashes. Not sure if it is a related problem or a function of 
this being a heavily used column family but after failing to compact, 
compaction restarts on the same CF exacerbating the issue.

 

Problems with this specific node started earlier this weekend when it crashed 
with an OOM error. This is quite surprising since my memtable thresholds and 
GC settings have been tuned to run with quite a bit of overhead during normal 
operation (max heap usage usually <= 10 GB on a 12 GB heap, average usage of 
6-8 GB). I could not find anything abnormal in the logs which would prompt an 
OOM.

 

I will look things over tomorrow and try to provide a bit more information on 
the problem but as a solution, I was going to wipe out all SSTables for this CF 
on this node and then run a repair. Far from ideal, is this a reasonable 
solution?

 

 

ERROR [CompactionExecutor:1] 2011-01-23 14:10:29,855 
AbstractCassandraDaemon.java (line 91) Fatal exception in thread 
Thread[CompactionExecutor:1,1,RMI Runtime]

java.io.IOError: java.io.EOFException: attempted to skip -1983579368 bytes but 
only skipped 0

at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:78)

at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)

at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)

at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)

at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)

at 
org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)

at 
org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)

at 
org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)

at 
org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)

at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)

at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)

at 
org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)

at 
org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)

at 
org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)

at 
org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)

at 
org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)

at java.util.concurrent.FutureTask.run(FutureTask.java:138)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)

Caused by: java.io.EOFException: attempted to skip -1983579368 bytes but only 
skipped 0

at 
org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:52)

at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)

... 20 more

 

Dan Hendry

(403) 660-2297

 


Re: get_range_slices getting deleted rows

2011-01-25 Thread Roshan Dawrani
No, checking the key will not do.

You will need to check if row.getColumnSlice().getColumns() is empty or not.
That's what I do and it works for me.
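
For instance, here is a minimal sketch using Hector 0.7 (the class and method
names, the column family parameter and the row count of 100 are placeholders
for this example, not something from the thread); a row whose column slice
comes back empty is a range ghost and gets skipped:

import java.util.ArrayList;
import java.util.List;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public class RangeGhostFilter {
    // Returns only the keys of rows that still have live columns; rows whose
    // column slice is empty are range ghosts (deleted rows) and are skipped.
    public static List<String> liveKeys(Keyspace keyspace, String columnFamily) {
        StringSerializer se = StringSerializer.get();
        RangeSlicesQuery<String, String, String> query =
                HFactory.createRangeSlicesQuery(keyspace, se, se, se);
        query.setColumnFamily(columnFamily);
        query.setKeys("", "");
        query.setRange("", "", false, 1); // one column is enough to prove the row is live
        query.setRowCount(100);

        QueryResult<OrderedRows<String, String, String>> result = query.execute();
        List<String> live = new ArrayList<String>();
        for (Row<String, String, String> row : result.get()) {
            if (!row.getColumnSlice().getColumns().isEmpty()) {
                live.add(row.getKey());
            }
        }
        return live;
    }
}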

On Wed, Jan 26, 2011 at 4:53 AM, Nick Santini wrote:

> thanks,
> so I need to check the returned slice for the key to verify that it is a valid
> row and not a deleted one?
>
> Nicolas Santini
>
>
>
> On Wed, Jan 26, 2011 at 12:16 PM, Narendra Sharma <
> narendra.sha...@gmail.com> wrote:
>
>> Yes. See this http://wiki.apache.org/cassandra/FAQ#range_ghosts
>>
>> -Naren
>>
>>
>> On Tue, Jan 25, 2011 at 2:59 PM, Nick Santini wrote:
>>
>>> Hi,
>>> I'm trying a test scenario where I create 100 rows in a CF, then
>>> use get_range_slices to get all the rows, and I get 100 rows, so far so good
>>> then after the test I delete the rows using "remove" but without a column
>>> or super column; this deletes the row, and I can confirm that because if I
>>> try to get it with get_slice using the key I get nothing
>>>
>>> but then if I do get_range_slice again, where the range goes between new
>>> byte[0] and new byte[0] (therefore returning everything), I still get the
>>> 100 row keys
>>>
>>> is that expected behaviour?
>>>
>>> thanks
>>>
>>> Nicolas Santini
>>>
>>
>>
>


Re: Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread Aaron Morton
It can take a bit of thinking time for the nodes to work out what to stream; the bottom of this page http://wiki.apache.org/cassandra/Streaming talks about how to watch what's happening. If it does get stuck, let us know.

Aaron

On 26 Jan, 2011, at 11:42 AM, buddhasystem wrote:
Correction: what I meant to say is that I do see announcements about streaming
in the output, but these are stuck at 0%.



RE: the java client problem

2011-01-25 Thread Raoyixuan (Shandy)
I have found loadSchemaFromYAML in JConsole. How do I load the schema?

From: Ashish [mailto:paliwalash...@gmail.com]
Sent: Friday, January 21, 2011 8:10 PM
To: user@cassandra.apache.org
Subject: Re: the java client problem

check cassandra-install-dir/conf/cassandra.yaml

start cassandra
connect via jconsole
find MBeans -> org.apache.cassandra.db -> StorageService -> Operations -> loadSchemaFromYAML

load the schema
and then try the example again.
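
If jconsole is not handy, the same MBean operation can also be invoked
programmatically. A minimal sketch, assuming a 0.7 node with JMX on the default
port 8080 and the standard StorageService MBean name (adjust both for your
setup):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class LoadSchemaFromYaml {
    public static void main(String[] args) throws Exception {
        // Assumes JMX is listening on localhost:8080 (the 0.7 default)
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");
            // The same operation jconsole shows under Operations
            mbs.invoke(storageService, "loadSchemaFromYAML",
                    new Object[0], new String[0]);
        } finally {
            connector.close();
        }
    }
}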

HTH
ashish

2011/1/21 raoyixuan (Shandy) <raoyix...@huawei.com>
Which schema is it?
From: Ashish [mailto:paliwalash...@gmail.com]
Sent: Friday, January 21, 2011 7:57 PM
To: user@cassandra.apache.org
Subject: Re: the java client problem

you are missing the column family in your keyspace.

If you are using the default definitions of schema shipped with cassandra, 
ensure to load the schema from JMX.

thanks
ashish
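
If loading the YAML schema is not an option, the keyspace and column family can
also be created from code, provided the Hector build in use exposes the schema
manipulation API (addKeyspace and the definition factory methods; that is an
assumption about your Hector version, so fall back to the JMX route if it does
not). A minimal sketch:

import java.util.Arrays;

import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

public class CreateSchema {
    public static void main(String[] args) {
        // Host and port are placeholders
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "localhost:9160");

        // Define the Standard1 column family and a replication-factor-1 keyspace around it
        ColumnFamilyDefinition cfDef =
                HFactory.createColumnFamilyDefinition("Shandy", "Standard1");
        KeyspaceDefinition ksDef = HFactory.createKeyspaceDefinition(
                "Shandy", "org.apache.cassandra.locator.SimpleStrategy", 1,
                Arrays.asList(cfDef));

        cluster.addKeyspace(ksDef);
        cluster.getConnectionManager().shutdown();
    }
}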
2011/1/21 raoyixuan (Shandy) <raoyix...@huawei.com>
I executed the code below using the Hector client:

package com.riptano.cassandra.hector.example;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.exceptions.HectorException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class InsertSingleColumn {
    private static StringSerializer stringSerializer = StringSerializer.get();

    public static void main(String[] args) throws Exception {
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "*.*.*.*:9160");

        Keyspace keyspaceOperator = HFactory.createKeyspace("Shandy", cluster);

        try {
            Mutator<String> mutator =
                    HFactory.createMutator(keyspaceOperator, StringSerializer.get());
            mutator.insert("jsmith", "Standard1",
                    HFactory.createStringColumn("first", "John"));

            ColumnQuery<String, String, String> columnQuery =
                    HFactory.createStringColumnQuery(keyspaceOperator);
            columnQuery.setColumnFamily("Standard1").setKey("jsmith").setName("first");
            QueryResult<HColumn<String, String>> result = columnQuery.execute();

            System.out.println("Read HColumn from cassandra: " + result.get());
            System.out.println("Verify on CLI with:  get Keyspace1.Standard1['jsmith'] ");

        } catch (HectorException e) {
            e.printStackTrace();
        }
        cluster.getConnectionManager().shutdown();
    }
}

And it shows this error:

me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
InvalidRequestException(why:unconfigured columnfamily Standard1)
  at 
me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:88)
  at 
me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89)
  at 
me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:142)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
  at 
me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
  at 
me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:146)
  at 
me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
 at 
me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
  at 
me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
  at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:55)
  at 
com.riptano.cassandra.hector.example.InsertSingleColumn.main(InsertSingleColumn.java:21)
Caused by: InvalidRequestException(why:unconfigured columnfamily Standard1)
  at 
org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16477)
  at 
org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
  at 
org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
  at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:93)
  ... 13 more


华为技术有限公司 Huawei Technologies Co., Ltd.



Phone: 28358610
Mobile: 13425182943
Email: raoyix...@huawei.com

Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Patrik Modesto
On Tue, Jan 25, 2011 at 19:09, Mick Semb Wever  wrote:

> In fact i have another problem (trying to write an empty byte[], or
> something, as a key, which put one whole row out of whack, ((one row in
> 25 million...))).
>
> But i'm debugging along the same code.
>
> I don't quite understand how the byte[] in
> ByteBuffer.wrap(key.getBytes(),...)
> gets clobbered.

A code snippet would help here.

> Well your key is a mutable Text object, so i can see some possibility
> depending on how hadoop uses these objects.
> Is there something to ByteBuffer.allocate(..) i'm missing...

I don't know; I'm quite new to Java (but with a long C++ history).

> btw.
>  is "d.timestamp = System.currentTimeMillis();" ok?
>  shouldn't this be microseconds so that each mutation has a different
> timestamp? http://wiki.apache.org/cassandra/DataModel

You are correct that microseconds would be better, but for the test it
doesn't matter that much.
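
In case unique timestamps do turn out to matter, here is a minimal sketch of a
microsecond-style clock (a hypothetical helper, not part of the code being
discussed) that could be used as d.timestamp = MicrosecondClock.next();

import java.util.concurrent.atomic.AtomicLong;

public final class MicrosecondClock {
    private static final AtomicLong last = new AtomicLong();

    // Millisecond wall clock scaled to microseconds, bumped by one when two
    // calls land in the same millisecond so no two mutations share a timestamp.
    public static long next() {
        while (true) {
            long now = System.currentTimeMillis() * 1000;
            long prev = last.get();
            long candidate = (now > prev) ? now : prev + 1;
            if (last.compareAndSet(prev, candidate)) {
                return candidate;
            }
        }
    }
}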

Patrik


Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mck

> >  is "d.timestamp = System.currentTimeMillis();" ok?
> 
> You are correct that microseconds would be better but for the test it
> doesn't matter that much. 

Have you tried? I'm very new to Cassandra as well, and always uncertain
as to what to expect...


> ByteBuffer bbKey = ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, 
> key.getLength())); 

An alternative approach to your client-side cloning is 

  ByteBuffer bbKey = ByteBuffer.wrap(key.toString().getBytes(UTF_8)); 

Here at least it is obvious you are passing in the bytes from an immutable 
object.

As for moving the clone(..) into ColumnFamilyRecordWriter.write(..),
won't this hurt performance? Normally I would _always_ agree that a
defensive copy of an array/collection argument should be stored, but has this
intentionally not been done (or should it be) because of large reduce jobs
(millions of records) and the performance impact here?

The key isn't the only potential live byte[]. You also have names and
values in all the columns (and supercolumns) for all the mutations.
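
To illustrate, a hypothetical reducer sketch (the "url" column name, the value
handling and the output types are assumptions made for the example, not taken
from Patrik's job) that copies every Hadoop-owned byte[] before it goes into a
Mutation:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.Mutation;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class UrlReducer extends Reducer<Text, Text, ByteBuffer, List<Mutation>> {

    // Copies the live Hadoop-owned byte[] into a fresh array before wrapping it.
    private static ByteBuffer copy(byte[] bytes, int length) {
        return ByteBuffer.wrap(Arrays.copyOf(bytes, length));
    }

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        ByteBuffer rowKey = copy(key.getBytes(), key.getLength()); // the key Text is reused by Hadoop
        for (Text value : values) {
            Column c = new Column();
            c.setName(ByteBuffer.wrap("url".getBytes("UTF-8")));   // fresh byte[] already
            c.setValue(copy(value.getBytes(), value.getLength())); // the value Text is reused too
            c.setTimestamp(System.currentTimeMillis() * 1000);
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(c);
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            context.write(rowKey, Collections.singletonList(m));
        }
    }
}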


~mck