Wouldn't shock me if shuffle wasn't all that performant (not to knock
shuffle... our case is somewhat specific).
We added 3 nodes with num_tokens=256 and it worked great, the load was evenly
spread.
On Sun, Mar 24, 2013 at 1:14 PM, aaron morton wrote:
> We initially tried to run a shuffle, however it seemed to be going really
> slowly (very little progress by watching "cassandra-shuffle ls | wc -l" after
> 5-6 hours and no errors in logs),
On Mon, Mar 25, 2013 at 1:35 AM, aaron morton wrote:
>> I tried to wrap 'name' to bytes('name'), but it would throw "can not parse
>> FUNCTION_CALL as hex bytes", seems this does not work.
>
> What was the statement you used and what was the error?
OK, I have tried using ascii code 6e616d65 (name)
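For what it's worth, the hex string in question is just 'name' encoded as ASCII bytes; a quick check in plain Python (no Cassandra needed):

```python
# The BytesType comparator expects column names as hex strings.
# 'name' as ASCII bytes is 6e 61 6d 65, hence "6e616d65" on the CLI.
name_hex = "name".encode("ascii").hex()
print(name_hex)  # -> 6e616d65

# And decoding the hex string gives back the original column name.
assert bytes.fromhex("6e616d65").decode("ascii") == "name"
```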
Thanks Aaron. I have a hypothetical question.
Assume you have four nodes and a snapshot is taken. The following day, if a
node goes down and data is corrupted through user error, how do you use the
previous night's snapshots?
Would you replace the faulty node first and then restore last night's
> There are advantages and disadvantages in both approaches. What are people
> doing in their production systems?
Generally a mix of snapshots+rsync or https://github.com/synack/tablesnap to
get things off node.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
> "select CPUTime,User,site from CF(or tablename) where user=xxx and
> Jobtype=xxx"
Even though Cassandra has tables and looks like an RDBMS, it's not.
Queries with multiple secondary index clauses will not perform as well as those
with none.
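As a rough illustration only (plain Python, not the actual Cassandra code path): with several equality clauses, Cassandra effectively walks one secondary index and filters the candidate rows against the remaining clauses, so each extra clause adds per-row filtering work rather than another cheap index lookup:

```python
# Hypothetical sketch of why multiple secondary index clauses are slower.
rows = [
    {"user": "alice", "JobType": "batch", "CPUTime": 12},
    {"user": "alice", "JobType": "stream", "CPUTime": 7},
    {"user": "bob", "JobType": "batch", "CPUTime": 3},
]

# Step 1: one index lookup satisfies the first clause (user = alice)...
candidates = [r for r in rows if r["user"] == "alice"]

# Step 2: ...every remaining clause is a scan-and-filter over the candidates.
result = [r for r in candidates if r["JobType"] == "batch"]
print(result)  # one matching row for alice/batch
```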
There is plenty of documentation here http://www.da
The best thing to do is start with a look at ByteOrderedPartitioner and
AbstractByteOrderedPartitioner.
You'll want to create a new TimeUUIDToken extends Token and a new
UUIDPartitioner that extends AbstractPartitioner<>
Usual disclaimer that ordered partitioners cause problems with load balancing.
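For intuition only (the real work is in the Java classes above, and the name TimeUUIDToken is just the suggestion from this thread), here is the ordering such a token would impose, sketched in Python: version-1 UUIDs embed a 60-bit timestamp, exposed as `uuid.UUID.time`, and sorting by it gives chronological order:

```python
import uuid

# Generate a few version-1 (time-based) UUIDs.
ids = [uuid.uuid1() for _ in range(3)]

# A time-ordered partitioner would compare tokens by this embedded
# 60-bit timestamp rather than by the raw UUID bytes.
by_time = sorted(ids, key=lambda u: u.time)

# The result is chronological order of creation.
assert [u.time for u in by_time] == sorted(u.time for u in ids)
```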
>
> I tried to wrap 'name' to bytes('name'), but it would throw "can not parse
> FUNCTION_CALL as hex bytes", seems this does not work.
What was the statement you used and what was the error?
> So the stored bytes are the same, right?
Yes.
-
Aaron Morton
Freelance Cassandra
It's always been like that, see
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Column.java#L231
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 24/03/2013, at 4:18 PM, dong.yajun wrote:
> Device:  tps     kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
> xvdap1   0.13    0.00       1.07       0        16
> xvdb     474.20  13524.53   25.33      202868   380
> xvdc     469.87  13455.73   30.40      201836   456
Perchan
> We initially tried to run a shuffle, however it seemed to be going really
> slowly (very little progress by watching "cassandra-shuffle ls | wc -l" after
> 5-6 hours and no errors in logs),
My guess is that shuffle is not designed to be as efficient as possible, as it
is only used once. Was it con
> compaction needs some disk I/O. Slowing down our compaction will improve
> overall system performance. Of course, you don't want to go too slow and fall
> behind too much.
In this case I was thinking of the memory use.
Compaction tasks are a bit like a storm of reads. If you are having problems
> I could imagine a scenario where a hint was replayed to a replica after all
> replicas had purged their tombstones
Scratch that, the hints are TTL'd with the lowest gc_grace.
Ticket closed https://issues.apache.org/jira/browse/CASSANDRA-5379
Cheers
-
Aaron Morton
Freelance Ca
> From this mailing list I found this Github project that is doing something
> similar by looking at the commit logs:
> https://github.com/carloscm/cassandra-commitlog-extract
IMHO tailing the logs is fragile, and you may be better off handling it at
the application level.
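A minimal sketch of what "handling it at the application level" could look like (all names here are hypothetical, not from any library): the application publishes a change event alongside each write, instead of scraping commit logs after the fact:

```python
from queue import Queue

# Hypothetical stand-ins: a change feed and a "database".
change_feed = Queue()
db = {}

def db_write(key, value):
    db[key] = value                           # the real write
    change_feed.put(("upsert", key, value))   # app-level change event

db_write("row1", {"name": "x"})
event = change_feed.get()
print(event)  # ('upsert', 'row1', {'name': 'x'})
```

The downside is a dual-write consistency window, but it avoids depending on the commit log's on-disk format.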
> But is there other
The biggest advantages of Cassandra are its ability to scale linearly as more
nodes are added and its ability to handle node failures.
Also to get the maximum performance from Cassandra you need to be making
multiple requests in parallel.
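As a hedged sketch of that last point (`fetch_row` is a hypothetical stand-in for a real driver call), issuing reads through a thread pool keeps multiple requests in flight at once:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_row(key):
    # Placeholder for a per-key read against the cluster.
    return {"key": key, "value": f"value-{key}"}

keys = ["k1", "k2", "k3", "k4"]

# map() submits all keys at once and yields results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_row, keys))

print([r["key"] for r in results])  # ['k1', 'k2', 'k3', 'k4']
```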
On Sun, Mar 24, 2013 at 3:15 AM, 张刚 wrote:
> Hello,
> I am
For example, each row represents a job record; it has fields like
"user", "site", "CPUTime", "datasize", "JobType"...
The fields in the CF are fixed, just like a table. The query looks like this:
"select CPUTime,User,site from CF(or tablename) where user=xxx and Jobtype=xxx"
Best regards
2013/3/24 cem
> Hi,
>
> Co
Hi,
I store rows in my system where the key is a UUID version 1 (TimeUUID). I
would like to keep rows ordered by time. I know that in this case it is
recommended to use an external CF where column names are UUIDs ordered by
time. But in my use case this is not possible, so I would like to use a
c
Hi,
Could you provide some other details about your schema design and queries?
It is very hard to tell anything.
Regards,
Cem
On Sun, Mar 24, 2013 at 12:40 PM, dong.yajun wrote:
> Hello,
>
> I'd suggest you take a look at the difference between NoSQL and RDBMS.
>
> Best,
>
> On Sun, Mar 24, 2
Hello,
I'd suggest you take a look at the difference between NoSQL and RDBMS.
Best,
On Sun, Mar 24, 2013 at 5:15 PM, 张刚 wrote:
> Hello,
> I am new to Cassandra. I did some tests on a single machine. I installed
> Cassandra with a binary tarball distribution.
> I create a CF to store the data that
On Sun, Mar 24, 2013 at 1:45 AM, aaron morton wrote:
> But an error is thrown saying "can not parse name as hex bytes".
>
> If the comparator is Bytes then the column names need to be a hex string.
>
> The easiest thing to do is create a CF where the comparator is UTF8Type so
> you can use string c
Thanks, aaron.
On Sun, Mar 24, 2013 at 1:45 AM, aaron morton wrote:
> But an error is thrown saying "can not parse name as hex bytes".
>
> If the comparator is Bytes then the column names need to be a hex string.
>
> The easiest thing to do is create a CF where the comparator is UTF8Type so
> you
Hello,
I am new to Cassandra. I did some tests on a single machine. I installed
Cassandra with a binary tarball distribution.
I created a CF to store the data I get from MySQL. The CF has the same
fields as the table in MySQL, so it looks like a table.
I do the same select on the CF in Cassandra and