Data Model for One To Many - Itemcontainer - Items

2018-06-05 Thread malte

hi,

I have two CFs, "ItemContainer" and "Items".

I used to have a secondary index in "Items" referring to the
"ItemContainer". Something like:


CREATE TABLE items (key uuid PRIMARY KEY, container uuid, slot int);
CREATE INDEX items_container ON items (container);
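
With that index, the lookup was basically the following (a simplified sketch
against the example schema above; the UUID is just a placeholder):

SELECT * FROM items WHERE container = 11111111-1111-1111-1111-111111111111;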

i change the "container" cell quite often when changing the  
itemcontainer. Documentation says that a secondary index shouldnt be  
used in this case.


So I tried something like:

 PRIMARY KEY (container, key)

in "Items". Now I can query all items for an ItemContainer just fine,
but how do I move an item to another ItemContainer? You can't update
parts of the primary key, so do I really have to delete the item and
reinsert all the data with a different "container" value?
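
Concretely, the "move" I have in mind would be something like this (just a
sketch against the simplified schema above; the UUIDs and slot are
placeholders, and the real table has many more columns):

BEGIN BATCH
  -- remove the row from the old container's partition
  DELETE FROM items
   WHERE container = 11111111-1111-1111-1111-111111111111   -- old container (placeholder)
     AND key = 22222222-2222-2222-2222-222222222222;        -- item key (placeholder)
  -- ...and reinsert the full row under the new container
  INSERT INTO items (container, key, slot)
  VALUES (33333333-3333-3333-3333-333333333333,             -- new container (placeholder)
          22222222-2222-2222-2222-222222222222, 1);
APPLY BATCH;

(A logged batch spanning two partitions goes through the batchlog, so it's
not free either.)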


Doesn't this create a lot of tombstones? Also "Items" has like 20  
columns with maps and lists and everything...


Any ideas?



Re: Secondary Index Cleanup

2018-03-02 Thread malte


We use 3.11.0 on Linux.



Which C* version do you use? Sounds like the secondary index is very
out of sync with the parent CF.

On Fri, Mar 2, 2018 at 6:23 AM, Malte Krüger wrote:


hi,

We have a CF which is about 2 GB in size; it has a secondary index on one
field (UUID).

The index has a size on disk of about 10 GB. It only shrinks a little when
forcing a compaction through JMX.

If I use sstabledump I see a lot of these:

"partition" : {
  "key" : [ "123c50d1-1ceb-489d-8427-2f34065325f8" ],
  "position" : 306166973
},
"rows" : [
  {
"type" : "row",
"position" : 306167031,
"clustering" : [ "f28f46930805495aa7d6cba291d92e87" ],
"liveness_info" : { "tstamp" : "2017-10-30T16:49:37.160361Z" },
"cells" : [ ]
  },

...

Normally I can find the key as an indexed field, but most of the keys in
the dump no longer exist in the parent CF.

These keys are sometimes months old (we have gc_grace_seconds set to 30
minutes).

Running nodetool rebuild_index does not help, but if I drop the index
and recreate it, the size goes down to several hundred MB!


Why does the cleanup not work automatically, and how can I fix this?

-Malte



--
Dikang






Cassandra 2.1.16 Release

2016-08-14 Thread Malte Pickhan
Hey,

I'd like to ask when you are going to release Cassandra 2.1.16, especially
because of https://issues.apache.org/jira/browse/CASSANDRA-11850

Best,

Malte


Re: data not replicated on new node

2016-11-23 Thread Malte Pickhan
Not sure if it's really related, but we experienced something similar last
Friday. I summarized it in the following issue:

https://issues.apache.org/jira/browse/CASSANDRA-12947

Best,

Malte
2016-11-23 10:21 GMT+01:00 Oleksandr Shulgin :

> On Tue, Nov 22, 2016 at 5:23 PM, Bertrand Brelier <
> bertrand.brel...@gmail.com> wrote:
>
>> Hello Shalom.
>>
>> No, I really went from 3.1.1 to 3.0.9.
>>
> So you've just installed the 3.0.9 version and re-started with it?  I
> wonder if it's really supported?
>
> Regards,
> --
> Alex
>
>


Re: Unreliable JMX metrics

2017-01-17 Thread Malte Pickhan
Hey Guan,

If you actually replaced the nodes and already assigned new IPs to them,
this is probably expected behaviour, since Cassandra retains dead nodes in
its gossip state for 72 hours.
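
To see whether the old endpoints are still hanging around, you could compare
the gossip state with what nodetool status reports, e.g. (the IP is just a
placeholder):

# gossip state as seen by one node; replaced endpoints linger here for ~72h
nodetool -h 10.0.0.1 gossipinfo

# compare with the ring/cluster view
nodetool -h 10.0.0.1 status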

Best,

Malte

2017-01-17 0:29 GMT+01:00 Sun, Guan :

> Hi all,
>
> We have a 6 node Cassandra cluster running on Amazon EC2. We replaced the
> nodes one by one while Cassandra was running several days ago. After the
> replacement, nodetool status showed that all the nodes were up and
> running. However, the JMX metrics about down nodes kept showing there were
> 6 nodes down for about 2.5 days before it became 0.
>
> Has anyone seen this before? Any idea why the JMX metrics take so long to
> reflect the actual status of the cluster?
>
> Thanks,
> Guan
>



-- 
Malte Pickhan

Software Engineer

Zalando Payments SE & Co. KG


Resources for fire drills

2017-03-01 Thread Malte Pickhan
Hi Cassandra users,

I am looking for some resources/guides for fire-drill scenarios with Apache
Cassandra.

Do you know anything like that?

Best,

Malte

Re: Resources for fire drills

2017-03-01 Thread Malte Pickhan
Yeah, that's the point.

What I mean is some overview of basic scenarios for fire drills, so that you
can exercise them with your team.

Best


> On 1 Mar 2017, at 11:01, benjamin roth  wrote:
> 
> Could you specify it a little bit? There are really a lot of things that can 
> go wrong.
> 
> 2017-03-01 10:59 GMT+01:00 Malte Pickhan :
> Hi Cassandra users,
> 
> I am looking for some resources/guides for firedrill scenarios with apache 
> cassandra.
> 
> Do you know anything like that?
> 
> Best,
> 
> Malte
> 



Re: Resources for fire drills

2017-03-01 Thread Malte Pickhan
Hi,

Really cool that this discussion is getting attention.

You are right, my question was quite open.

For me it would already be helpful to compile a list like the one Ben started,
with scenarios that can happen to a cluster and the actions/strategies you have
to take to resolve the incident without losing data and while keeping the
cluster healthy.

Ideally we would add some kind of rating of how hard the scenario is to
resolve, so that teams can go through a kind of learning curve.

To begin with, I think it would already be sufficient to document the steps
for getting a cluster into the situation described in the scenario.

Hope it’s a bit clearer now what I mean.

Is there some kind of community space where we could start a document for this 
purpose?

Best,

Malte

> On 1 Mar 2017, at 13:33, Stefan Podkowinski  wrote:
> 
> I've been thinking about this for a while, but haven't found a practical
> solution yet, although the term "fire drill" leaves a lot of room for
> interpretation. The most basic requirements I'd have for these kind of
> trainings would start with automated cluster provisioning for each
> scenario (either for teams or individuals) and provisioning of test data
> for the cluster, with optionally some kind of load generator constantly
> running in the background. I started to work on some Ansible scripts
> that would do that on AWS a couple of months ago, but it turned out to
> be a lot of work with all the details you have to take care of. So I'd
> be happy to hear about any existing resources on that as well!
> 
> 
> On 01.03.2017 10:59, Malte Pickhan wrote:
>> Hi Cassandra users,
>> 
>> I am looking for some resources/guides for firedrill scenarios with apache 
>> cassandra.
>> 
>> Do you know anything like that?
>> 
>> Best,
>> 
>> Malte
>> 



Secondary Index Cleanup

2018-03-02 Thread Malte Krüger

hi,

We have a CF which is about 2 GB in size; it has a secondary index on
one field (UUID).


The index has a size on disk of about 10 GB. It only shrinks a little
when forcing a compaction through JMX.


If I use sstabledump I see a lot of these:

    "partition" : {
  "key" : [ "123c50d1-1ceb-489d-8427-2f34065325f8" ],
  "position" : 306166973
    },
    "rows" : [
  {
    "type" : "row",
    "position" : 306167031,
    "clustering" : [ "f28f46930805495aa7d6cba291d92e87" ],
    "liveness_info" : { "tstamp" : "2017-10-30T16:49:37.160361Z" },
    "cells" : [ ]
  },

...

Normally I can find the key as an indexed field, but most of the keys in
the dump no longer exist in the parent CF.


These keys are sometimes months old (we have gc_grace_seconds set to 30
minutes).
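
(That was set with something like the following; the keyspace/table names
here are just placeholders:)

ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 1800;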


Running nodetool rebuild_index does not help, but if I drop the index
and recreate it, the size goes down to several hundred MB!
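
In other words, the two things I tried look roughly like this (keyspace,
table, index and column names are placeholders):

# shell: rebuild the existing index in place -- this did not reclaim space
nodetool rebuild_index my_keyspace my_table my_index

-- cqlsh: dropping and recreating the index shrinks it to a few hundred MB
DROP INDEX my_keyspace.my_index;
CREATE INDEX my_index ON my_keyspace.my_table (indexed_uuid);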



Why does the cleanup not work automatically, and how can I fix this?


-Malte

