It can handle some millions of columns, but not more like 10M. I mean, a request for such a row concentrates on a particular node, so the performance degrades.
> I also had idea for semi-ordered partitioner - instead of single MD5, to have two MD5's.

It works for us with a wide row of about 40-50 M, but with lots of problems.

My research with get_count() shows the first minor problems at 14-15K columns in a row, and then it just gets worse.

On Fri, Aug 23, 2013 at 2:47 AM, Takenori Sato <ts...@cloudian.com> wrote:

> Hi Nick,
>
> > token and key are not same. it was like this long time ago (single MD5
> > assumed single key)
>
> True. That reminds me of making a test with the latest 1.2 instead of our
> current 1.0!
>
> > if you want ordered, you probably can arrange your data in a way so you
> > can get it in ordered fashion.
>
> Yeah, we have done for a long time. That's called a wide row, right? Or a
> compound primary key.
>
> It can handle some millions of columns, but not more like 10M. I mean, a
> request for such a row concentrates on a particular node, so the
> performance degrades.
>
> > I also had idea for semi-ordered partitioner - instead of single MD5,
> > to have two MD5's.
>
> Sounds interesting. But, we need a fully ordered result.
>
> Anyway, I will try with the latest version.
>
> Thanks,
> Takenori
>
> On Thu, Aug 22, 2013 at 6:12 PM, Nikolay Mihaylov <n...@nmmm.nu> wrote:
>
>> my five cents -
>> token and key are not same. it was like this long time ago (single MD5
>> assumed single key)
>>
>> if you want ordered, you probably can arrange your data in a way so you
>> can get it in ordered fashion.
>> for example long ago, i had single column family with single key and
>> about 2-3 M columns - I do not suggest you to do it this way, because is
>> wrong way, but it is easy to understand the idea.
>>
>> I also had idea for semi-ordered partitioner - instead of single MD5, to
>> have two MD5's.
>> then you can get semi-ordered ranges, e.g. you get ordered all cities in
>> Canada, all cities in US and so on.
>> however in this way things may get pretty unbalanced
>>
>> Nick
>>
>> On Thu, Aug 22, 2013 at 11:19 AM, Takenori Sato <ts...@cloudian.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to implement a custom partitioner that evenly distributes,
>>> yet preserves order.
>>>
>>> The partitioner returns a token by BigInteger as RandomPartitioner does,
>>> while does a decorated key by string as OrderPreservingPartitioner does.
>>> * for now, since IPartitioner<T> does not support different types for
>>> token and key, BigInteger is simply converted to string
>>>
>>> Then, I played around with cassandra-cli. As expected, in my 3 nodes
>>> test cluster, get/set worked, but list(get_range_slices) didn't.
>>>
>>> This came from a challenge to overcome a wide row scalability. So, I
>>> want to make it work!
>>>
>>> I am aware that some efforts are required to make get_range_slices work.
>>> But are there any other critical problems? For example, it seems there is
>>> an assumption that token and key are the same. If this is throughout the
>>> whole C* code, this partitioner is not practical.
>>>
>>> Or have you tried something similar?
>>>
>>> I would appreciate your feedback!
>>>
>>> Thanks,
>>> Takenori
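
For anyone reading this thread later, here is a rough sketch of one possible reading of the "two MD5's" idea quoted above. It is not Cassandra code and not necessarily what Nick had in mind. The helper names (md5_token, semi_ordered_token), the ":" separator, and the sample keys are all made up for illustration, and md5_token is not Cassandra's exact token computation. The token is a pair: the MD5 of a key prefix (e.g. a country) and the MD5 of the rest of the key. Sorting by that pair keeps all keys with the same prefix next to each other on the ring, while the groups themselves are scattered by hash.

```python
import hashlib
from typing import Tuple


def md5_token(data: bytes) -> int:
    """MD5 digest read as a non-negative 128-bit integer (illustrative only)."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def semi_ordered_token(key: str, sep: str = ":") -> Tuple[int, int]:
    """Hypothetical two-part token: (MD5 of prefix, MD5 of remainder).

    Assumes keys look like "<prefix><sep><rest>", e.g. "CA:Toronto".
    """
    prefix, _, rest = key.partition(sep)
    return (md5_token(prefix.encode("utf-8")), md5_token(rest.encode("utf-8")))


# Made-up sample keys: country prefix, then city.
keys = ["CA:Toronto", "US:Boston", "CA:Vancouver", "US:Austin", "CA:Montreal"]

# Ring order under this token: keys are compared by prefix hash first,
# so every key sharing a prefix ends up in one contiguous run.
ring = sorted(keys, key=semi_ordered_token)
groups = [k.split(":")[0] for k in ring]

for k in ring:
    print(k)
```

Under this reading the cities inside one country come out in hash order, not alphabetical order. Using the raw remainder of the key as the second component instead of its MD5 would give alphabetical order within a group. Either way, a prefix with many keys occupies one contiguous token range, which is the imbalance mentioned above.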