If a user has millions of followers, are there millions of iterations? (ref Twissandra)

2010-04-15 Thread Allen He
Hello folks,

When Twissandra (the Twitter clone example for
Cassandra) posts a tweet, it iterates over all of the followers to insert the
tweet_id into their timelines (see the loop at the end):

def save_tweet(tweet_id, user_id, tweet):
    """
    Saves the tweet record.
    """
    # Generate a timestamp, and put it in the tweet record
    raw_ts = int(time.time() * 1e6)
    tweet['_ts'] = raw_ts
    ts = _long(raw_ts)
    encoded = dict(((k, json.dumps(v)) for k, v in tweet.iteritems()))
    # Insert the tweet, then into the user's timeline, then into the public one
    TWEET.insert(str(tweet_id), encoded)
    USERLINE.insert(str(user_id), {ts: str(tweet_id)})
    USERLINE.insert(PUBLIC_USERLINE_KEY, {ts: str(tweet_id)})
    # Get the user's followers, and insert the tweet into all of their streams
    follower_ids = [user_id] + get_follower_ids(user_id)
    for follower_id in follower_ids:
        TIMELINE.insert(str(follower_id), {ts: str(tweet_id)})

My question is: if a user has millions of followers, are there millions
of iterations?

Sorry for my English :)

Thanks!


Re: Time-series data model

2010-04-15 Thread Ilya Maykov
Hi Jean-Pierre,

I'm investigating using Cassandra for a very similar use case; maybe
we can chat and compare notes sometime. But basically, I think you
want to pull the metric name into the row key and use a simple CF
instead of an SCF. So, your example:

"my_server_1": {
   "cpu_usage": {
   {ts: 1271248215, value: 87 },
   {ts: 1271248220, value: 34 },
   {ts: 1271248225, value: 23 },
   {ts: 1271248230, value: 49 }
   },
   "ping_response": {
   {ts: 1271248201, value: 0.345 },
   {ts: 1271248211, value: 0.423 },
   {ts: 1271248221, value: 0.311 },
   {ts: 1271248232, value: 0.582 }
   }
}

becomes:

"my_server_1:cpu_usage" : {
   {ts: 1271248215, value: 87 },
   {ts: 1271248220, value: 34 },
   {ts: 1271248225, value: 23 },
   {ts: 1271248230, value: 49 }
},
"my_server_1:ping_response": {
   {ts: 1271248201, value: 0.345 },
   {ts: 1271248211, value: 0.423 },
   {ts: 1271248221, value: 0.311 },
   {ts: 1271248232, value: 0.582 }
}

This keeps your rows smaller and row count higher (which I think will
load-balance better). It also avoids large super columns, which you
don't want because columns inside a super column are not indexed so
accessing them can be expensive.
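For concreteness, a minimal sketch of writing one sample under this
flattened layout (0.6 Thrift API from Python; the 'Monitoring'/'Metrics'
keyspace and column family names and the `client` connection are
placeholders, not from Ilya's setup):

import struct
import time

from cassandra.ttypes import ColumnPath, ConsistencyLevel

# Row key carries device and metric; the column name is the sample time
# packed as an 8-byte big-endian long (LongType comparator).
sample_time = int(time.time())
path = ColumnPath(column_family='Metrics',
                  column=struct.pack('>q', sample_time))
client.insert('Monitoring', 'my_server_1:cpu_usage', path,
              str(87),                    # the sampled value
              int(time.time() * 1e6),     # write timestamp in microseconds
              ConsistencyLevel.ONE)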

The time-based sharding will be necessary eventually if you plan to
keep your data forever, because without it your rows will get so big
that they don't fit in memory and crash Cassandra during a compaction.
But realistically, Cassandra can support A LOT of columns and pretty
big rows. Suppose you sample your stats every minute and use
"device-id:metric-name" as the row key. Google calculator claims there
are ~526k minutes in a year, so if you keep high-resolution data
forever you would only have half a million columns per row after 1
year. Assuming 128 bytes per data point (which seems way high for a
(long, double, long) 3-tuple), that's only 64MB of data per row. If
you thin out older, less relevant data, you could last a lot longer
before you have to split rows. Furthermore, splitting old data off
into another row is easy because you know old data is not being
modified at the time of the split, so you don't have to worry about
the RMW problem or external locking of any kind. So I would start
without time-based sharding instead of over-engineering for it; it
makes everything else much simpler.
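
A quick back-of-the-envelope check of those numbers (a sketch in Python;
the 128-byte figure is the deliberately generous estimate from above):

MINUTES_PER_YEAR = 365.25 * 24 * 60   # ~525,960, i.e. the "~526k minutes"
BYTES_PER_POINT = 128                 # generous for a (long, double, long) 3-tuple

columns = int(MINUTES_PER_YEAR)       # columns in one row after one year
row_size_mb = columns * BYTES_PER_POINT / float(2 ** 20)
print columns, row_size_mb            # ~525960 columns, ~64 MB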

-- Ilya

P.S. Credit for the above viewpoint goes to Ryan King, who made this
argument to me in a discussion we had recently about this exact
problem.

2010/4/14 Ted Zlatanov :
> On Wed, 14 Apr 2010 15:02:29 +0200 "Jean-Pierre Bergamin"  
> wrote:
>
> JB> The metrics are stored together with a timestamp. The queries we want to
> JB> perform are:
> JB>  * The last value of a specific metric of a device
> JB>  * The values of a specific metric of a device between two timestamps t1 and t2
>
> Make your key "devicename-metricname-YYYYMMDD-HHMM" (with whatever time
> sharding makes sense to you; I use UTC by-hours and by-day in my
> environment).  Then your supercolumn is the collection time as a
> LongType and your columns inside the supercolumn can express the metric
> in detail (collector agent, detailed breakdown, etc.).
>
> If you want your clients to discover the available metrics, you may need
> to keep an external index.  But from your spec that doesn't seem necessary.
>
> Ted
>
>


Row key: string or binary (byte[])?

2010-04-15 Thread Roland Hänel
Is there any effort ongoing to make the row key a binary (byte[]) instead of
a string? In the current cassandra.thrift file (0.6.0), I find:

const string VERSION = "2.1.0"
[...]
struct KeySlice {
    1: required *string* key,
    2: required list<ColumnOrSuperColumn> columns,
}

while on the current (?) SVN
https://svn.apache.org/repos/asf/cassandra/trunk/interface/cassandra.thrift
it reads:

const string VERSION = "4.0.0"
[...]
struct KeySlice {
    1: required *binary* key,
    2: required list<ColumnOrSuperColumn> columns,
}

Thanks for enlightening me. :-)

Greetings,
Roland


Re: If a user has millions of followers, are there millions of iterations? (ref Twissandra)

2010-04-15 Thread gabriele renzi
On Thu, Apr 15, 2010 at 9:56 AM, Allen He  wrote:
> Hello folks,
>
> When Twissandra (the Twitter clone example for Cassandra) posts a tweet, it
> iterates over all of the followers to insert the tweet_id into their timelines (see


> for follower_id in follower_ids:
>     TIMELINE.insert(str(follower_id), {ts: str(tweet_id)})
>
>
>
> My question is: if a user has millions of followers, are there millions of
> iterations?

I never looked at the Twissandra code, but it looks like that. It is
probably a trade-off: either you store each tweet once and, when a user
wants to read their timeline, fetch the tweets of everyone they follow
(putting the burden on read time), or you do it like this and put the
burden on the write. Since writes are cheap in Cassandra, and reads are
more frequent, this seems to make sense.


PS
  I think it should use batch_mutate anyway so that only one operation
is sent over the network
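
For illustration, a hedged sketch of that fan-out loop as a single
batch_mutate call (0.6 Thrift API; the 'Twissandra'/'Timeline' keyspace
and column family names, and the variables carried over from the
save_tweet() snippet above, are assumptions). Note it saves round trips,
not server-side work; each follower row still gets its own insert:

from cassandra.ttypes import (Column, ColumnOrSuperColumn, Mutation,
                              ConsistencyLevel)

# One Mutation reused for every follower row: column name is the packed
# timestamp `ts`, value is the tweet id, as in the save_tweet() code.
m = Mutation(column_or_supercolumn=ColumnOrSuperColumn(
    column=Column(name=ts, value=str(tweet_id), timestamp=raw_ts)))

# key -> {column family -> [mutations]}, all shipped in one RPC.
mutation_map = dict((str(fid), {'Timeline': [m]}) for fid in follower_ids)
client.batch_mutate('Twissandra', mutation_map, ConsistencyLevel.ONE)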


AssertionError: DecoratedKey(...) != DecoratedKey(...)

2010-04-15 Thread Ran Tavory
When restarting one of the nodes in my cluster I found this error in the
log. What does this mean?

 INFO [GC inspection] 2010-04-15 05:03:04,898 GCInspector.java (line 110) GC
for ConcurrentMarkSweep: 712 ms, 11149016 reclaimed leaving 442336680 used;
max is 4432068608
ERROR [HINTED-HANDOFF-POOL:1] 2010-04-15 05:03:17,948
DebuggableThreadPoolExecutor.java (line 94) Error in executor futuretask
java.util.concurrent.ExecutionException: java.lang.AssertionError:
DecoratedKey(163143070370570938845670096830182058073,
1K2i35+B8RuuRDP7Gwz3Xw==) !=
DecoratedKey(163143368384879375649994309361429628039,
4k54mGvj7JoT5rBH68K+9A==) in
/outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at
org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.AssertionError:
DecoratedKey(163143070370570938845670096830182058073,
1K2i35+B8RuuRDP7Gwz3Xw==) !=
DecoratedKey(163143368384879375649994309361429628039,
4k54mGvj7JoT5rBH68K+9A==) in
/outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
at
org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(SSTableSliceIterator.java:127)
at
org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(SSTableSliceIterator.java:59)
at
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:63)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:830)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:750)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:719)
at
org.apache.cassandra.db.HintedHandOffManager.sendMessage(HintedHandOffManager.java:122)
at
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:250)
at
org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:80)
at
org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:280)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
... 2 more


Re: Time-series data model

2010-04-15 Thread Jean-Pierre Bergamin

Am 14.04.2010 15:22, schrieb Ted Zlatanov:

On Wed, 14 Apr 2010 15:02:29 +0200 "Jean-Pierre Bergamin"  
wrote:

JB>  The metrics are stored together with a timestamp. The queries we want to
JB>  perform are:
JB>   * The last value of a specific metric of a device
JB>   * The values of a specific metric of a device between two timestamps t1 and t2

Make your key "devicename-metricname-YYYYMMDD-HHMM" (with whatever time
sharding makes sense to you; I use UTC by-hours and by-day in my
environment).  Then your supercolumn is the collection time as a
LongType and your columns inside the supercolumn can express the metric
in detail (collector agent, detailed breakdown, etc.).
   
Just for my understanding. What is "time sharding"? I couldn't find an
explanation anywhere. Do you mean that the time-series data is rolled
up into 5-minute, 1-hour, 1-day etc. slices?


So this would be defined as:
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="LongType" />


So when I want to read all values of one metric between two timestamps
t0 and t1, I'd have to read the supercolumns that match a key range
(device1:metric1:t0 - device1:metric1:t1) and then all the supercolumns
for each key?



Regards
James


inserting rows in columns inside a supercolumn

2010-04-15 Thread Julio Carlos Barrera Juez
Hi all,

I'm working with Cassandra 0.5 and the Thrift API. I have a simple question:

I want to insert values into columns inside a supercolumn, like this
(shown without timestamps):

SuperColumnNameA ==> keyA valueA ==> columnB ==> key1 value1
                                                 key2 value2
                                                 key3 value3
                                     columnC ==> key4 value4
                                                 key5 value5
                     keyD valueD ==> columnE ==> *key6 value6*
                                                 *key7 value7*
                                     columnF ==> *key8 value8*
                                                 *key9 value9*

For instance, I want to insert only key-values 6, 7, 8 and 9, but when I try
it, I destroy all the other values. What is the correct way to do it? I
have tried obtaining the supercolumn and adding more values, batch_insert(),
etc., but I always fail.

Thank you.


Re: AssertionError: DecoratedKey(...) != DecoratedKey(...)

2010-04-15 Thread Gary Dusbabek
Ran,

It looks like you're seeing
https://issues.apache.org/jira/browse/CASSANDRA-866.  It's fixed in
0.6.1.

Gary

On Thu, Apr 15, 2010 at 04:06, Ran Tavory  wrote:
> When restarting one of the nodes in my cluster I found this error in the
> log. What does this mean?
>
>  INFO [GC inspection] 2010-04-15 05:03:04,898 GCInspector.java (line 110) GC
> for ConcurrentMarkSweep: 712 ms, 11149016 reclaimed leaving 442336680 used;
> max is 4432068608
> ERROR [HINTED-HANDOFF-POOL:1] 2010-04-15 05:03:17,948
> DebuggableThreadPoolExecutor.java (line 94) Error in executor futuretask
> java.util.concurrent.ExecutionException: java.lang.AssertionError:
> DecoratedKey(163143070370570938845670096830182058073,
> 1K2i35+B8RuuRDP7Gwz3Xw==) !=
> DecoratedKey(163143368384879375649994309361429628039,
> 4k54mGvj7JoT5rBH68K+9A==) in
> /outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
>         at
> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>         at
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.AssertionError:
> DecoratedKey(163143070370570938845670096830182058073,
> 1K2i35+B8RuuRDP7Gwz3Xw==) !=
> DecoratedKey(163143368384879375649994309361429628039,
> 4k54mGvj7JoT5rBH68K+9A==) in
> /outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
>         at
> org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(SSTableSliceIterator.java:127)
>         at
> org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(SSTableSliceIterator.java:59)
>         at
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:63)
>         at
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:830)
>         at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:750)
>         at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:719)
>         at
> org.apache.cassandra.db.HintedHandOffManager.sendMessage(HintedHandOffManager.java:122)
>         at
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:250)
>         at
> org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:80)
>         at
> org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:280)
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         ... 2 more
>


Re: Row key: string or binary (byte[])?

2010-04-15 Thread Gary Dusbabek
2010/4/15 Roland Hänel :
> Is there any effort ongoing to make the row key a binary (byte[]) instead of
> a string?

Yes. It went into trunk last night. Please see
https://issues.apache.org/jira/browse/CASSANDRA-767.

Gary.

> In the current cassandra.thrift file (0.6.0), I find:
>
> const string VERSION = "2.1.0"
> [...]
> struct KeySlice {
>     1: required string key,
>     2: required list<ColumnOrSuperColumn> columns,
> }
>
> while on the current (?) SVN
> https://svn.apache.org/repos/asf/cassandra/trunk/interface/cassandra.thrift
> it reads:
>
> const string VERSION = "4.0.0"
> [...]
> struct KeySlice {
>     1: required binary key,
>     2: required list<ColumnOrSuperColumn> columns,
> }
>
> Thanks for enlightening me. :-)
>
> Greetings,
> Roland
>
>


How to implement TOP TEN in Cassandra

2010-04-15 Thread Allen He
Hi, all

How do I implement a *TOP TEN* list in Cassandra?

For example, the *top ten stories on Digg.com*.

How should I model it?

Thanks


Get super-columns using SimpleCassie

2010-04-15 Thread Yésica Rey

I'm using SimpleCassie as my Cassandra client.
I have a question: can I get all the super-columns that are in one
column family?
If yes, how can I do it?


Regards!


Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-04-15 Thread Jonathan Ellis
sounds like https://issues.apache.org/jira/browse/THRIFT-347

On Wed, Apr 14, 2010 at 11:58 PM, richard yao  wrote:
> I am having a try on cassandra, and I use php to access cassandra by thrift
> API.
> I got an error like this:
>     TException:  Error: TSocket: timed out reading 1024 bytes from
> 10.1.1.27:9160
> What's wrong?
> Thanks.


Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-04-15 Thread richard yao
Thank you!


Re: AssertionError: DecoratedKey(...) != DecoratedKey(...)

2010-04-15 Thread Ran Tavory
yes, this looks like the same issue, thanks Gary.

Other than seeing the errors in the log I haven't seen any other
irregularities (maybe there are, but they haven't surfaced). Does this
assertion mean data corruption, or something else that's worth waiting for
0.6.1 for?

On Thu, Apr 15, 2010 at 2:00 PM, Gary Dusbabek  wrote:

> Ran,
>
> It looks like you're seeing
> https://issues.apache.org/jira/browse/CASSANDRA-866.  It's fixed in
> 0.6.1.
>
> Gary
>
> On Thu, Apr 15, 2010 at 04:06, Ran Tavory  wrote:
> > When restarting one of the nodes in my cluster I found this error in the
> > log. What does this mean?
> >
> >  INFO [GC inspection] 2010-04-15 05:03:04,898 GCInspector.java (line 110)
> GC
> > for ConcurrentMarkSweep: 712 ms, 11149016 reclaimed leaving 442336680
> used;
> > max is 4432068608
> > ERROR [HINTED-HANDOFF-POOL:1] 2010-04-15 05:03:17,948
> > DebuggableThreadPoolExecutor.java (line 94) Error in executor futuretask
> > java.util.concurrent.ExecutionException: java.lang.AssertionError:
> > DecoratedKey(163143070370570938845670096830182058073,
> > 1K2i35+B8RuuRDP7Gwz3Xw==) !=
> > DecoratedKey(163143368384879375649994309361429628039,
> > 4k54mGvj7JoT5rBH68K+9A==) in
> > /outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
> > at
> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> > at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> > at
> >
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > at java.lang.Thread.run(Thread.java:619)
> > Caused by: java.lang.AssertionError:
> > DecoratedKey(163143070370570938845670096830182058073,
> > 1K2i35+B8RuuRDP7Gwz3Xw==) !=
> > DecoratedKey(163143368384879375649994309361429628039,
> > 4k54mGvj7JoT5rBH68K+9A==) in
> > /outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
> > at
> >
> org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(SSTableSliceIterator.java:127)
> > at
> >
> org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(SSTableSliceIterator.java:59)
> > at
> >
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:63)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:830)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:750)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:719)
> > at
> >
> org.apache.cassandra.db.HintedHandOffManager.sendMessage(HintedHandOffManager.java:122)
> > at
> >
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:250)
> > at
> >
> org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:80)
> > at
> >
> org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:280)
> > at
> > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> > at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> > at
> > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > ... 2 more
> >
>


Re: AssertionError: DecoratedKey(...) != DecoratedKey(...)

2010-04-15 Thread Gary Dusbabek
No data corruption.  There was a bug in the way that the index was
scanned that was manifesting itself when the index got bigger
than 2GB.

Gary.


On Thu, Apr 15, 2010 at 08:03, Ran Tavory  wrote:
> yes, this looks like the same issue, thanks Gary.
> Other than seeing the errors in the log I haven't seen any other
> irregularities (maybe there are, but they haven't surfaced). Does this
> assertion mean data corruption, or something else that's worth waiting for
> 0.6.1 for?
>
> On Thu, Apr 15, 2010 at 2:00 PM, Gary Dusbabek  wrote:
>>
>> Ran,
>>
>> It looks like you're seeing
>> https://issues.apache.org/jira/browse/CASSANDRA-866.  It's fixed in
>> 0.6.1.
>>
>> Gary
>>
>> On Thu, Apr 15, 2010 at 04:06, Ran Tavory  wrote:
>> > When restarting one of the nodes in my cluster I found this error in the
>> > log. What does this mean?
>> >
>> >  INFO [GC inspection] 2010-04-15 05:03:04,898 GCInspector.java (line
>> > 110) GC
>> > for ConcurrentMarkSweep: 712 ms, 11149016 reclaimed leaving 442336680
>> > used;
>> > max is 4432068608
>> > ERROR [HINTED-HANDOFF-POOL:1] 2010-04-15 05:03:17,948
>> > DebuggableThreadPoolExecutor.java (line 94) Error in executor futuretask
>> > java.util.concurrent.ExecutionException: java.lang.AssertionError:
>> > DecoratedKey(163143070370570938845670096830182058073,
>> > 1K2i35+B8RuuRDP7Gwz3Xw==) !=
>> > DecoratedKey(163143368384879375649994309361429628039,
>> > 4k54mGvj7JoT5rBH68K+9A==) in
>> > /outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>> >         at
>> >
>> > org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
>> >         at
>> >
>> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
>> >         at
>> >
>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >         at java.lang.Thread.run(Thread.java:619)
>> > Caused by: java.lang.AssertionError:
>> > DecoratedKey(163143070370570938845670096830182058073,
>> > 1K2i35+B8RuuRDP7Gwz3Xw==) !=
>> > DecoratedKey(163143368384879375649994309361429628039,
>> > 4k54mGvj7JoT5rBH68K+9A==) in
>> > /outbrain/cassandra/data/outbrain/DocumentMapping-305-Data.db
>> >         at
>> >
>> > org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(SSTableSliceIterator.java:127)
>> >         at
>> >
>> > org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(SSTableSliceIterator.java:59)
>> >         at
>> >
>> > org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:63)
>> >         at
>> >
>> > org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:830)
>> >         at
>> >
>> > org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:750)
>> >         at
>> >
>> > org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:719)
>> >         at
>> >
>> > org.apache.cassandra.db.HintedHandOffManager.sendMessage(HintedHandOffManager.java:122)
>> >         at
>> >
>> > org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:250)
>> >         at
>> >
>> > org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:80)
>> >         at
>> >
>> > org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:280)
>> >         at
>> > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>> >         at
>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >         at
>> >
>> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >         ... 2 more
>> >
>
>


Re: Time-series data model

2010-04-15 Thread Ted Zlatanov
On Thu, 15 Apr 2010 11:27:47 +0200 Jean-Pierre Bergamin  
wrote: 

JB> Am 14.04.2010 15:22, schrieb Ted Zlatanov:
>> On Wed, 14 Apr 2010 15:02:29 +0200 "Jean-Pierre Bergamin"  
>> wrote:
>> 
JB> The metrics are stored together with a timestamp. The queries we want to
JB> perform are:
JB> * The last value of a specific metric of a device
JB> * The values of a specific metric of a device between two timestamps t1 and
JB> t2
>> 
>> Make your key "devicename-metricname-YYYYMMDD-HHMM" (with whatever time
>> sharding makes sense to you; I use UTC by-hours and by-day in my
>> environment).  Then your supercolumn is the collection time as a
>> LongType and your columns inside the supercolumn can express the metric
>> in detail (collector agent, detailed breakdown, etc.).
>> 
JB> Just for my understanding. What is "time sharding"? I couldn't find an
JB> explanation anywhere. Do you mean that the time-series data is rolled
JB> up into 5-minute, 1-hour, 1-day etc. slices?

Yes.  The usual meaning of "shard" in RDBMS world is to segment your
database by some criteria, e.g. US vs. Europe in Amazon AWS because
their data centers are laid out so.  I was taking a linguistic shortcut
to mean "break down your rows by some convenient criteria."  You can
actually set up your Partitioner in Cassandra to literally shard your
keyspace rows based on the key, but I just meant "slice" in my note.

JB> So this would be defined as:
JB> <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="LongType" />

JB> So when I want to read all values of one metric between two timestamps
JB> t0 and t1, I'd have to read the supercolumns that match a key range
JB> (device1:metric1:t0 - device1:metric1:t1) and then all the
JB> supercolumns for each key?

Yes.  This is a single multiget if you can construct the key range
explicitly.  Cassandra loads a lot of this in memory already and filters
it after the fact; that's why it pays to slice your keys and to stitch
them together on the client side if you have to go across a time
boundary.  You'll also get better key load balancing with deeper slicing
if you use the randomizing partitioner.

In the result set, you'll get each matching supercolumn with all the
columns inside it.  You may have to page through supercolumns.
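
As a concrete sketch of that read path (0.6 Thrift API; `client`, the
'Monitoring'/'Metrics' keyspace and column family names, and the hourly
key format are assumptions): enumerate the keys in the range explicitly,
then fetch the LongType supercolumns with one multiget_slice:

import struct
import time

from cassandra.ttypes import (ColumnParent, SlicePredicate, SliceRange,
                              ConsistencyLevel)

def hourly_keys(device, metric, t0, t1):
    """One row key per UTC hour between t0 and t1 (epoch seconds)."""
    start = t0 - t0 % 3600
    return ['%s-%s-%s' % (device, metric,
                          time.strftime('%Y%m%d-%H00', time.gmtime(t)))
            for t in range(start, t1 + 1, 3600)]

keys = hourly_keys('device1', 'metric1', t0, t1)
# Assuming supercolumn names are collection times (epoch seconds) packed
# as LongType; slice only the wanted window within each row.
predicate = SlicePredicate(slice_range=SliceRange(
    start=struct.pack('>q', t0), finish=struct.pack('>q', t1),
    reversed=False, count=10000))
result = client.multiget_slice('Monitoring', keys,
                               ColumnParent(column_family='Metrics'),
                               predicate, ConsistencyLevel.ONE)
# result maps each row key to its matching supercolumns; stitch and page
# client-side as described above.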

Ted



Re: SuperColumns

2010-04-15 Thread Ted Zlatanov
On Wed, 14 Apr 2010 23:34:52 -0700 Vijay  wrote: 

V> On Wed, Apr 14, 2010 at 10:28 PM, Christian Torres wrote:

>> I'm defining a ColumnFamily (Table) of type Super. Is it possible to have a
>> SuperColumn inside another SuperColumn, or can SuperColumns only have normal
>> columns?

V> Yes a super column can only have columns in it.

Jonathan Ellis indicated that in the future we may get nested
SuperColumns but there's no ETA on this functionality AFAIK.

Ted



Re: SuperColumns

2010-04-15 Thread Christian Torres
Ok, thanks both

2010/4/15 Ted Zlatanov 

> On Wed, 14 Apr 2010 23:34:52 -0700 Vijay  wrote:
>
> V> On Wed, Apr 14, 2010 at 10:28 PM, Christian Torres  >wrote:
>
> >> I'm defining a ColumnFamily (Table) of type Super. Is it possible to have a
> >> SuperColumn inside another SuperColumn, or can SuperColumns only have
> >> normal columns?
>
> V> Yes a super column can only have columns in it.
>
> Jonathan Ellis indicated that in the future we may get nested
> SuperColumns but there's no ETA on this functionality AFAIK.
>
> Ted
>
>


-- 
Christian Torres * Web Developer * Guegue.com *
Cell: +505 84 65 92 62 * Loving of the Programming


Re: How to implement TOP TEN in Cassandra

2010-04-15 Thread Pablo Viojo
Your question is too general to give you an appropriate answer. Can you
elaborate a little more?

Regards,

Pablo Viojo
Project lead on data storage et al.
http://www.needish.com | pa...@needish.com

http://twitter.com/tiopaul (@tiopaul) | LinkedIn profile:
http://cl.linkedin.com/in/pviojo


On Thu, Apr 15, 2010 at 7:39 AM, Allen He  wrote:

> Hi , all
>
> How do I implement a *TOP TEN* list in Cassandra?
>
> For example, the *top ten stories on Digg.com*.
>
> How should I model it?
>
> Thanks
>


Re: How to implement TOP TEN in Cassandra

2010-04-15 Thread Jesse McConnell
http://arin.me/code/wtf-is-a-supercolumn-cassandra-data-model

if memory serves that article explains it
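
In short, one common pattern for a top-N list (a hedged sketch, not taken
from the linked article; the 'Digg'/'TopStories' names are placeholders):
keep one row whose column names are scores under a LongType comparator and
read a reversed slice of ten. Ties would need a tiebreaker appended to the
column name.

import struct
import time

from cassandra.ttypes import (ColumnPath, ColumnParent, SlicePredicate,
                              SliceRange, ConsistencyLevel)

def record_score(client, story_id, score):
    # Column name = packed score, value = story id.
    path = ColumnPath(column_family='TopStories',
                      column=struct.pack('>q', score))
    client.insert('Digg', 'top', path, story_id,
                  int(time.time() * 1e6), ConsistencyLevel.ONE)

def top_ten(client):
    # Reversed slice: highest scores first.
    predicate = SlicePredicate(slice_range=SliceRange(
        start='', finish='', reversed=True, count=10))
    return client.get_slice('Digg', 'top', ColumnParent('TopStories'),
                            predicate, ConsistencyLevel.ONE)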

cheers,
jesse

--
jesse mcconnell
jesse.mcconn...@gmail.com



On Thu, Apr 15, 2010 at 09:36, Pablo Viojo  wrote:
> Your question is too general to give you an appropriate answer. Can you
> elaborate a little more?
>
> Regards,
> Pablo Viojo
> Project lead on data storage et al.
> http://www.needish.com | pa...@needish.com
>
> http://twitter.com/tiopaul (@tiopaul) | LinkedIn profile:
> http://cl.linkedin.com/in/pviojo
>
>
> On Thu, Apr 15, 2010 at 7:39 AM, Allen He  wrote:
>>
>> Hi, all
>> How do I implement a TOP TEN list in Cassandra?
>> For example, the top ten stories on Digg.com.
>> How should I model it?
>> Thanks
>


framed transport

2010-04-15 Thread Lee Parker
What is the benefit of moving to framed transport as opposed to buffered
transport?

Lee Parker
l...@spredfast.com

[image: Spredfast]


RackAware and replication strategy

2010-04-15 Thread Ran Tavory
I'm reading this on this page
http://wiki.apache.org/cassandra/ArchitectureInternals :

 AbstractReplicationStrategy controls what nodes get secondary, tertiary,
> etc. replicas of each key range. Primary replica is always determined by the
> token ring (in TokenMetadata) but you can do a lot of variation with the
> others. RackUnaware just puts replicas on the next N-1 nodes in the ring.
> RackAware puts the first non-primary replica in the next node in the ring in
> ANOTHER data center than the primary; then the remaining replicas in the
> same as the primary.


So I just want to make sure I got this right and that documentation is up to
date.
I have two data centers and rack-aware.

When replication factor is 2: is it always the case that the primary replica
goes to one DC and the second replica to the second DC?
When replication factor is 3: First replica in DC1, second in DC2 and third
in DC1
When replication factor is 4: First replica in DC1, second in DC2, third in
DC1, fourth in DC1 etc

If I have 4 hosts in each DC, which replication factors make sense?
N=1 - When I don't care about losing data, cool
N=2 - When I want to make sure each DC has a copy; useful for fast local
access, and allows recovery if only one host is down.
N=3 - If I want to make sure each DC has a copy, plus recovery can be made
faster in certain cases, and it's more resilient to two hosts down.
N=4 - Like N=3 but even more resilient, etc.

Say I want to have two replicas in each DC, can this be done?
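
A toy model of the quoted placement rule may help reason about these
cases (illustration only, not Cassandra's actual code; see the
DatacenterShardStrategy pointer in the reply further down for real
per-DC replica counts):

def rack_aware_replicas(ring, dc_of, primary_index, n):
    """Replicas for the range owned by ring[primary_index], per the rule
    quoted above: primary first, then the next node in ring order that is
    in ANOTHER data center, then subsequent nodes in the primary's DC."""
    primary = ring[primary_index]
    rest = [ring[(primary_index + i) % len(ring)] for i in range(1, len(ring))]
    remote = [h for h in rest if dc_of[h] != dc_of[primary]]
    local = [h for h in rest if dc_of[h] == dc_of[primary]]
    return ([primary] + remote[:1] + local)[:n]

ring = ['a1', 'b1', 'a2', 'b2', 'a3', 'b3', 'a4', 'b4']  # two DCs interleaved
dc_of = dict((h, h[0]) for h in ring)
print rack_aware_replicas(ring, dc_of, 0, 3)  # ['a1', 'b1', 'a2']: DC1, DC2, DC1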


Re: framed transport

2010-04-15 Thread Eric Evans
On Thu, 2010-04-15 at 09:49 -0500, Lee Parker wrote:
> What is the benefit of moving to framed transport as opposed to
> buffered transport? 

The framed transport probably provides better (smarter) buffering, but
it was added in Thrift to support asynchronous servers. In a perfect
world, there wouldn't even be a choice; this is a Thrift implementation
detail that leaked into the APIs.

The more relevant question here is interoperability, since not all
transports are covered. For example, if you wanted to use Python and
Twisted for clients, you're going to have to use the framed transport on
the server. But, if you've enabled framing on the server, you will not
be able to use C# clients (last I checked, there was no framed transport
for C#).

So, if you don't need to make a choice based on client interoperability,
then give them both a try and see which one works best for you (but I'm
guessing that you won't notice a difference).
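
For reference, the client-side switch in Thrift's Python bindings looks
like this (a sketch; the host and port are placeholders, and the choice
must agree with the server's setting):

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from cassandra import Cassandra   # Thrift-generated module

socket = TSocket.TSocket('localhost', 9160)
# Pick ONE, matching the server:
transport = TTransport.TFramedTransport(socket)
# transport = TTransport.TBufferedTransport(socket)
client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()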


-- 
Eric Evans
eev...@rackspace.com



Re: framed transport

2010-04-15 Thread Miguel Verde
On Thu, Apr 15, 2010 at 10:22 AM, Eric Evans  wrote:

> But, if you've enabled framing on the server, you will not
> be able to use C# clients (last I checked, there was no framed transport
> for C#).


There *are* many clients that don't have framed transports, but the C#
client had it added in November:
https://issues.apache.org/jira/browse/THRIFT-210


timestamp not found

2010-04-15 Thread Lee Parker
We are currently migrating about 70G of data from mysql to cassandra.  I am
occasionally getting the following error:

Required field 'timestamp' was not found in serialized data! Struct:
Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
timestamp:0)

The loop which is building out the mutation map for the batch_mutate call is
adding a timestamp to each column.  I have verified that the timestamp is
there for several calls, and I feel like if the logic were bad, I would see
the error more frequently.  Does anyone have suggestions as to what may be
causing this?

Lee Parker
l...@spredfast.com

[image: Spredfast]


Re: inserting rows in columns inside a supercolumn

2010-04-15 Thread Miguel Verde
Just to nitpick your representation a little bit: columnB/etc. are
supercolumnB/etc., key1/etc. are column1/etc., and you can probably
omit the valueA/valueD designations entirely; it would still be understood.

Columns in Cassandra always have timestamps; you can't omit them.

Can you post a snippet of the code you are using and the error you get?
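
As to the original question, a minimal sketch (0.5 Thrift API from
Python; the 'Keyspace1'/'SuperCF' names and `client` connection are
placeholders): inserting individual columns under a super column writes
only those columns, so the siblings survive.

import time

from cassandra.ttypes import ColumnPath, ConsistencyLevel

ts = int(time.time() * 1e6)   # a timestamp is required on every column
for name, value in [('key6', 'value6'), ('key7', 'value7')]:
    path = ColumnPath(column_family='SuperCF',
                      super_column='columnE', column=name)
    # Writes only this column under row keyD / super column columnE,
    # leaving the other columns under columnE intact.
    client.insert('Keyspace1', 'keyD', path, value, ts,
                  ConsistencyLevel.ONE)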

On Thu, Apr 15, 2010 at 5:02 AM, Julio Carlos Barrera Juez <
juliocar...@gmail.com> wrote:

> Hi all,
>
> I'm working with Cassandra 0.5 and Thrift API. I have a simple doubt:
>
> I want to insert a row in columns inside a supercolumn, like this (without
> timestamps):
>
> SuperColumnNameA ==> keyA valueA ==> columnB ==> key1 value1
>                                                  key2 value2
>                                                  key3 value3
>                                      columnC ==> key4 value4
>                                                  key5 value5
>                      keyD valueD ==> columnE ==> *key6 value6*
>                                                  *key7 value7*
>                                      columnF ==> *key8 value8*
>                                                  *key9 value9*
>
> For instance, I want to insert only key-values 6, 7, 8 and 9, but when I try
> it, I destroy all the other values. What is the correct way to do it? I
> have tried obtaining the supercolumn and adding more values, batch_insert(),
> etc., but I always fail.
>
> Thank you.
>


Re: timestamp not found

2010-04-15 Thread Mike Malone
Looks like the timestamp, in this case, is 0. Does Cassandra allow zero
timestamps? Could be a bug in Cassandra doing an implicit boolean coercion
in a conditional where it shouldn't.

Mike

On Thu, Apr 15, 2010 at 8:39 AM, Lee Parker  wrote:

> We are currently migrating about 70G of data from mysql to cassandra.  I am
> occasionally getting the following error:
>
> Required field 'timestamp' was not found in serialized data! Struct:
> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
> timestamp:0)
>
> The loop which is building out the mutation map for the batch_mutate call
> is adding a timestamp to each column.  I have verified that the time stamp
> is there for several calls and I feel like if the logic was bad, i would see
> the error more frequently.  Does anyone have suggestions as to what may be
> causing this?
>
> Lee Parker
> l...@spredfast.com
>
> [image: Spredfast]
>


Re: timestamp not found

2010-04-15 Thread Lee Parker
When I am verifying the columns in the mutation map before sending it to
cassandra, none of the timestamps are 0.  I have had a difficult time
recreating the error in a controlled environment so I can see the mutation
map that was actually sent.

Lee Parker
l...@spredfast.com

[image: Spredfast]
On Thu, Apr 15, 2010 at 10:45 AM, Mike Malone  wrote:

> Looks like the timestamp, in this case, is 0. Does Cassandra allow zero
> timestamps? Could be a bug in Cassandra doing an implicit boolean coercion
> in a conditional where it shouldn't.
>
> Mike
>
>
> On Thu, Apr 15, 2010 at 8:39 AM, Lee Parker  wrote:
>
>> We are currently migrating about 70G of data from mysql to cassandra.  I
>> am occasionally getting the following error:
>>
>> Required field 'timestamp' was not found in serialized data! Struct:
>> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
>> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
>> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
>> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
>> timestamp:0)
>>
>> The loop which is building out the mutation map for the batch_mutate call
>> is adding a timestamp to each column.  I have verified that the time stamp
>> is there for several calls and I feel like if the logic was bad, i would see
>> the error more frequently.  Does anyone have suggestions as to what may be
>> causing this?
>>
>> Lee Parker
>> l...@spredfast.com
>>
>> [image: Spredfast]
>>
>
>


Re: timestamp not found

2010-04-15 Thread Jonathan Ellis
Looks like you are using C++ and not setting the "isset" flag on the
timestamp field, so it's getting the default value for a Java long ("0").

If it works "most of the time" then possibly you are using a Thrift
connection from multiple threads at the same time, which is not safe.

On Thu, Apr 15, 2010 at 10:39 AM, Lee Parker  wrote:

> We are currently migrating about 70G of data from mysql to cassandra.  I am
> occasionally getting the following error:
>
> Required field 'timestamp' was not found in serialized data! Struct:
> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
> timestamp:0)
>
> The loop which is building out the mutation map for the batch_mutate call
> is adding a timestamp to each column.  I have verified that the time stamp
> is there for several calls and I feel like if the logic was bad, i would see
> the error more frequently.  Does anyone have suggestions as to what may be
> causing this?
>
> Lee Parker
> l...@spredfast.com
>
> [image: Spredfast]
>


Re: timestamp not found

2010-04-15 Thread Lee Parker
I'm actually using PHP.  I do have several PHP processes running, but each
one should have its own Thrift connection.

Lee Parker
l...@spredfast.com

[image: Spredfast]
On Thu, Apr 15, 2010 at 10:53 AM, Jonathan Ellis  wrote:

> Looks like you are using C++ and not setting the "isset" flag on the
> timestamp field, so it's getting the default value for a Java long ("0").
>
> If it works "most of the time" then possibly you are using a Thrift
> connection from multiple threads at the same time, which is not safe.
>
>
> On Thu, Apr 15, 2010 at 10:39 AM, Lee Parker  wrote:
>
>> We are currently migrating about 70G of data from mysql to cassandra.  I
>> am occasionally getting the following error:
>>
>> Required field 'timestamp' was not found in serialized data! Struct:
>> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
>> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
>> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
>> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
>> timestamp:0)
>>
>> The loop which is building out the mutation map for the batch_mutate call
>> is adding a timestamp to each column.  I have verified that the time stamp
>> is there for several calls and I feel like if the logic was bad, i would see
>> the error more frequently.  Does anyone have suggestions as to what may be
>> causing this?
>>
>> Lee Parker
>> l...@spredfast.com
>>
>> [image: Spredfast]
>>
>
>


Re: RackAware and replication strategy

2010-04-15 Thread Benjamin Black
Have a look at locator/DatacenterShardStrategy.java.

On Thu, Apr 15, 2010 at 8:16 AM, Ran Tavory  wrote:
> I'm reading this on this
> page http://wiki.apache.org/cassandra/ArchitectureInternals :
>>
>> AbstractReplicationStrategy controls what nodes get secondary, tertiary,
>> etc. replicas of each key range. Primary replica is always determined by the
>> token ring (in TokenMetadata) but you can do a lot of variation with the
>> others. RackUnaware just puts replicas on the next N-1 nodes in the ring.
>> RackAware puts the first non-primary replica in the next node in the ring in
>> ANOTHER data center than the primary; then the remaining replicas in the
>> same as the primary.
>
> So I just want to make sure I got this right and that documentation is up to
> date.
> I have two data centers and rack-aware.
> When replication factor is 2: is it always the case that the primary replica
> goes to one DC and the second replica to the second DC?
> When replication factor is 3: First replica in DC1, second in DC2 and third
> in DC1
> When replication factor is 4: First replica in DC1, second in DC2, third in
> DC1, fourth in DC1 etc
> If I have 4 hosts in each DC, which replication factors make sense?
> N=1 - When I don't care about losing data, cool
> N=2 - When I want to make sure each DC has a copy; useful for local fast
> access and allows recovery if only one host down.
> N=3 - If I want to make sure each DC has a copy plus recovery can be made
> faster in certain cases, and more resilient to two hosts down.
> N=4 - Like N=3 but even more resilient. etc
> Say I want to have two replicas in each DC, can this be done?
>


busy thread on IncomingStreamReader ?

2010-04-15 Thread Ingram Chen
Hi all,

We set up two nodes and simply set replication factor=2 for a test run.

After both nodes, say node A and node B, had been serving for several hours,
we found that node A always keeps ~300% CPU usage
(the other node is under 100% CPU, which is normal).

thread dump on "node A" shows that there are 3 busy threads related to
IncomingStreamReader:

==

"Thread-66" prio=10 tid=0x2aade4018800 nid=0x69e7 runnable
[0x4030a000]
   java.lang.Thread.State: RUNNABLE
at sun.misc.Unsafe.setMemory(Native Method)
at sun.nio.ch.Util.erase(Util.java:202)
at
sun.nio.ch.FileChannelImpl.transferFromArbitraryChannel(FileChannelImpl.java:560)
at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:603)
at
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)

"Thread-65" prio=10 tid=0x2aade4017000 nid=0x69e6 runnable
[0x4d44b000]
   java.lang.Thread.State: RUNNABLE
at sun.misc.Unsafe.setMemory(Native Method)
at sun.nio.ch.Util.erase(Util.java:202)
at
sun.nio.ch.FileChannelImpl.transferFromArbitraryChannel(FileChannelImpl.java:560)
at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:603)
at
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)

"Thread-62" prio=10 tid=0x2aade4014800 nid=0x4150 runnable
[0x4d34a000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileChannelImpl.size0(Native Method)
at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:309)
- locked <0x2aaac450dcd0> (a java.lang.Object)
at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:597)
at
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)

===

Has anyone experienced a similar issue?

environments:

OS   --- CentOS 5.4, Linux 2.6.18-164.15.1.el5 SMP x86_64 GNU/Linux
Java --- build 1.6.0_16-b01, Java HotSpot(TM) 64-Bit Server VM (build
14.2-b01, mixed mode)
Cassandra --- 0.6.0
Node configuration --- node A and node B; both nodes use node A as the seed
client --- Java Thrift clients pick one node randomly to do reads and writes.


-- 
Ingram Chen
online share order: http://dinbendon.net
blog: http://www.javaworld.com.tw/roller/page/ingramchen


Re: BMT flush on windows?

2010-04-15 Thread Sonny Heer
From jconsole, I go under
ColumnFamilyStores->CF1->Column1->Operations and click force
flush().

I'm getting an "Operation return value" dialog that says null, with an OK
button.  What am I doing wrong?


On Tue, Apr 13, 2010 at 3:12 PM, Jonathan Ellis  wrote:
> you have three options
>
> (a) connect with jconsole or another jmx client and invoke flush that way
> (b) run org.apache.cassandra.tools.NodeCmd manually
> (c) write a bat file for NodeCmd like the nodetool shell script in bin/
>
> On Tue, Apr 13, 2010 at 5:08 PM, Sonny Heer  wrote:
>> Is there any way to run a keyspace flush on a windows box?
>>
>


Re: BMT flush on windows?

2010-04-15 Thread Jonathan Ellis
probably because there is nothing to flush.

On Thu, Apr 15, 2010 at 11:53 AM, Sonny Heer  wrote:
> From the jconsole, I go under
> ColumnFamilyStores->CF1->Column1->Operations and clicked force
> flush().
>
> I'm getting a "Operation return value" null OK message box.  what am I
> doing wrong?
>
>
> On Tue, Apr 13, 2010 at 3:12 PM, Jonathan Ellis  wrote:
>> you have three options
>>
>> (a) connect with jconsole or another jmx client and invoke flush that way
>> (b) run org.apache.cassandra.tools.NodeCmd manually
>> (c) write a bat file for NodeCmd like the nodetool shell script in bin/
>>
>> On Tue, Apr 13, 2010 at 5:08 PM, Sonny Heer  wrote:
>>> Is there any way to run a keyspace flush on a windows box?
>>>
>>
>


Re: Recovery from botched compaction

2010-04-15 Thread Jonathan Ellis
On Tue, Apr 13, 2010 at 3:59 PM, Anthony Molinaro
 wrote:
> I actually got lucky and while it hovered in the 91-95% full, compaction
> finished and its now at 60%.  However, I still have around a dozen or so
> data files.  I thought 'nodeprobe compact' did a major compaction, and
> that a major compaction would shrink to one file?

2 possibilities, probably both of which are affecting you:

1. If there isn't enough disk space to compact everything, cassandra
will remove files from the to-compact list until it has room to do
what you asked it to do.  (But, you you can still run out of space if
you write enough data while the compaction happens.)

2. 0.5's minor compactions don't combine as many sstables as they
should automatically.  This is fixed in 0.6

> Okay, sounds good, I may leave it for the moment, as last time I tried
> any sort of move/decommision with 0.5.x I was unable to figure out if
> anything was happening, so I may just wait and revisit when I upgrade.

Yes, 0.5 sucks there.  0.6 is still a little opaque but you can at
least see what is happening if you know where to look:
http://wiki.apache.org/cassandra/Streaming

-Jonathan


Re: batch_mutate silently failing

2010-04-15 Thread Jonathan Ellis
Could you create a ticket for us to return an error message in this
situation?

-Jonathan

On Tue, Apr 13, 2010 at 4:24 PM, Lee Parker  wrote:

> nevermind.  I figured out what the problem was.  I was not putting the
> column inside a ColumnOrSuperColumn container.
>
>
> Lee Parker
> l...@spredfast.com
>
> [image: Spredfast]
> On Tue, Apr 13, 2010 at 4:19 PM, Lee Parker  wrote:
>
>> I upgraded my dev environment to 0.6.0 today in expectation of upgrading
>> our prod environment soon.  I am trying to rewrite some of our code to use
>> batch_mutate with the Thrift PHP library directly.  I'm not getting any
>> result back, not even an exception or failure message, but the data is never
>> showing up in the single node cassandra setup.  Here is a dump of my
>> mutation map:
>>
>> array(1) {
>>   ["testkey"]=>
>>   array(1) {
>> ["StreamItems"]=>
>> array(2) {
>>   [0]=>
>>   object(cassandra_Mutation)#156 (2) {
>> ["column_or_supercolumn"]=>
>> object(cassandra_Column)#157 (3) {
>>   ["name"]=>
>>   string(4) "test"
>>   ["value"]=>
>>   string(14) "this is a test"
>>   ["timestamp"]=>
>>   float(1271193181943.1)
>> }
>> ["deletion"]=>
>> NULL
>>   }
>>   [1]=>
>>   object(cassandra_Mutation)#158 (2) {
>> ["column_or_supercolumn"]=>
>> object(cassandra_Column)#159 (3) {
>>   ["name"]=>
>>   string(5) "test2"
>>   ["value"]=>
>>   string(19) "Another test column"
>>   ["timestamp"]=>
>>   float(1271193181943.2)
>> }
>> ["deletion"]=>
>> NULL
>>   }
>> }
>>   }
>> }
>>
>> When I pass this into client->batch_mutate, nothing seems to happen.  Any
>> ideas about what could be going on?  I have been able to insert data using
>> cassandra-cli without issue.
>>
>> Lee Parker
>> l...@spredfast.com
>>
>> [image: Spredfast]
>>
>
>


Re: batch_mutate silently failing

2010-04-15 Thread Lee Parker
The entire thing was completely my own fault.  I was making an invalid
request and, somewhere in the code, I was catching the exception and not
handling it at all.  So it only appeared to be silent when in reality it was
throwing a nice descriptive exception.
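
For the record, the shape batch_mutate expects, with each Column wrapped
in a ColumnOrSuperColumn (sketched here with the Python bindings; the PHP
classes are analogous, and the 'Keyspace1' name is a placeholder):

import time

from cassandra.ttypes import (Column, ColumnOrSuperColumn, Mutation,
                              ConsistencyLevel)

ts = int(time.time() * 1e6)   # an i64, not a float
mutation_map = {
    'testkey': {
        'StreamItems': [
            Mutation(column_or_supercolumn=ColumnOrSuperColumn(
                column=Column(name='test', value='this is a test',
                              timestamp=ts))),
        ],
    },
}
client.batch_mutate('Keyspace1', mutation_map, ConsistencyLevel.ONE)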

Lee Parker
l...@spredfast.com

[image: Spredfast]
On Thu, Apr 15, 2010 at 12:28 PM, Jonathan Ellis  wrote:

> Could you create a ticket for us to return an error message in this
> situation?
>
> -Jonathan
>
>
> On Tue, Apr 13, 2010 at 4:24 PM, Lee Parker  wrote:
>
>> nevermind.  I figured out what the problem was.  I was not putting the
>> column inside a ColumnOrSuperColumn container.
>>
>>
>> Lee Parker
>> l...@spredfast.com
>>
>> [image: Spredfast]
>> On Tue, Apr 13, 2010 at 4:19 PM, Lee Parker  wrote:
>>
>>> I upgraded my dev environment to 0.6.0 today in expectation of upgrading
>>> our prod environment soon.  I am trying to rewrite some of our code to use
>>> batch_mutate with the Thrift PHP library directly.  I'm not getting any
>>> result back, not even an exception or failure message, but the data is never
>>> showing up in the single node cassandra setup.  Here is a dump of my
>>> mutation map:
>>>
>>> array(1) {
>>>   ["testkey"]=>
>>>   array(1) {
>>> ["StreamItems"]=>
>>> array(2) {
>>>   [0]=>
>>>   object(cassandra_Mutation)#156 (2) {
>>> ["column_or_supercolumn"]=>
>>> object(cassandra_Column)#157 (3) {
>>>   ["name"]=>
>>>   string(4) "test"
>>>   ["value"]=>
>>>   string(14) "this is a test"
>>>   ["timestamp"]=>
>>>   float(1271193181943.1)
>>> }
>>> ["deletion"]=>
>>> NULL
>>>   }
>>>   [1]=>
>>>   object(cassandra_Mutation)#158 (2) {
>>> ["column_or_supercolumn"]=>
>>> object(cassandra_Column)#159 (3) {
>>>   ["name"]=>
>>>   string(5) "test2"
>>>   ["value"]=>
>>>   string(19) "Another test column"
>>>   ["timestamp"]=>
>>>   float(1271193181943.2)
>>> }
>>> ["deletion"]=>
>>> NULL
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> When I pass this into client->batch_mutate, nothing seems to happen.  Any
>>> ideas about what could be going on?  I have been able to insert data using
>>> cassandra-cli without issue.
>>>
>>> Lee Parker
>>> l...@spredfast.com
>>>
>>> [image: Spredfast]
>>>
>>
>>
>


Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-04-15 Thread Jonathan Ellis
You're right, to get those numbers on Debian something is very wrong.

Have you looked at
http://spyced.blogspot.com/2010/01/linux-performance-basics.html ?
What is the bottleneck on the linux machines?

With the kind of speed you are seeing I wouldn't be surprised if it is swapping.

-Jonathan

On Tue, Apr 13, 2010 at 11:38 PM, Heath Oderman  wrote:
> Hi,
> I wrote a few days ago and got a few good suggestions.  I'm still seeing
> dramatic differences between Cassandra 0.5.0 on OSX vs. Debian Linux.
> I've tried on Debian with the Sun JRE and the Open JDK with nearly identical
> results. I've tried a mix of hardware.
> Attached are some graphs I've produced of my results which show that in OSX,
> Cassandra takes longer with a greater load but is wicked fast (expected).
> In the SunJDK or Open JDK on Debian I get amazingly consistent time taken to
> do the writes, regardless of the load and the times are always ridiculously
> high.  It's insanely slow.
> I genuinely believe that I must be doing something very wrong in my Debian
> setups, but they are all vanilla installs, both 64 bit and 32 bit machines,
> 64bit and 32 bit installs.  Cassandra packs taken from
> http://www.apache.org/dist/cassandra/debian.
> I am using Thrift, and I'm using a c# client because that's how I intend to
> actually use Cassandra and it seems pretty sensible.
> An example of what I'm seeing is:
> 5 Threads Each writing 100,000 Simple Entries
> OSX: 1 min 16 seconds ~ 6515 Entries / second
> Debian: 1 hour 15 seconds ~ 138 Records / second
> 15 Threads Each writing 100,000 Simple Entries
> OSX: 2min 30 seconds writing ~10,000 Entries / second
> Debian: 1 hour 1.5 minutes ~406 Entries / second
> 20 Threads Each Writing 100,000 Simple Entries
> OSX: 3min 19 seconds ~ 10,050 Entries / second
> Debian: 1 hour 20 seconds ~ 492 Entries / second
> If anyone has any suggestions or pointers I'd be glad to hear them.
> Thanks,
> Stu
> Attached:
> 1. CassLoadTesting.ods (all my results and graphs in OpenOffice format
> downloaded from Google Docs)
> 2. OSX Records per Second - a graph of how many entries get written per
> second for 10,000 & 100,000 entries as thread count is increased in OSX.
> 3. Open JDK Records per Second - the same graph but of Open JDK on Debian
> 4. Open JDK Total Time By Thread - the total time taken from test start to
> finish (all threads completed) to write 10,000 & 100,000 entries as thread
> count is increased in Debian with Open JDK
> 5. OSX Total time by Thread - same as 4, but for OSX.
>
>


Re: batch_mutate silently failing

2010-04-15 Thread Jonathan Ellis
Ah, I see.  Glad you resolved that. :)

On Thu, Apr 15, 2010 at 12:31 PM, Lee Parker  wrote:

> The entire thing was completely my own fault.  I was making an invalid
> request and, somewhere in the code, I was catching the exception and not
> handling it at all.  So it only appeared to be silent when in reality it was
> throwing a nice descriptive exception.
>
>
> Lee Parker
> l...@spredfast.com
>
> [image: Spredfast]
> On Thu, Apr 15, 2010 at 12:28 PM, Jonathan Ellis wrote:
>
>> Could you create a ticket for us to return an error message in this
>> situation?
>>
>> -Jonathan
>>
>>
>> On Tue, Apr 13, 2010 at 4:24 PM, Lee Parker  wrote:
>>
>>> nevermind.  I figured out what the problem was.  I was not putting the
>>> column inside a ColumnOrSuperColumn container.
>>>
>>>
>>> Lee Parker
>>> l...@spredfast.com
>>>
>>> [image: Spredfast]
>>> On Tue, Apr 13, 2010 at 4:19 PM, Lee Parker wrote:
>>>
 I upgraded my dev environment to 0.6.0 today in expectation of upgrading
 our prod environment soon.  I am trying to rewrite some of our code to use
 batch_mutate with the Thrift PHP library directly.  I'm not getting any
 result back, not even an exception or failure message, but the data is 
 never
 showing up in the single node cassandra setup.  Here is a dump of my
 mutation map:

 array(1) {
   ["testkey"]=>
   array(1) {
 ["StreamItems"]=>
 array(2) {
   [0]=>
   object(cassandra_Mutation)#156 (2) {
 ["column_or_supercolumn"]=>
 object(cassandra_Column)#157 (3) {
   ["name"]=>
   string(4) "test"
   ["value"]=>
   string(14) "this is a test"
   ["timestamp"]=>
   float(1271193181943.1)
 }
 ["deletion"]=>
 NULL
   }
   [1]=>
   object(cassandra_Mutation)#158 (2) {
 ["column_or_supercolumn"]=>
 object(cassandra_Column)#159 (3) {
   ["name"]=>
   string(5) "test2"
   ["value"]=>
   string(19) "Another test column"
   ["timestamp"]=>
   float(1271193181943.2)
 }
 ["deletion"]=>
 NULL
   }
 }
   }
 }

 When I pass this into client->batch_mutate, nothing seems to happen.
  Any ideas about what could be going on?  I have been able to insert data
 using cassandra-cli without issue.

 Lee Parker
 l...@spredfast.com

 [image: Spredfast]

>>>
>>>
>>
>


Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-04-15 Thread Heath Oderman
Thanks Jonathan, I'll check this out right away.

On Thu, Apr 15, 2010 at 1:32 PM, Jonathan Ellis  wrote:

> You're right, to get those numbers on debian something is very wrong.
>
> Have you looked at
> http://spyced.blogspot.com/2010/01/linux-performance-basics.html ?
> What is the bottleneck on the linux machines?
>
> With the kind of speed you are seeing I wouldn't be surprised if it is
> swapping.
>
> -Jonathan
>
> On Tue, Apr 13, 2010 at 11:38 PM, Heath Oderman 
> wrote:
> > Hi,
> > I wrote a few days ago and got a few good suggestions.  I'm still seeing
> > dramatic differences between Cassandra 0.5.0 on OSX vs. Debian Linux.
> > I've tried on Debian with the Sun JRE and the Open JDK with nearly
> identical
> > results. I've tried a mix of hardware.
> > Attached are some graphs I've produced of my results which show that in
> OSX,
> > Cassandra takes longer with a greater load but is wicked fast (expected).
> > In the SunJDK or Open JDK on Debian I get amazingly consistent time taken
> to
> > do the writes, regardless of the load and the times are always
> ridiculously
> > high.  It's insanely slow.
> > I genuinely believe that I must be doing something very wrong in my
> Debian
> > setups, but they are all vanilla installs, both 64 bit and 32 bit
> machines,
> > 64bit and 32 bit installs.  Cassandra packs taken from
> > http://www.apache.org/dist/cassandra/debian.
> > I am using Thrift, and I'm using a c# client because that's how I intend
> to
> > actually use Cassandra and it seems pretty sensible.
> > An example of what I'm seeing is:
> > 5 Threads Each writing 100,000 Simple Entries
> > OSX: 1 min 16 seconds ~ 6515 Entries / second
> > Debian: 1 hour 15 seconds ~ 138 Records / second
> > 15 Threads Each writing 100,000 Simple Entries
> OSX: 2min 30 seconds writing ~10,000 Entries / second
> > Debian: 1 hour 1.5 minutes ~406 Entries / second
> > 20 Threads Each Writing 100,000 Simple Entries
> > OSX: 3min 19 seconds ~ 10,050 Entries / second
> > Debian: 1 hour 20 seconds ~ 492 Entries / second
> > If anyone has any suggestions or pointers I'd be glad to hear them.
> > Thanks,
> > Stu
> > Attached:
> > 1. CassLoadTesting.ods (all my results and graphs in OpenOffice format
> > downloaded from Google Docs)
> > 2. OSX Records per Second - a graph of how many entries get written per
> > second for 10,000 & 100,000 entries as thread count is increased in OSX.
> > 3. Open JDK Records per Second - the same graph but of Open JDK on Debian
> > 4. Open JDK Total Time By Thread - the total time taken from test start
> to
> > finish (all threads completed) to write 10,000 & 100,000 entries as
> thread
> > count is increased in Debian with Open JDK
> > 5. OSX Total time by Thread - same as 4, but for OSX.
> >
> >
>


Re: server crash - how to investigate

2010-04-15 Thread Jonathan Ellis
There's a few things it could be:

Out of memory: usually it can log the exception before dying but not
always.  there will be a java_$pid.hprof file with the heap dumped.

JVM crash: there will be hs_err$pid.log file

OS bug or hardware problem: sometimes your OS will log something

-Jonathan

On Wed, Apr 14, 2010 at 6:04 AM, Ran Tavory  wrote:
> I'm running a 0.6.0 cluster with four nodes and one of them just crashed.
> The logs all seem normal and I haven't seen anything special in the jmx
> counters before the crash.
> I have one client writing and reading using 10 threads and using 3 different
> column families: KvAds, KvImpressions and KvUsers
> the client had got a few UnavailableException, TimedOutException and
> TTransportException but was able to complete the read/write operation by
> failing over to another available host. I can't tell if the exceptions were
> from the crashed host or from other hosts in the ring.
> Any hints how to investigate this are greatly appreciated. So far I'm
> lost...
> Here's a snippet from the log just before it went down. It doesn't seem to
> have anything special in it, everything is INFO level.
> The only thing that seems a bit strange is that last message: Compacting [].
> This message usually comes with things inside the [], such as Compacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassdata/data/system/LocationInfo-1-Data.db'),...]
> but this time it was just empty.
> However, this is not the only place in the log where I see an empty
> Compacting []. There are other places and they didn't end up in a crash, so
> I don't know if it's related.
> here's the log:
>  INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,014 ColumnFamilyStore.java
> (line 357) KvImpressions has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238432773.log',
> position=68606651)
>  INFO [ROW-MUTATION-STAGE:6] 2010-04-14 05:55:07,015 ColumnFamilyStore.java
> (line 609) Enqueuing flush of Memtable(KvImpressions)@258729366
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:07,015 Memtable.java (line 148)
> Writing Memtable(KvImpressions)@258729366
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:55:10,130 Memtable.java (line 162)
> Completed flushing
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-24-Data.db
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:10,154 CommitLog.java (line 407)
> Discarding obsolete commit
> log:CommitLogSegment(/outbrain/cassdata/commitlog/CommitLog-1271238049425.log)
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,415
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-16-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,440
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-8-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,454
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-10-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,526
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-5-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,585
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-11-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,602
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-11-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,614
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvAds-9-Data.db
>  INFO [SSTABLE-CLEANUP-TIMER] 2010-04-14 05:55:28,682
> SSTableDeletingReference.java (line 104) Deleted
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-21-Data.db
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:55:52,254 CommitLogSegment.java
> (line 50) Creating new commitlog segment
> /outbrain/cassdata/commitlog/CommitLog-1271238952254.log
>  INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,347 ColumnFamilyStore.java
> (line 357) KvImpressions has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='/outbrain/cassdata/commitlog/CommitLog-1271238952254.log',
> position=47568158)
>  INFO [ROW-MUTATION-STAGE:16] 2010-04-14 05:56:25,348 ColumnFamilyStore.java
> (line 609) Enqueuing flush of Memtable(KvImpressions)@1955587316
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:25,348 Memtable.java (line 148)
> Writing Memtable(KvImpressions)@1955587316
>  INFO [FLUSH-WRITER-POOL:1] 2010-04-14 05:56:30,572 Memtable.java (line 162)
> Completed flushing
> /outbrain/cassdata/data/outbrain_kvdb/KvImpressions-25-Data.db
>  INFO [COMMIT-LOG-WRITER] 2010-04-14 05:57:26,790 CommitLogSegment.java
> (line 50) Creating new commitlog segment
> /outbrain/cassdata/commitlog/CommitLog-1271239046790.log
>  INFO [ROW-MUTATION-STAGE:7] 2010-04-14 05:57:59,513 Co

Re: Reading thousands of columns

2010-04-15 Thread Jonathan Ellis
How long to read just 10 columns?

On Wed, Apr 14, 2010 at 3:19 PM, James Golick  wrote:
> The values are empty. It's 3000 UUIDs.
>
> On Wed, Apr 14, 2010 at 12:40 PM, Avinash Lakshman
>  wrote:
>>
>> How large are the values? How much data on disk?
>>
>> On Wednesday, April 14, 2010, James Golick  wrote:
>> > Just for the record, I am able to repeat this locally.
>> > I'm seeing around 150ms to read 1000 columns from a row that has 3000 in
>> > it. If I enable the rowcache, that goes down to about 90ms. According to my
>> > profile, 90% of the time is being spent waiting for cassandra to respond, 
>> > so
>> > it's not thrift.
>> >
>> > On Wed, Apr 14, 2010 at 11:01 AM, Paul Prescod 
>> > wrote:
>> >
>> > On Wed, Apr 14, 2010 at 10:31 AM, Mike Malone 
>> > wrote:
>> >> ...
>> >>
>> >> Couldn't you cache a list of keys that were returned for the key range,
>> >> then
>> >> cache individual rows separately or not at all?
>> >> By "blowing away rows queried by key" I'm guessing you mean "pushing
>> >> them
>> >> out of the LRU cache," not explicitly blowing them away? Either way I'm
>> >> not
>> >> entirely convinced. In my experience I've had pretty good success
>> >> caching
>> >> items that were pulled out via more complicated join / range type
>> >> queries.
>> >> If your system is doing lots of range quereis, and not a lot of lookups
>> >> by
>> >> key, you'd obviously see a performance win from caching the range
>> >> queries.
>> >> Maybe range scan caching could be turned on separately?
>> >
>> > I agree with you that the caches should be separate, if you're going
>> > to cache ranges. You could imagine a single query (perhaps entered
>> > interactively) would replace the entire row caching all of the data
>> > for the systems' interactive users. For example, a summary page of who
>> > is most over the last month active could replace the profile
>> > information for the actual users who are using the system at that
>> > moment.
>> >
>> >  Paul Prescod
>> >
>> >
>> >
>
>
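
For context, a hedged sketch of the kind of read being timed in this thread,
using the 0.6 Python Thrift bindings (the keyspace, column family, and row
key are placeholder assumptions): a get_slice over the first 1000 columns of
a row.

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from cassandra import Cassandra
from cassandra.ttypes import (ColumnParent, SlicePredicate, SliceRange,
                              ConsistencyLevel)

socket = TSocket.TSocket('localhost', 9160)
transport = TTransport.TBufferedTransport(socket)
client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()

# Empty start/finish means "unbounded" on that side; count caps how
# many columns come back, so this is the 1000-of-3000 case above.
predicate = SlicePredicate(slice_range=SliceRange(
    start='', finish='', reversed=False, count=1000))
columns = client.get_slice('Keyspace1', 'some_row_key',
                           ColumnParent(column_family='Standard1'),
                           predicate, ConsistencyLevel.ONE)
print(len(columns))
transport.close()

Dropping count to 10 gives the comparison Jonathan is asking for.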


Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-04-15 Thread Heath Oderman
So checking it out quickly:

vmstat -

Never swaps.  si and so  stay at 0 during the load.

iostat -x

the %util never climbs above 0.00, but the avgrq-sz jumps between samples
from 0 - 30 - 90 - 0 (5 second intervals)

top shows the cpu barely working and mem utilization is below 20%.

Still slow.  :(

Thanks for the suggestions.  In your article on your blog it'd be awesome to
include some implications, like "avgrq-sz over 250 may mean XXX".  Even if
it's utterly hardware and system dependent, it'd give a guy like me an idea
of whether what I was seeing was bad or good. :D

Thanks again,
Heath


On Thu, Apr 15, 2010 at 1:34 PM, Heath Oderman  wrote:

> Thanks Jonathan, I'll check this out right away.
>
>
> On Thu, Apr 15, 2010 at 1:32 PM, Jonathan Ellis  wrote:
>
>> You're right, to get those numbers on debian something is very wrong.
>>
>> Have you looked at
>> http://spyced.blogspot.com/2010/01/linux-performance-basics.html ?
>> What is the bottleneck on the linux machines?
>>
>> With the kind of speed you are seeing I wouldn't be surprised if it is
>> swapping.
>>
>> -Jonathan
>>
>> On Tue, Apr 13, 2010 at 11:38 PM, Heath Oderman 
>> wrote:
>> > Hi,
>> > I wrote a few days ago and got a few good suggestions.  I'm still seeing
>> > dramatic differences between Cassandra 0.5.0 on OSX vs. Debian Linux.
>> > I've tried on Debian with the Sun JRE and the Open JDK with nearly
>> identical
>> > results. I've tried a mix of hardware.
>> > Attached are some graphs I've produced of my results which show that in
>> OSX,
>> > Cassandra takes longer with a greater load but is wicked fast
>> (expected).
>> > In the SunJDK or Open JDK on Debian I get amazingly consistent time
>> taken to
>> > do the writes, regardless of the load and the times are always
>> ridiculously
>> > high.  It's insanely slow.
>> > I genuinely believe that I must be doing something very wrong in my
>> Debian
>> > setups, but they are all vanilla installs, both 64 bit and 32 bit
>> machines,
>> > 64bit and 32 bit installs.  Cassandra packs taken from
>> > http://www.apache.org/dist/cassandra/debian.
>> > I am using Thrift, and I'm using a c# client because that's how I intend
>> to
>> > actually use Cassandra and it seems pretty sensible.
>> > An example of what I'm seeing is:
>> > 5 Threads Each writing 100,000 Simple Entries
>> > OSX: 1 min 16 seconds ~ 6515 Entries / second
>> > Debian: 1 hour 15 seconds ~ 138 Records / second
>> > 15 Threads Each writing 100,000 Simple Entries
>> > OSX: 2min 30 seconds writing ~10,000 Entries / second
>> > Debian: 1 hour 1.5 minutes ~406 Entries / second
>> > 20 Threads Each Writing 100,000 Simple Entries
>> > OSX: 3min 19 seconds ~ 10,050 Entries / second
>> > Debian: 1 hour 20 seconds ~ 492 Entries / second
>> > If anyone has any suggestions or pointers I'd be glad to hear them.
>> > Thanks,
>> > Stu
>> > Attached:
>> > 1. CassLoadTesting.ods (all my results and graphs in OpenOffice format
>> > downloaded from Google Docs)
>> > 2. OSX Records per Second - a graph of how many entries get written per
>> > second for 10,000 & 100,000 entries as thread count is increased in OSX.
>> > 3. Open JDK Records per Second - the same graph but of Open JDK on
>> Debian
>> > 4. Open JDK Total Time By Thread - the total time taken from test start
>> to
>> > finish (all threads completed) to write 10,000 & 100,000 entries as
>> thread
>> > count is increased in Debian with Open JDK
>> > 5. OSX Total time by Thread - same as 4, but for OSX.
>> >
>> >
>>
>
>


Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-04-15 Thread Jonathan Ellis
What kind of numbers do you get from contrib/py_stress?

(that's located somewhere else in 0.5, but you should really be using
0.6 anyway.)

On Thu, Apr 15, 2010 at 12:53 PM, Heath Oderman  wrote:
> So checking it out quickly:
> vmstat -
> Never swaps.  si and so  stay at 0 during the load.
> iostat -x
> the %util never climbs above 0.00, but the avgrq-sz jumps between samples
> from 0 - 30 - 90 - 0 (5 second intervals)
> top shows the cpu barely working and mem utilization is below 20%.
> Still slow.  :(
> Thanks for the suggestions.  In your article on your blog it'd be awesome to
> include some implications, like "avgrq-sz over 250 may mean XXX"  Even if
> it's utterly hardware and system dependent it'd give a guy like me an idea
> if what I was seeing was bad or good. :D
> Thanks again,
> Heath
>
> On Thu, Apr 15, 2010 at 1:34 PM, Heath Oderman  wrote:
>>
>> Thanks Jonathan, I'll check this out right away.
>>
>> On Thu, Apr 15, 2010 at 1:32 PM, Jonathan Ellis  wrote:
>>>
>>> You're right, to get those numbers on debian something is very wrong.
>>>
>>> Have you looked at
>>> http://spyced.blogspot.com/2010/01/linux-performance-basics.html ?
>>> What is the bottleneck on the linux machines?
>>>
>>> With the kind of speed you are seeing I wouldn't be surprised if it is
>>> swapping.
>>>
>>> -Jonathan
>>>
>>> On Tue, Apr 13, 2010 at 11:38 PM, Heath Oderman 
>>> wrote:
>>> > Hi,
>>> > I wrote a few days ago and got a few good suggestions.  I'm still
>>> > seeing
>>> > dramatic differences between Cassandra 0.5.0 on OSX vs. Debian Linux.
>>> > I've tried on Debian with the Sun JRE and the Open JDK with nearly
>>> > identical
>>> > results. I've tried a mix of hardware.
>>> > Attached are some graphs I've produced of my results which show that in
>>> > OSX,
>>> > Cassandra takes longer with a greater load but is wicked fast
>>> > (expected).
>>> > In the SunJDK or Open JDK on Debian I get amazingly consistent time
>>> > taken to
>>> > do the writes, regardless of the load and the times are always
>>> > ridiculously
>>> > high.  It's insanely slow.
>>> > I genuinely believe that I must be doing something very wrong in my
>>> > Debian
>>> > setups, but they are all vanilla installs, both 64 bit and 32 bit
>>> > machines,
>>> > 64bit and 32 bit installs.  Cassandra packs taken from
>>> > http://www.apache.org/dist/cassandra/debian.
>>> > I am using Thrift, and I'm using a c# client because that's how I
>>> > intend to
>>> > actually use Cassandra and it seems pretty sensible.
>>> > An example of what I'm seeing is:
>>> > 5 Threads Each writing 100,000 Simple Entries
>>> > OSX: 1 min 16 seconds ~ 6515 Entries / second
>>> > Debian: 1 hour 15 seconds ~ 138 Records / second
>>> > 15 Threads Each writing 100,000 Simple Entries
>>> > OSX: 2min 30 seconds writing ~10,000 Entries / second
>>> > Debian: 1 hour 1.5 minutes ~406 Entries / second
>>> > 20 Threads Each Writing 100,000 Simple Entries
>>> > OSX: 3min 19 seconds ~ 10,050 Entries / second
>>> > Debian: 1 hour 20 seconds ~ 492 Entries / second
>>> > If anyone has any suggestions or pointers I'd be glad to hear them.
>>> > Thanks,
>>> > Stu
>>> > Attached:
>>> > 1. CassLoadTesting.ods (all my results and graphs in OpenOffice format
>>> > downloaded from Google Docs)
>>> > 2. OSX Records per Second - a graph of how many entries get written per
>>> > second for 10,000 & 100,000 entries as thread count is increased in
>>> > OSX.
>>> > 3. Open JDK Records per Second - the same graph but of Open JDK on
>>> > Debian
>>> > 4. Open JDK Total Time By Thread - the total time taken from test start
>>> > to
>>> > finish (all threads completed) to write 10,000 & 100,000 entries as
>>> > thread
>>> > count is increased in Debian with Open JDK
>>> > 5. OSX Total time by Thread - same as 4, but for OSX.
>>> >
>>> >
>>
>
>


Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-04-15 Thread Heath Oderman
I upgraded to 0.6 yesterday and it's bang on the same.  I'll go read up on
py_stress and give it a try.

On Thu, Apr 15, 2010 at 1:57 PM, Jonathan Ellis  wrote:

> What kind of numbers do you get from contrib/py_stress?
>
> (that's located somewhere else in 0.5, but you should really be using
> 0.6 anyway.)
>
> On Thu, Apr 15, 2010 at 12:53 PM, Heath Oderman 
> wrote:
> > So checking it out quickly:
> > vmstat -
> > Never swaps.  si and so  stay at 0 during the load.
> > iostat -x
> > the %util never climbs above 0.00, but the avgrq-sz jumps between samples
> > from 0 - 30 - 90 - 0 (5 second intervals)
> > top shows the cpu barely working and mem utilization is below 20%.
> > Still slow.  :(
> > Thanks for the suggestions.  In your article on your blog it'd be awesome
> to
> > include some implications, like "avgrq-sz over 250 may mean XXX"  Even if
> > it's utterly hardware and system dependent it'd give a guy like me an
> idea
> > if what I was seeing was bad or good. :D
> > Thanks again,
> > Heath
> >
> > On Thu, Apr 15, 2010 at 1:34 PM, Heath Oderman 
> wrote:
> >>
> >> Thanks Jonathan, I'll check this out right away.
> >>
> >> On Thu, Apr 15, 2010 at 1:32 PM, Jonathan Ellis 
> wrote:
> >>>
> >>> You're right, to get those numbers on debian something is very wrong.
> >>>
> >>> Have you looked at
> >>> http://spyced.blogspot.com/2010/01/linux-performance-basics.html ?
> >>> What is the bottleneck on the linux machines?
> >>>
> >>> With the kind of speed you are seeing I wouldn't be surprised if it is
> >>> swapping.
> >>>
> >>> -Jonathan
> >>>
> >>> On Tue, Apr 13, 2010 at 11:38 PM, Heath Oderman 
> >>> wrote:
> >>> > Hi,
> >>> > I wrote a few days ago and got a few good suggestions.  I'm still
> >>> > seeing
> >>> > dramatic differences between Cassandra 0.5.0 on OSX vs. Debian Linux.
> >>> > I've tried on Debian with the Sun JRE and the Open JDK with nearly
> >>> > identical
> >>> > results. I've tried a mix of hardware.
> >>> > Attached are some graphs I've produced of my results which show that
> in
> >>> > OSX,
> >>> > Cassandra takes longer with a greater load but is wicked fast
> >>> > (expected).
> >>> > In the SunJDK or Open JDK on Debian I get amazingly consistent time
> >>> > taken to
> >>> > do the writes, regardless of the load and the times are always
> >>> > ridiculously
> >>> > high.  It's insanely slow.
> >>> > I genuinely believe that I must be doing something very wrong in my
> >>> > Debian
> >>> > setups, but they are all vanilla installs, both 64 bit and 32 bit
> >>> > machines,
> >>> > 64bit and 32 bit installs.  Cassandra packs taken from
> >>> > http://www.apache.org/dist/cassandra/debian.
> >>> > I am using Thrift, and I'm using a c# client because that's how I
> >>> > intend to
> >>> > actually use Cassandra and it seems pretty sensible.
> >>> > An example of what I'm seeing is:
> >>> > 5 Threads Each writing 100,000 Simple Entries
> >>> > OSX: 1 min 16 seconds ~ 6515 Entries / second
> >>> > Debian: 1 hour 15 seconds ~ 138 Records / second
> >>> > 15 Threads Each writing 100,000 Simple Entries
> >>> > OSX: 2min 30 seconds writing ~10,000 Entries / second
> >>> > Debian: 1 hour 1.5 minutes ~406 Entries / second
> >>> > 20 Threads Each Writing 100,000 Simple Entries
> >>> > OSX: 3min 19 seconds ~ 10,050 Entries / second
> >>> > Debian: 1 hour 20 seconds ~ 492 Entries / second
> >>> > If anyone has any suggestions or pointers I'd be glad to hear them.
> >>> > Thanks,
> >>> > Stu
> >>> > Attached:
> >>> > 1. CassLoadTesting.ods (all my results and graphs in OpenOffice
> format
> >>> > downloaded from Google Docs)
> >>> > 2. OSX Records per Second - a graph of how many entries get written
> per
> >>> > second for 10,000 & 100,000 entries as thread count is increased in
> >>> > OSX.
> >>> > 3. Open JDK Records per Second - the same graph but of Open JDK on
> >>> > Debian
> >>> > 4. Open JDK Total Time By Thread - the total time taken from test
> start
> >>> > to
> >>> > finish (all threads completed) to write 10,000 & 100,000 entries as
> >>> > thread
> >>> > count is increased in Debian with Open JDK
> >>> > 5. OSX Total time by Thread - same as 4, but for OSX.
> >>> >
> >>> >
> >>
> >
> >
>


Re: Time-series data model

2010-04-15 Thread Dan Di Spaltro
This is actually fairly similar to how we store metrics at Cloudkick.
The post below has a much more in-depth explanation of some of that:

https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/

So we store each natural point in the NumericArchive table.

our keys look like:
.

Anyways, this has been working out very well for us.

2010/4/15 Ted Zlatanov :
> On Thu, 15 Apr 2010 11:27:47 +0200 Jean-Pierre Bergamin  
> wrote:
>
> JB> Am 14.04.2010 15:22, schrieb Ted Zlatanov:
>>> On Wed, 14 Apr 2010 15:02:29 +0200 "Jean-Pierre Bergamin" 
>>>  wrote:
>>>
> JB> The metrics are stored together with a timestamp. The queries we want to
> JB> perform are:
> JB> * The last value of a specific metric of a device
> JB> * The values of a specific metric of a device between two timestamps t1 
> and
> JB> t2
>>>
>>> Make your key "devicename-metricname-MMDD-HHMM" (with whatever time
>>> sharding makes sense to you; I use UTC by-hours and by-day in my
>>> environment).  Then your supercolumn is the collection time as a
>>> LongType and your columns inside the supercolumn can express the metric
>>> in detail (collector agent, detailed breakdown, etc.).
>>>
> JB> Just for my understanding. What is "time sharding"? I couldn't find an
> JB> explanation somewhere. Do you mean that the time-series data is rolled
> JB> up in 5 minues, 1 hour, 1 day etc. slices?
>
> Yes.  The usual meaning of "shard" in RDBMS world is to segment your
> database by some criteria, e.g. US vs. Europe in Amazon AWS because
> their data centers are laid out so.  I was taking a linguistic shortcut
> to mean "break down your rows by some convenient criteria."  You can
> actually set up your Partitioner in Cassandra to literally shard your
> keyspace rows based on the key, but I just meant "slice" in my note.
>
> JB> So this would be defined as:
> JB> <ColumnFamily ... ColumnType="Super" CompareWith="UTF8Type"
> JB> CompareSubcolumnsWith="LongType" />
>
> JB> So when i want to read all values of one metric between two timestamps
> JB> t0 and t1, I'd have to read the supercolumns that match a key range
> JB> (device1:metric1:t0 - device1:metric1:t1) and then all the
> JB> supercolumns for this key?
>
> Yes.  This is a single multiget if you can construct the key range
> explicitly.  Cassandra loads a lot of this in memory already and filters
> it after the fact, that's why it pays to slice your keys and to stitch
> them together on the client side if you have to go across a time
> boundary.  You'll also get better key load balancing with deeper slicing
> if you use the randomizing partitioner.
>
> In the result set, you'll get each matching supercolumn with all the
> columns inside it.  You may have to page through supercolumns.
>
> Ted
>
>



-- 
Dan Di Spaltro
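
To make the row-key scheme in this thread concrete, here is a hedged sketch
(Python, 0.6 Thrift bindings; the 'Metrics' keyspace, the reuse of the
'NumericArchive' name from above, the day-granularity time sharding, and the
exact key layout are illustrative assumptions, not Cloudkick's actual
schema). It writes one sample per column under a device:metric:day row key
and reads a time window by building the key list explicitly and issuing a
single multiget_slice, as Ted describes. 'client' is an open
Cassandra.Client, connected as in the batch_mutate sketch earlier in this
archive.

import struct
import time
from cassandra.ttypes import (ColumnParent, ColumnPath, SlicePredicate,
                              SliceRange, ConsistencyLevel)

DAY = 86400

def row_key(device, metric, ts):
    # One row per device+metric per UTC day ("time sharding").
    return '%s:%s:%s' % (device, metric,
                         time.strftime('%Y%m%d', time.gmtime(ts)))

def write_sample(client, device, metric, ts, value):
    # Pack the sample time as a big-endian long so a LongType
    # comparator keeps columns in time order within the row.
    name = struct.pack('>q', int(ts * 1e6))
    path = ColumnPath(column_family='NumericArchive', column=name)
    client.insert('Metrics', row_key(device, metric, ts), path,
                  str(value), int(ts * 1e6), ConsistencyLevel.ONE)

def read_window(client, device, metric, t0, t1):
    # Build the explicit key list covering [t0, t1], then fetch all
    # rows at once; the client stitches the window back together.
    keys, day = [], t0
    while day < t1 + DAY:
        keys.append(row_key(device, metric, day))
        day += DAY
    predicate = SlicePredicate(slice_range=SliceRange(
        start=struct.pack('>q', int(t0 * 1e6)),
        finish=struct.pack('>q', int(t1 * 1e6)),
        reversed=False, count=100000))
    return client.multiget_slice(
        'Metrics', keys, ColumnParent(column_family='NumericArchive'),
        predicate, ConsistencyLevel.ONE)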


Re: framed transport

2010-04-15 Thread Nathan McCall
FWIW, We just exposed this as an option in hector.

-Nate

On Thu, Apr 15, 2010 at 8:38 AM, Miguel Verde  wrote:
> On Thu, Apr 15, 2010 at 10:22 AM, Eric Evans  wrote:
>>
>> But, if you've enabled framing on the server, you will not
>> be able to use C# clients (last I checked, there was no framed transport
>> for C#).
>
>
> There *are* many clients that don't have framed transports, but the C#
> client had it added in November:
> https://issues.apache.org/jira/browse/THRIFT-210


Re: BMT flush on windows?

2010-04-15 Thread Sonny Heer
Hmmm. Same code runs on ubuntu, and I'm able to flush using the nodetool.

What is the difference between inserting data using :
StorageProxy.mutateBlocking vs. sending oneway message using the
MessagingService?

On Thu, Apr 15, 2010 at 10:14 AM, Jonathan Ellis  wrote:
> probably because there is nothing to flush.
>
> On Thu, Apr 15, 2010 at 11:53 AM, Sonny Heer  wrote:
>> From the jconsole, I go under
>> ColumnFamilyStores->CF1->Column1->Operations and clicked force
>> flush().
>>
>> I'm getting a "Operation return value" null OK message box.  what am I
>> doing wrong?
>>
>>
>> On Tue, Apr 13, 2010 at 3:12 PM, Jonathan Ellis  wrote:
>>> you have three options
>>>
>>> (a) connect with jconsole or another jmx client and invoke flush that way
>>> (b) run org.apache.cassandra.tools.NodeCmd manually
>>> (c) write a bat file for NodeCmd like the nodetool shell script in bin/
>>>
>>> On Tue, Apr 13, 2010 at 5:08 PM, Sonny Heer  wrote:
 Is there any way to run a keyspace flush on a windows box?

>>>
>>
>


Re: BMT flush on windows?

2010-04-15 Thread Sonny Heer
If I use StorageProxy.mutateBlocking and hit force flush from jconsole, it
flushes but with this error message:

"Problem invoking forceFlush: java.rmi.UnmarshalExecption: error
unmarshalling return; nested exception is:
java.io.WriteAbortedException: writing aborted;
java.io.NotSerializableException: java.util.concurrent.FutureTask"

This may be a known issue, just thought I'd pass it along.  Not sure why
using code from CassandraBulkLoader to send messages isn't working.
I'm using Cassandra .6rc1.

Thanks.

On Thu, Apr 15, 2010 at 11:19 AM, Sonny Heer  wrote:
> Hmmm. Same code runs on ubuntu, and I'm able to flush using the nodetool.
>
> What is the difference between inserting data using :
> StorageProxy.mutateBlocking vs. sending oneway message using the
> MessagingService?
>
> On Thu, Apr 15, 2010 at 10:14 AM, Jonathan Ellis  wrote:
>> probably because there is nothing to flush.
>>
>> On Thu, Apr 15, 2010 at 11:53 AM, Sonny Heer  wrote:
>>> From the jconsole, I go under
>>> ColumnFamilyStores->CF1->Column1->Operations and clicked force
>>> flush().
>>>
>>> I'm getting a "Operation return value" null OK message box.  what am I
>>> doing wrong?
>>>
>>>
>>> On Tue, Apr 13, 2010 at 3:12 PM, Jonathan Ellis  wrote:
 you have three options

 (a) connect with jconsole or another jmx client and invoke flush that way
 (b) run org.apache.cassandra.tools.NodeCmd manually
 (c) write a bat file for NodeCmd like the nodetool shell script in bin/

 On Tue, Apr 13, 2010 at 5:08 PM, Sonny Heer  wrote:
> Is there any way to run a keyspace flush on a windows box?
>

>>>
>>
>


Re: timestamp not found

2010-04-15 Thread Lee Parker
I have done more error checking and I am relatively certain that I am
sending a valid timestamp to the thrift library.  I was testing a switch to
the Framed Transport instead of Buffered Transport and I am getting fewer
errors, but now the cassandra server dies when this happens.  It is starting
to feel like this is a bug in Thrift or the Cassandra Thrift interface.  Can
anyone offer any other insight?  I'm using the current stable release of
Thrift 0.2.0, and Cassandra 0.6.0.

It seems to happen more under heavy load. I don't know if that is meaningful
or not.

Lee Parker

On Thu, Apr 15, 2010 at 11:00 AM, Lee Parker  wrote:

> I'm actually using PHP.  I do have several php processes running, but each
> one should have its own Thrift connection.
>
>
> Lee Parker
> l...@spredfast.com
>
> [image: Spredfast]
> On Thu, Apr 15, 2010 at 10:53 AM, Jonathan Ellis wrote:
>
>> Looks like you are using C++ and not setting the "isset" flag on the
>> timestamp field, so it's getting the default value for a Java long ("0").
>>
>> If it works "most of the time" then possibly you are using a Thrift
>> connection from multiple threads at the same time, which is not safe.
>>
>>
>> On Thu, Apr 15, 2010 at 10:39 AM, Lee Parker wrote:
>>
>>> We are currently migrating about 70G of data from mysql to cassandra.  I
>>> am occasionally getting the following error:
>>>
>>> Required field 'timestamp' was not found in serialized data! Struct:
>>> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
>>> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
>>> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
>>> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
>>> timestamp:0)
>>>
>>> The loop which is building out the mutation map for the batch_mutate call
>>> is adding a timestamp to each column.  I have verified that the time stamp
>>> is there for several calls and I feel like if the logic was bad, i would see
>>> the error more frequently.  Does anyone have suggestions as to what may be
>>> causing this?
>>>
>>> Lee Parker
>>> l...@spredfast.com
>>>
>>> [image: Spredfast]
>>>
>>
>>
>
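
A side note that may or may not be the cause here: a microsecond-scale
timestamp does not fit in a double without losing low-order bits, so keeping
the value an integer end to end avoids one class of serialization surprises.
A minimal sketch of the usual convention (Python shown for brevity; the same
idea applies in any client language):

import time

def now_micros():
    # Cassandra treats timestamps as opaque 64-bit integers; the usual
    # convention is microseconds since the epoch. Keep it an int end
    # to end -- a float silently rounds at this magnitude.
    return int(time.time() * 1e6)

print(now_micros())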


json2sstable

2010-04-15 Thread Lee Parker
Has anyone used json2sstable to migrate a large amount of data into
cassandra?  What was your methodology?  I assume that this will be much
faster than stepping through my data and doing writes via PHP/Thrift.

Lee Parker


Re: framed transport

2010-04-15 Thread Lee Parker
After some testing, the buffered transport seems more stable.  I am
occasionally getting a missing timestamp error during
batch_mutate calls.  It happens both on framed and buffered transports, but
when it happens on a framed transport, the server crashes.  Is this typical?

Lee Parker
On Thu, Apr 15, 2010 at 1:12 PM, Nathan McCall wrote:

> FWIW, We just exposed this as an option in hector.
>
> -Nate
>
> On Thu, Apr 15, 2010 at 8:38 AM, Miguel Verde 
> wrote:
> > On Thu, Apr 15, 2010 at 10:22 AM, Eric Evans 
> wrote:
> >>
> >> But, if you've enabled framing on the server, you will not
> >> be able to use C# clients (last I checked, there was no framed transport
> >> for C#).
> >
> >
> > There *are* many clients that don't have framed transports, but the C#
> > client had it added in November:
> > https://issues.apache.org/jira/browse/THRIFT-210
>


Re: framed transport

2010-04-15 Thread Jonathan Ellis
Have you tried other client machines?

It sounds like your client is generating garbage, which is Bad.

https://issues.apache.org/jira/browse/THRIFT-601

On Thu, Apr 15, 2010 at 4:20 PM, Lee Parker  wrote:
> After some testing, the buffered transport seems more stable.  I am
> occasionally getting a missing timestamp error during
> batch_mutate calls.  It happens both on framed and buffered transports, but
> when it happens on a framed transport, the server crashes.  Is this typical?
>
> Lee Parker
>
> On Thu, Apr 15, 2010 at 1:12 PM, Nathan McCall 
> wrote:
>>
>> FWIW, We just exposed this as an option in hector.
>>
>> -Nate
>>
>> On Thu, Apr 15, 2010 at 8:38 AM, Miguel Verde 
>> wrote:
>> > On Thu, Apr 15, 2010 at 10:22 AM, Eric Evans 
>> > wrote:
>> >>
>> >> But, if you've enabled framing on the server, you will not
>> >> be able to use C# clients (last I checked, there was no framed
>> >> transport
>> >> for C#).
>> >
>> >
>> > There *are* many clients that don't have framed transports, but the C#
>> > client had it added in November:
>> > https://issues.apache.org/jira/browse/THRIFT-210
>
>


Data model question - column names sort

2010-04-15 Thread Sonny Heer
Need a way to have two different types of indexes.

Key: aTextKey
ColumnName: aTextColumnName:55
Value: ""

Key: aTextKey
ColumnName: 55:aTextColumnName
Value: ""

All the valuable information is stored in the column name itself.
Above two can be in different column families...

Queries:
Given a key, page me a list of numerical values sorted on aTextColumnName
Given a key, page me a list of text values sorted on a numerical value

This approach would require left padding the numeric value for the
second index so cassandra can sort on column names correctly.

Is there any other way to accomplish this?
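
For what it's worth, a minimal sketch of the left-padding trick (plain
Python; the width is an assumption sized to the largest value you expect):

def pad(n, width=20):
    # Left-pad with zeros so lexicographic (UTF8Type) ordering agrees
    # with numeric ordering: pad(55) sorts before pad(123).
    return str(n).zfill(width)

# Index 1: columns sort by the text part; the number rides along.
name_by_text = 'aTextColumnName:%s' % pad(55)
# Index 2: columns sort by the padded number; the text rides along.
name_by_number = '%s:aTextColumnName' % pad(55)

The main alternative is packing the number as a big-endian long, but mixing
that with text in a single column name would need a custom comparator, so
zero-padded ASCII is the simpler route for the second column family.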


Clarification on Ring operations in Cassandra 0.5.1

2010-04-15 Thread Anthony Molinaro
Hi,

  I have a cluster running on ec2, and would like to do some ring
management.  Specifically, I'd like to replace an existing node
with another node (I want to change the instance type).

  I was looking over http://wiki.apache.org/cassandra/Operations
and it seems like I could do something like:

1) shutdown cassandra on instance I want to replace
2) create a new instance, start cassandra with AutoBootstrap = true
3) run nodeprobe removetoken against the token of the instance I am
   replacing

Then according to the 'Handling failure' section, the new instance will "find the
appropriate position automatically".  However, it's not clear to me
if this means it will take the same range as the shutdown node or not,
because normally AutoBootstrap == true means it will take "half the keys
from the node with the most disk space used." (from the 'Bootstrap' section).

So will the process I describe above result in what I want, a new node
replacing an old one?

Also, if the new instance takes over the range of the old instance how
does removetoken know which instance to remove, does it remove the Down
instance?

Another hopefully minor question, if I bring up a new node with
AutoBootstrap = false, what happens?
Does it join the ring but without data and without token range?
Can I then 'nodeprobe move ', and
achieve the same as step 2 above?

Thanks,

-Anthony

-- 

Anthony Molinaro   


Re: Lucandra or some way to query

2010-04-15 Thread malsmith
We are looking into migrating from a replicated Solr infrastructure to some
form of clustered approach.  Lucandra looks fantastic -- but this
statement is troubling:

"No normalizations are stored (no scoring)"  from
http://github.com/tjake/Lucandra

When I use the demo/samples I do get a relevance score, so can anyone
describe (or give a scenario for) when this limitation would become a
problem?

Thanks in advance.

On Thu, 2010-04-15 at 05:16 +, Jake Luciani wrote:

> Lucandra spreads the data randomly by index + field combination so you
> do get "some" distribution for free. Otherwise you can use "nodetool
> loadbalance" to alter the token ring to alleviate hotspots.
> 
> On Thu, Apr 15, 2010 at 2:04 AM, HubertChang  wrote:
> 
> 
> If you worked with Lucandra in a dedicated searching-purposed
> cluster, you
> could balance the data very well with some effort.
> 
> >>I think Lucandra is really a great idea, but since it needs
> order-preserving-partitioner, does that mean >>there may be
> some 'hot-spot'
> during searching?
> 
> --
> View this message in context:
> 
> http://n2.nabble.com/Lucandra-or-some-way-to-query-tp4900727p4905149.html
> Sent from the cassandra-u...@incubator.apache.org mailing list
> archive at Nabble.com. 
> 
> 




Re: Is that possible to write a file system over Cassandra?

2010-04-15 Thread Jeff Zhang
Jonathan,

Previously we used cassandra-0.6, but we'd like to leverage the hector
java client since it has more advanced features, and hector currently only
supports cassandra-0.5.
Why do you think using cassandra-0.5 is a strange way to do it? Is
cassandra-0.6 incompatible with cassandra-0.5? Will the migration to
cassandra-0.6 cost much?


On Thu, Apr 15, 2010 at 11:50 AM, Jonathan Ellis  wrote:

> You forked Cassandra 0.5 for that?
>
> That's... a strange way to do it.
>
> On Wed, Apr 14, 2010 at 9:36 PM, Jeff Zhang  wrote:
> > We are currently doing such things, and now we are still at the start
> stage.
> > Currently we only plan to store small files. For large files, splitting
> to
> > small blocks is really one of our options.
> > You can check out from here http://code.google.com/p/cassandra-fs/
> >
> > Documentation for this project is lacking for now, but we still welcome
> > any feedback and contribution.
> >
> >
> >
> > On Wed, Apr 14, 2010 at 7:32 PM, Miguel Verde 
> > wrote:
> >>
> >> On Wed, Apr 14, 2010 at 9:26 PM, Avinash Lakshman
> >>  wrote:
> >>>
> >>> OPP is not required here. You would be better off using a Random
> >>> partitioner because you want to get a random distribution of the
> metadata.
> >>
> >>
> >> Not required, certainly.  However, it strikes me that 1 cluster is
> better
> >> than 2, and most consumers of a filesystem would expect to be able to
> get an
> >> ordered listing or tree of the metadata which is easy using the OPP row
> key
> >> pattern listed previously.  You could still do this with the Random
> >> partitioner using column names in rows to describe the structure but the
> >> current compaction limitations could be an issue if a branch becomes too
> >> large, and you'd still have a root row hotspot (at least in the schema
> which
> >> comes to mind).
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>



-- 
Best Regards

Jeff Zhang


Is it possible to get all records in a CF?

2010-04-15 Thread Jared Laprise
If you do not have the key for a SuperColumn in a ColumnFamily, is it not
possible to browse all the data in the ColumnFamily? Thus far I've only been
able to find a way to pull out data if I know the key.



Re: Is it possible to get all records in a CF?

2010-04-15 Thread Gary Dusbabek
You'll have to scan the CF.  If you're using
OrderPreservingPartitioner please see 'get_range_slices'
(http://wiki.apache.org/cassandra/API).  It would help if you had an
idea of where the key might be, so you would know where to start
scanning.

Gary.

On Thu, Apr 15, 2010 at 21:01, Jared Laprise  wrote:
> If you do not have the key for SuperColumn in a ColumnFamily is it not
> possible to browse all the data in the ColumnFamily? Thus far I’ve only been
> able to find a way to pull out data if I know the key.
>
>
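
A hedged sketch of such a scan on 0.6 with the Python Thrift bindings (the
keyspace and column family names are placeholders). With RandomPartitioner
this still enumerates every row, but in token order rather than any
meaningful key order:

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from cassandra import Cassandra
from cassandra.ttypes import (ColumnParent, SlicePredicate, SliceRange,
                              KeyRange, ConsistencyLevel)

socket = TSocket.TSocket('localhost', 9160)
transport = TTransport.TBufferedTransport(socket)
client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()

predicate = SlicePredicate(slice_range=SliceRange(
    start='', finish='', reversed=False, count=100))
start = ''
while True:
    key_range = KeyRange(start_key=start, end_key='', count=100)
    rows = client.get_range_slices('Keyspace1',
                                   ColumnParent(column_family='Super1'),
                                   predicate, key_range,
                                   ConsistencyLevel.ONE)
    for row in rows:
        print(row.key)
    if len(rows) < 100:
        break
    # The next page starts at the last key seen; it will be returned
    # again, so real code should skip the duplicate first row.
    start = rows[-1].key
transport.close()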


Re: frequent "unknown result" errors

2010-04-15 Thread Michael Pearson
Lee, I dropped (official) 0.5 support from Pandra yesterday and
committed 0.6 Thrift files, if you're still considering that
upgrade... worth a shot imo.

-michael

On Tue, Apr 13, 2010 at 7:19 AM, Lee Parker  wrote:
> So, it didn't get rid of the problem, i'm still getting the errors.  The
> only thing I can think of now is to upgrade to 0.6, but I would prefer to
> stay with the current stable release.  I have regenerated the thrift code
> for 0.5.0 and there is no difference between those files and the ones i'm
> using in my software now.  Are there any other suggestions?  What code would
> be helpful to see?
> Lee
>
> On Mon, Apr 12, 2010 at 1:17 PM, Keith Thornhill  wrote:
>>
>> i also noticed "unknown result" errors when my php thrift code was
>> generated using a different version of thrift than cassandra uses.
>>
>> after regenerating my php code from thrift-r917130 (for
>> cassandra-0.6.0-rc1), the errors stopped.
>>
>> -keith
>>
>> On Mon, Apr 12, 2010 at 9:40 AM, vineet daniel 
>> wrote:
>> > can you post the code
>> >
>> > On Mon, Apr 12, 2010 at 9:22 PM, Lee Parker 
>> > wrote:
>> >>
>> >> According to his docs, he says you need Cassandra >= 0.5.0.  I guess it
>> >> is
>> >> possible that the included thrift files are targeted at 0.6, but I
>> >> don't see
>> >> the "batch_mutate" method which is part of 0.6.  So I'm assuming that
>> >> it
>> >> should work fine with 0.5.0.
>> >> I have now changed some of those entries in the configs and I have not
>> >> seen the error in a while.  So, it may have simply been that I was
>> >> trying to
>> >> do a query which was too large for the configured buffer to handle.
>> >> For the time being, I would like to stick with 0.5 as it is the
>> >> "stable"
>> >> release and we are running this in a production environment.
>> >>
>> >> Lee Parker
>> >> On Mon, Apr 12, 2010 at 10:45 AM, Jonathan Ellis 
>> >> wrote:
>> >>>
>> >>> Pandra is probably targetting 0.6.
>> >>>
>> >>> If you're just starting, there's no reason for you not to use 0.6 over
>> >>> 0.5 now.
>> >>>
>> >>> On Mon, Apr 12, 2010 at 10:42 AM, Lee Parker 
>> >>> wrote:
>> >>> > I'm using the thrift client which is packaged with Pandra and my
>> >>> > cassandra
>> >>> > version is 0.5.0 which is in the debian packages.  How can i tell
>> >>> > which
>> >>> > version of Thrift i'm using?
>> >>> > Lee
>> >>> >
>> >>> > On Mon, Apr 12, 2010 at 10:30 AM, Jonathan Ellis 
>> >>> > wrote:
>> >>> >>
>> >>> >> Then you're probably using a client incompatible with the server
>> >>> >> version you're using.
>> >>> >>
>> >>> >> On Mon, Apr 12, 2010 at 10:24 AM, Lee Parker 
>> >>> >> wrote:
>> >>> >> > If the connections are being made by individual PHP processes
>> >>> >> > running
>> >>> >> > from
>> >>> >> > the command line, they shouldn't be using the same connection.
>> >>> >> >  Should
>> >>> >> > my
>> >>> >> > code close the connections after each query and open a new one?
>> >>> >> > Here is the flow of what is happening when we get the error:
>> >>> >> > 1. Get a set of items from remote API
>> >>> >> > 2. Insert all of the items into the items CF. (usually anywhere
>> >>> >> > from
>> >>> >> > 2 -
>> >>> >> > 200
>> >>> >> > items)
>> >>> >> > 3. Query the correct index for all entries within a particular
>> >>> >> > time
>> >>> >> > frame
>> >>> >> > (which is determined by the timeframe of the results of step 1)
>> >>> >> > 4. Compare keys in index to keys of items inserted in step 2.
>> >>> >> > 5. Insert new index columns for items which aren't already in the
>> >>> >> > index.
>> >>> >> > I am getting the "unknown result" error during step 3.
>> >>> >> > Lee
>> >>> >> >
>> >>> >> > On Mon, Apr 12, 2010 at 10:05 AM, Jonathan Ellis
>> >>> >> > 
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> unknown result means thrift is badly confused.  You will get
>> >>> >> >> this
>> >>> >> >> when
>> >>> >> >> using the same thrift connection from multiple threads, for
>> >>> >> >> instance.
>> >>> >> >>
>> >>> >> >> On Mon, Apr 12, 2010 at 10:02 AM, Lee Parker
>> >>> >> >> 
>> >>> >> >> wrote:
>> >>> >> >> > I am a newbie with Cassandra.  We are currently migrating a
>> >>> >> >> > large
>> >>> >> >> > amount
>> >>> >> >> > of
>> >>> >> >> > data out of MySQL into Cassandra.  I have two ColumnFamilies.
>> >>> >> >> >  One
>> >>> >> >> > contains
>> >>> >> >> > one row per item and each item has roughly 12 columns.  These
>> >>> >> >> > are
>> >>> >> >> > items
>> >>> >> >> > from
>> >>> >> >> > REST APIs like the Twitter API.  Then I have a second
>> >>> >> >> > ColumnFamily
>> >>> >> >> > with
>> >>> >> >> > very
>> >>> >> >> > large rows and TimeUUID column names which contain the key of
>> >>> >> >> > the
>> >>> >> >> > items
>> >>> >> >> > in
>> >>> >> >> > the other ColumnFamily.  So one ColumnFamily has lots of rows
>> >>> >> >> > with a
>> >>> >> >> > low
>> >>> >> >> > number of columns per row, and the other has relatively few
>> >>> >> >> > rows
>> >>> >> >> > with
>> >>> >> >> > a
>> >>> >> >> 

Re: json2sstable

2010-04-15 Thread 孔令华
I tried that and found that it cannot handle large files at present.
But you can write your own tool along the same lines,
e.g. first sort your data file by each key's hash token, then write
to an SSTable directly.

On Fri, Apr 16, 2010 at 4:47 AM, Lee Parker  wrote:

> Has anyone used json2sstable to migrate a large amount of data into
> cassandra?  What was your methodology?  I assume that this will be much
> faster than stepping through my data and doing writes via PHP/Thrift.
>
> Lee Parker
>


Re: json2sstable

2010-04-15 Thread Brandon Williams
On Thu, Apr 15, 2010 at 3:47 PM, Lee Parker  wrote:

> Has anyone used json2sstable to migrate a large amount of data into
> cassandra?  What was your methodology?  I assume that this will be much
> faster than stepping through my data and doing writes via PHP/Thrift.


If you're looking to do a bulk import, peek at contrib/bmt_example.

-Brandon


Re: Is that possible to write a file system over Cassandra?

2010-04-15 Thread Nathan McCall
In regards to hector, please check all the available branches on
github. We have supported 0.6 for a little while now.

http://github.com/rantav/hector/tree/0.6.0

The master is still based on 0.5, but that is changing in the next
couple of days to match the 0.6 release.

-Nate




On Thu, Apr 15, 2010 at 6:35 PM, Jeff Zhang  wrote:
> Jonathan,
>
> Previously we used cassandra-0.6, but we'd like to leverage the hector
> java client since it has more advanced features, and hector currently only
> supports cassandra-0.5.
> Why do you think using cassandra-0.5 is a strange way to do it? Is
> cassandra-0.6 incompatible with cassandra-0.5? Will the migration to
> cassandra-0.6 cost much?
>
>
> On Thu, Apr 15, 2010 at 11:50 AM, Jonathan Ellis  wrote:
>>
>> You forked Cassandra 0.5 for that?
>>
>> That's... a strange way to do it.
>>
>> On Wed, Apr 14, 2010 at 9:36 PM, Jeff Zhang  wrote:
>> > We are currently doing such things, and now we are still at the start
>> > stage.
>> > Currently we only plan to store small files. For large files, splitting
>> > to
>> > small blocks is really one of our options.
>> > You can check out from here http://code.google.com/p/cassandra-fs/
>> >
>> > Documentation for this project is lacking for now, but we still
>> > welcome any feedback and contribution.
>> >
>> >
>> >
>> > On Wed, Apr 14, 2010 at 7:32 PM, Miguel Verde 
>> > wrote:
>> >>
>> >> On Wed, Apr 14, 2010 at 9:26 PM, Avinash Lakshman
>> >>  wrote:
>> >>>
>> >>> OPP is not required here. You would be better off using a Random
>> >>> partitioner because you want to get a random distribution of the
>> >>> metadata.
>> >>
>> >>
>> >> Not required, certainly.  However, it strikes me that 1 cluster is
>> >> better
>> >> than 2, and most consumers of a filesystem would expect to be able to
>> >> get an
>> >> ordered listing or tree of the metadata which is easy using the OPP row
>> >> key
>> >> pattern listed previously.  You could still do this with the Random
>> >> partitioner using column names in rows to describe the structure but
>> >> the
>> >> current compaction limitations could be an issue if a branch becomes
>> >> too
>> >> large, and you'd still have a root row hotspot (at least in the schema
>> >> which
>> >> comes to mind).
>> >
>> >
>> > --
>> > Best Regards
>> >
>> > Jeff Zhang
>> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Is that possible to write a file system over Cassandra?

2010-04-15 Thread Jeff Zhang
Thanks, Nathan.



On Fri, Apr 16, 2010 at 12:04 PM, Nathan McCall wrote:

> In regards to hector, please check all the available branches on
> github. We have supported 0.6 for a little while now.
>
> http://github.com/rantav/hector/tree/0.6.0
>
> The master is still based on 0.5, but that is changing in the next
> couple of days to match the 0.6 release.
>
> -Nate
>
>
>
>
> On Thu, Apr 15, 2010 at 6:35 PM, Jeff Zhang  wrote:
> > Jonathan,
> >
> > Previously we used cassandra-0.6, but we'd like to leverage the hector
> > java client since it has more advanced features, and hector currently
> > only supports cassandra-0.5.
> > Why do you think using cassandra-0.5 is a strange way to do it? Is
> > cassandra-0.6 incompatible with cassandra-0.5? Will the migration to
> > cassandra-0.6 cost much?
> >
> >
> > On Thu, Apr 15, 2010 at 11:50 AM, Jonathan Ellis 
> wrote:
> >>
> >> You forked Cassandra 0.5 for that?
> >>
> >> That's... a strange way to do it.
> >>
> >> On Wed, Apr 14, 2010 at 9:36 PM, Jeff Zhang  wrote:
> >> > We are currently doing such things, and now we are still at the start
> >> > stage.
> >> > Currently we only plan to store small files. For large files,
> splitting
> >> > to
> >> > small blocks is really one of our options.
> >> > You can check out from here http://code.google.com/p/cassandra-fs/
> >> >
> >> > Documentation for this project is lacking for now, but we still
> >> > welcome any feedback and contribution.
> >> >
> >> >
> >> >
> >> > On Wed, Apr 14, 2010 at 7:32 PM, Miguel Verde <
> miguelitov...@gmail.com>
> >> > wrote:
> >> >>
> >> >> On Wed, Apr 14, 2010 at 9:26 PM, Avinash Lakshman
> >> >>  wrote:
> >> >>>
> >> >>> OPP is not required here. You would be better off using a Random
> >> >>> partitioner because you want to get a random distribution of the
> >> >>> metadata.
> >> >>
> >> >>
> >> >> Not required, certainly.  However, it strikes me that 1 cluster is
> >> >> better
> >> >> than 2, and most consumers of a filesystem would expect to be able to
> >> >> get an
> >> >> ordered listing or tree of the metadata which is easy using the OPP
> row
> >> >> key
> >> >> pattern listed previously.  You could still do this with the Random
> >> >> partitioner using column names in rows to describe the structure but
> >> >> the
> >> >> current compaction limitations could be an issue if a branch becomes
> >> >> too
> >> >> large, and you'd still have a root row hotspot (at least in the
> schema
> >> >> which
> >> >> comes to mind).
> >> >
> >> >
> >> > --
> >> > Best Regards
> >> >
> >> > Jeff Zhang
> >> >
> >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>



-- 
Best Regards

Jeff Zhang


Regarding Cassandra Scalability

2010-04-15 Thread Linton N
Hi,
 I have been working with Hadoop for the past year, but am quite new to
Cassandra. I would like to clarify a few things regarding the
scalability of Cassandra. Can it scale up to TBs of data?

Please provide me some links regarding this.


-- 
--
With Love
 Lin N