> The first time I ran a single instance of Cassandra and my application on one
> system (16 GB RAM and 8 cores), the time taken was 480 sec.
> When I added one more system (meaning this time I was running 2 instances
> of Cassandra in a cluster) and ran the application from a single client, I
> found time taken i
> $ iostat
As rcoli already mentioned, you don't seem to have an I/O problem, but
as a point of general recommendation: When determining whether you are
blocking on disk I/O, pretty much *always* use "iostat -x" rather than
the much less useful default mode of iostat. The %util and queue
wait/avera
Thanks a lot for the info
Sebastien
On 2 February 2011 16:53, Jonathan Ellis wrote:
> On Wed, Feb 2, 2011 at 7:37 AM, Sébastien Druon
> wrote:
> > Hi!
> > I would like to know if secondary indexes are foreseen for super columns
> /
> > columns inside of super columns?
>
> No.
>
> > If yes, wil
>
> Thanks. Yes I know it's by no means trivial. I thought in case there was an
> index on the column on which I want to place condition, the index machinery
> itself can do the counting (i.e. when the index is updated, the counter is
> incremented). It doesn't seem too orthogonal to the current im
Hi Peter,
Thanks for your reply.
Our application is multi-threaded. We are using an 8-core machine. In our
application we are using 4 column families, one of which
contains rows that are huge relative to the rows in the other
column families.
In the ring the balance i
The affected versions are listed as 0.6.10 and 0.7.1; it affects get_range_slice
at quorum
https://issues.apache.org/jira/browse/CASSANDRA-2094 impacts 0.7.1 and will
break QUORUM reads where RF > 3 for get_slice()
AFAIK it's not in 0.7 , and 0.7.1 is not released yet.
Aaron
On 3/02/201
It's in the src distro http://cassandra.apache.org/download/
Aaron
On 3/02/2011, at 12:27 PM, buddhasystem wrote:
>
> Never mind, I found it in SVN...
> (not in gz)
>
> Thanks.
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Slow-net
Thanks Tyler!
On Thu, Feb 3, 2011 at 12:06 PM, Tyler Hobbs wrote:
> On Wed, Feb 2, 2011 at 3:27 PM, Aditya Narayan wrote:
>>
>> Can I have some more feedback about my schema perhaps somewhat more
>> critical/harsh?
>
> It sounds reasonable to me.
>
> Since you're writing/reading all of the
This page has a guide to setting the initial tokens for the nodes
http://wiki.apache.org/cassandra/Operations#Ring_management
You can also use the bin/nodetool cfstats command or JConsole to check the
maximum row size in each node, to see if you have a monster row.
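
As a rough illustration of the token calculation that guide describes, here is a minimal Python sketch. It assumes the RandomPartitioner with a token space of 0 to 2**127; `balanced_tokens` is a hypothetical helper name, not part of any Cassandra tool.

```python
def balanced_tokens(node_count):
    """Return evenly spaced initial tokens, one per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 127  # assumed RandomPartitioner token space
    return [i * ring_size // node_count for i in range(node_count)]


# Print one initial token per node for a hypothetical 4-node ring.
for index, token in enumerate(balanced_tokens(4)):
    print(f"node {index}: initial_token = {token}")
```

Each node would get one of these values as its initial token, so that every node owns an equal share of the ring.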
Aaron
On 3/02/2011, at 10:22
Hey all,
I want to store some columns that are reminders to the users on my
application, in time sorted order in a row(timeline row of the user).
Would it be recommended to store these reminder columns in the
timeline row with column names like: combination of timestamp(of time
when the reminder
Is there any advantage to using supercolumns
(columnFamilyName[superColumnName[columnName[val]]]) instead of regular
columns with concatenated keys
(columnFamilyName[superColumnName@columnName[val]])?
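
To make the two layouts concrete, here is a small sketch that models a row as plain Python dictionaries. It is only an in-memory illustration, not Cassandra client code; the row contents and the helper names `flatten` and `columns_under` are made up for the example.

```python
SEPARATOR = "@"  # must never appear inside a supercolumn name


def flatten(super_row):
    """Turn {superColumn: {column: value}} into {superColumn@column: value}."""
    return {
        f"{super_name}{SEPARATOR}{column_name}": value
        for super_name, columns in super_row.items()
        for column_name, value in columns.items()
    }


def columns_under(flat_row, super_name):
    """Emulate reading one supercolumn by scanning a sorted name prefix."""
    prefix = super_name + SEPARATOR
    return {
        name[len(prefix):]: value
        for name, value in sorted(flat_row.items())
        if name.startswith(prefix)
    }


# Hypothetical row in the supercolumn layout.
super_row = {
    "home": {"city": "Paris", "zip": "75001"},
    "work": {"city": "Lyon", "zip": "69001"},
}
flat_row = flatten(super_row)
print(flat_row)
print(columns_under(flat_row, "home"))
```

With sorted column names, the prefix scan in `columns_under` plays the role that a column slice would play against the concatenated-key layout.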
When I designed my data model, I used supercolumns wherever I needed two
levels of key depth - j
> Is there any advantage to using supercolumns
> (columnFamilyName[superColumnName[columnName[val]]]) instead of regular
> columns with concatenated keys
> (columnFamilyName[superColumnName@columnName[val]])?
>
> When I designed my data model, I used supercolumns wherever I needed two
> levels of k
On Thu, Feb 3, 2011 at 11:27 AM, Aditya Narayan wrote:
> Hey all,
>
> I want to store some columns that are reminders to the users on my
> application, in time sorted order in a row(timeline row of the user).
>
> Would it be recommended to store these reminder columns in the
> timeline row with c
If I use : : :
as key pattern for the rows of reminders, then I am storing the key,
just as it is, as the column name and thus column values need not
contain a link to the row containing the reminder details.
I think UserId would be required along with timestamp in the key
pattern to provide un
Thanks Sylvain!
Can I vote for internally implementing supercolumn families as regular
column families? (With a smooth upgrade process that doesn't require
shutting down a live cluster.)
What if supercolumn families were supported as regular column families + an
index (on what used to be supercol
Hi
Would anyone recommend using Cassandra for storing hundreds of thousands
of documents in Word/PDF format? The manual says it can store documents
under 64MB with no issue but was wondering if anyone is using it for
this specific purpose. Would it be efficient/reliable and is there
anything I
If I use : | |
as key pattern for the rows of reminders, then I am storing the key,
just as it is, as the column name and thus column values need not
contain a link to the row containing the reminder details.
I think UserId would be required along with timestamp in the key
pattern to provide un
On Thu, Feb 3, 2011 at 1:33 PM, David Boxenhorn wrote:
> Thanks Sylvain!
>
> Can I vote for internally implementing supercolumn families as regular
> column families? (With a smooth upgrade process that doesn't require
> shutting down a live cluster.)
>
I forgot to add that I don't know if this
The advantage would be to enable secondary indexes on supercolumn families.
I understand from this thread that indexes on supercolumn families are not
going to be:
http://www.mail-archive.com/user@cassandra.apache.org/msg09527.html
Which, it seems to me, effectively deprecates supercolumn famil
Hi all,
Just for info, in apache-cassandra-0.6.11-bin.tar.gz there are both
apache-cassandra-0.6.10.jar and apache-cassandra-0.6.11.jar in the
lib directory.
This causes trouble for my upgrade scripts, which use this file to get the
installed version and check whether an upgrade is needed. :(
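
One way a version check like this could tolerate a stray older jar is to parse every matching file name and keep the highest version. This is only a sketch: the jar naming pattern is taken from the file names above, and `installed_version` is a hypothetical helper, not part of any existing upgrade script.

```python
import re

# Matches names such as apache-cassandra-0.6.11.jar
JAR_PATTERN = re.compile(r"^apache-cassandra-(\d+(?:\.\d+)*)\.jar$")


def installed_version(jar_names):
    """Return the highest Cassandra version found among jar_names, or None."""
    versions = []
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:
            # Compare numerically so that 0.6.10 sorts after 0.6.9.
            versions.append(tuple(int(part) for part in match.group(1).split(".")))
    if not versions:
        return None
    return ".".join(str(part) for part in max(versions))


lib_contents = [
    "apache-cassandra-0.6.10.jar",
    "apache-cassandra-0.6.11.jar",
    "log4j-1.2.14.jar",  # hypothetical unrelated jar
]
print(installed_version(lib_contents))
```

In a real script the list would come from something like `os.listdir` on the lib directory.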
Thanks for the g
On 02/02/2011 01:41 PM, Ryan King wrote:
> On Wed, Feb 2, 2011 at 10:40 AM, Chris Burroughs
> wrote:
>> I'm using 0.7.0 and experimenting with the new mx4j support.
>>
>> http://host:port/mbean?objectname=org.apache.cassandra.request%3Atype%3DReadStage
>>
>> Returns a nice pretty html page. For p
Well, that's odd. :)
Do any of the other tar.gz balls contain multiple jars?
On Thu, Feb 3, 2011 at 6:06 AM, Jean-Yves LEBLEU wrote:
> Hi all,
>
> Just for info, in apache-cassandra-0.6.11-bin.tar.gz there are both
> apache-cassandra-0.6.10.jar and apache-cassandra-0.6.11.jar in the
> lib direc
Don't know, I only checked
http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.6.11/apache-cassandra-0.6.11-bin.tar.gz
Rgds.
JY
On Thu, Feb 3, 2011 at 3:36 PM, Jonathan Ellis wrote:
> Well, that's odd. :)
>
> Do any of the other tar.gz balls contain multiple jars?
>
> On Thu, Feb 3, 2011 at 6:0
On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn wrote:
> The advantage would be to enable secondary indexes on supercolumn families.
>
Then I suggest opening a ticket for adding secondary indexes to supercolumn
families and voting on it. This will be 1 or 2 orders of magnitude less work
than gett
On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne wrote:
> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn wrote:
>>
>> The advantage would be to enable secondary indexes on supercolumn
>> families.
>
> Then I suggest opening a ticket for adding secondary indexes to supercolumn
> families and voti
CouchDB
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Using-Cassandra-to-store-files-tp5988698p5989122.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.
No idea. Is there an mx4j list you could try maybe? :)
On Wed, Feb 2, 2011 at 10:40 AM, Chris Burroughs
wrote:
> I'm using 0.7.0 and experimenting with the new mx4j support.
>
> http://host:port/mbean?objectname=org.apache.cassandra.request%3Atype%3DReadStage
>
> Returns a nice pretty html page.
Well, I am an "actual active developer" and I have "managed to do pretty
nice stuffs with Cassandra" - without secondary indexes so far. But I'm
looking forward to having secondary indexes in my arsenal when new
functional requirements come up, and I'm bummed out that my early design
decision to us
>
>
> CouchDB
>
That's not what document-oriented means! (har har)
I don't know all the details of your case, but with serving static files I
suspect you could do ok with something that has a much smaller memory/cpu
footprint as you won't have as great of write throughput / read latency
concerns.
Dear Brendan,
I would be really interested in your findings too. I need a system to store
various documents, and I am thinking of Cassandra (which I am already using),
a second type of database, or some other system. Maybe, as Dan
suggested, MogileFS.
Thank you,
Victor Kabdebon
http://www
Jonathan pointed out in another thread that it looks like I'm running
into CASSANDRA-2059, where secondary files are not being properly
deleted. My production data set at any given time is less than 100 MB
in size, but the Cassandra data directories on each instance are using
30 to 40 times as much
On Thu, Feb 3, 2011 at 7:45 AM, Omer van der Horst Jansen
wrote:
> In the meantime, is it safe to manually delete stale files while
> Cassandra is running? And how do I determine when a set of files is
> stale?
>
> I'd assume that a given set of files is deletable if there is no
> -Data.db file a
Try adding this to the end of the URL: ?template=identity
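
A small sketch of building such a URL in Python follows. Since the mbean URL in this thread already carries an `objectname` query, the sketch assumes the template is added as a second query parameter joined with `&`; the host and port are placeholders and `mbean_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def mbean_url(host, port, object_name, template="identity"):
    """Build an mx4j mbean URL that includes a template parameter."""
    query = urlencode({"objectname": object_name, "template": template})
    return f"http://{host}:{port}/mbean?{query}"


# Placeholder host and port; the object name is the one from this thread.
print(mbean_url("localhost", 8081, "org.apache.cassandra.request:type=ReadStage"))
```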
On Thu, Feb 3, 2011 at 4:23 PM, Chris Burroughs
wrote:
> On 02/02/2011 01:41 PM, Ryan King wrote:
> > On Wed, Feb 2, 2011 at 10:40 AM, Chris Burroughs
> > wrote:
> >> I'm using 0.7.0 and experimenting with the new mx4j support.
> >>
> >>
On 02/03/2011 11:29 AM, Ran Tavory wrote:
> Try adding this to the end of the URL: ?template=identity
>
That works, thanks!
On Wed, 2011-02-02 at 21:04 +0200, Janne Jalkanen wrote:
> How about adding an autosignature with unsubscription info?
I might be overly cynical, but I'd wager that would serve no purpose
other than the comical value of seeing it appended to these unsubscribe
messages.
> /Janne
>
> On Feb 2, 201
> The correct way to accomplish what you describe is the new (in 0.7)
> per-column TTL. Simply set this to 60 * 60 * 24 * 90 (90 days' worth of
> seconds) and your columns will magically disappear after that length of
> time.
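
For reference, the arithmetic in the quoted suggestion can be checked with a few lines of Python. `ttl_seconds` is a hypothetical helper; how the value is passed to a client library is not shown here.

```python
from datetime import timedelta


def ttl_seconds(days):
    """Convert a retention window in days to a TTL in seconds."""
    return int(timedelta(days=days).total_seconds())


# 90 days expressed in seconds, the same value as 60 * 60 * 24 * 90.
print(ttl_seconds(90))
```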
Although that assumes it's okay to lose data or that there is some
oth
The data provided is also an average value since boot time. Run the -x as
suggested below but run it with an interval of around 5 seconds. You very well
could be having an I/O issue; it is hard to tell from the overall average value
you provided. Collect "iostat -x 5" during the times when you see slow r
Hundreds of thousands doesn't sound too bad. Good old NFS would do with an ok
directory structure.
We are doing this. Our documents are pretty small though (a few kb). We have
around 40M right now with around 300GB total.
Generally the problem is that much data usually means that cassandra beco
On Thu, Feb 3, 2011 at 6:49 AM, Jonathan Ellis wrote:
> On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne wrote:
>> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn wrote:
>>>
>>> The advantage would be to enable secondary indexes on supercolumn
>>> families.
>>
>> Then I suggest opening a ticket
Hi all,
To generate new keys/ UserIds for new users on my application, I am
thinking of using a simple synchronized counter that can keep track of
the no. of users registered on my application and when a new user
signs up, he can be allotted the next available id.
Since Cassandra is eventually co
Unless you need your user identifiers to be sequential for some reason, I would
save yourself the headache of this kind of complexity and just use UUIDs if you
have to generate an identifier.
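
A minimal sketch of that approach with Python's standard `uuid` module is shown below. `new_user_id` is a hypothetical helper, and random (version 4) UUIDs are just one possible choice.

```python
import uuid


def new_user_id():
    """Return a random (version 4) UUID to use as a user identifier."""
    return uuid.uuid4()


user_id = new_user_id()
print(user_id)             # canonical text form
print(len(user_id.bytes))  # size of the raw form in bytes
```

No coordination between nodes or clients is needed to generate these, which is the main point of the suggestion.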
On Feb 3, 2011, at 2:03 PM, Aklin_81 wrote:
> Hi all,
> To generate new keys/ UserIds for new users on
Are you using virtual machines to run Cassandra? I've found that performance
in VMs is crap
Nicolas Santini
On Thu, Feb 3, 2011 at 11:17 PM, aaron morton wrote:
> This page has a guide to setting the initial tokens for the nodes
> http://wiki.apache.org/cassandra/Operations#Ring_management
>
> <
You could also consider snowflake:
http://github.com/twitter/snowflake
which gives you ids that roughly sort by time (but aren't sequential).
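
To illustrate the general idea of ids that roughly sort by time, here is a simplified Python sketch. It is not the snowflake code itself: the bit widths, the custom epoch, and the class name `TimeSortedIdGenerator` are assumptions chosen for the example.

```python
import threading
import time


class TimeSortedIdGenerator:
    """Builds ids from timestamp bits, then worker bits, then a sequence."""

    WORKER_BITS = 10    # assumed width
    SEQUENCE_BITS = 12  # assumed width

    def __init__(self, worker_id, epoch_ms=1293840000000, clock=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms  # hypothetical custom epoch: 2011-01-01 UTC
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            # Never go backwards, even if the wall clock does.
            now = max(self._clock(), self._last_ms)
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) % (1 << self.SEQUENCE_BITS)
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.epoch_ms
            return (
                (timestamp << (self.WORKER_BITS + self.SEQUENCE_BITS))
                | (self.worker_id << self.SEQUENCE_BITS)
                | self._sequence
            )


generator = TimeSortedIdGenerator(worker_id=1)
first, second = generator.next_id(), generator.next_id()
print(first, second)
```

Ids from one generator always increase, while ids from different workers only sort approximately by time, which matches the "roughly sort by time" description.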
-ryan
On Thu, Feb 3, 2011 at 11:13 AM, Matthew E. Kennedy
wrote:
> Unless you need your user identifiers to be sequential for some reason, I
> would sa
the pdf at the design doc
https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf
does say so:
page 2 "- strongly consistent read: requires consistency level ALL.
(QUORUM is insufficient.)
"
but the wiki http://wiki.apache.org/cassandra/Counters
gave a code exa
From the architecture section of the wiki. And it makes sense!
More specifically: R = read replica count, W = write replica count,
N = replication factor, Q = QUORUM (Q = N / 2 + 1)
- If W + R > N, you will have consistency:
  - W=1, R=N
  - W=N, R=1
  - W=Q, R=Q where Q = N / 2 + 1
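
The rule above can be checked mechanically. The sketch below assumes integer division in Q = N / 2 + 1; `quorum` and `is_consistent` are hypothetical helper names.

```python
def quorum(n):
    """Quorum size for replication factor n, using integer division."""
    return n // 2 + 1


def is_consistent(w, r, n):
    """True when every read set must overlap every write set (W + R > N)."""
    return w + r > n


n = 3
q = quorum(n)
for label, w, r in [("W=1, R=N", 1, n), ("W=N, R=1", n, 1), ("W=Q, R=Q", q, q)]:
    print(label, is_consistent(w, r, n))
```

By contrast, W=1 with R=1 at N=3 gives W + R = 2, which does not exceed N, so a read may miss the latest write.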
On Thu, Feb 3,
Thanks for the response, but unfortunately a TTL is not enough for us. We would
like to be able to dynamically control the window in case there is an unusually
large amount of data or something so we don't run out of disk space.
One question I have in particular is: if I use the timestamp of my
On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne wrote:
> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn wrote:
>
>> The advantage would be to enable secondary indexes on supercolumn
>> families.
>>
>
> Then I suggest opening a ticket for adding secondary indexes to supercolumn
> families and vo
To be a little more clear, a simplified version of what I'm asking is:
Let's say you add 1K columns with timestamps 1 to 1000. Then, at an arbitrarily
distant point in the future, if you call remove on that CF with timestamp 500
(so the timestamps are logically out of order), will it delete exac
Hi guys,
I was playing around with the stress.py test this week and noticed a few
things.
1) Progress-interval does not always work correctly. I set it to 5 in the
example below, but am instead getting varying intervals:
*techlabs@cassandraN1:~/apache-cassandra-0.7.0-src/contrib/py_stress$ pytho
On Thu, Feb 3, 2011 at 7:02 PM, Sameer Farooqui wrote:
> Hi guys,
>
> I was playing around with the stress.py test this week and noticed a few
> things.
>
> 1) Progress-interval does not always work correctly. I set it to 5 in the
> example below, but am instead getting varying intervals:
>
Gener
On Thu, Feb 3, 2011 at 3:35 PM, Mike Malone wrote:
> It seems to me that super columns are a historical artifact from Cassandra's
> early life as Facebook's inbox storage system. They needed posting lists of
> messages, sharded by user. So that's what they built. In my dealings with
> the Cassandr
On Thu, Feb 3, 2011 at 3:59 PM, Jeffrey Wang wrote:
> To be a little more clear, a simplified version of what I'm asking is:
>
> Let's say you add 1K columns with timestamps 1 to 1000. Then, at an
> arbitrarily distant point in the future, if you call remove on that CF with
> timestamp 500 (so t
Unsubscribe, please.
On Feb 2, 2011, at 4:27 PM, buddhasystem wrote:
>
> Never mind, I found it in SVN...
> (not in gz)
>
> Thanks.
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Slow-network-writes-tp5985757p5986949.html
> Sent fro
Hello
Why can I get an UnavailableException on a live cluster (all nodes are up and
were never shut down)?
PS: v 0.7.0
Dude, are you asking me to unsubscribe?
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Slow-network-writes-tp5985757p5991488.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.
I think that was originally a voice command - for whoever happened to hear
it first :-)
On Fri, Feb 4, 2011 at 9:57 AM, buddhasystem wrote:
>
> Dude, are you asking me to unsubscribe?
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Slow-n
Thanks Matthew & Ryan,
The main inspiration behind me trying to generate Ids in sequential
manner is to reduce the size of the userId, since I am using it for
heavy denormalization. UUIDs are 16 bytes long, but I can also have a
unique Id in just 4 bytes, and since this is just a one time process
Hi
I'll explain a bit. I'm working with Abhinav.
We've an application which was earlier based on Lucene which would
index a huge volume of data, and later use the indices to fetch data
and perform a fuzzy matching operation. We wanted to use Cassandra
primarily because of the sharding/availabilit