Thanks Maki :)
I copied the existing var folder to the new hard disk
and changed the path to the data directories in storage-config.xml.
I was able to connect to Cassandra and read the data that had been
moved to the new location.
On Fri, Mar 18, 2011 at 6:33 AM, Maki Watanabe
| data_file_directories makes it seem as though cassandra can use more than one
location for sstable storage. Does anyone know how it splits up the data
between partitions? I am trying to plan for just about every worst case
scenario I can right now, and I want to know if I can change the config
Refer to:
http://wiki.apache.org/cassandra/StorageConfiguration
You can specify the data directories with the following parameters in
storage-config.xml (or cassandra.yaml in 0.7+):
commitlog_directory : where the commitlog will be written
data_file_directories : data (sstable) files
saved_caches_directory : saved key/row caches
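For 0.7, a minimal cassandra.yaml sketch of those settings (the paths below are
only examples; data_file_directories takes a list, so more than one disk can be
named):

data_file_directories:
    - /var/lib/cassandra/data
    - /mnt/newdisk/cassandra/data    # example second location on the new disk
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches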
Also, when it comes to the RAID controller there are other options such as
write policy, read policy, and cached I/O vs. direct I/O. Is there any
preference for which policies should be chosen?
In our case:
http://support.dell.com/support/edocs/software/svradmin/1.9/en/stormgmt/cntrls.html
Where and how do I choose it?
Thanks Peter, I can see it better now.
> The reason for this is that you want to be able to saturate your
> storage subsystem, and that means keeping all spindles working at all
> times and efficiently. This is accomplished by ensuring you are able
> to sustain a sufficient queue depth (number of outstanding commands)
> on each device.
> Thanks to all for replying, but frankly I didn't get the answer I wanted.
> Does the "number of disks" apply to number of spindles in RAID0? Or
> something else like a separate disk for commitlog and for data?
The number of actual disks (spindles) in the device that your
sstables are on (not
> The comment in the example config file next to that setting explains it more
> fully, but something like 16 * number of drives is a reasonable setting for
> readers. Writers should be a multiple of the number of cores.
In addition, if you're running on Linux in a situation where you're
trying to
Of course! Why didn't I think of that? Thanks!!
On Mar 17, 2011, at 3:11 PM, Edward Capriolo wrote:
> On Thu, Mar 17, 2011 at 9:09 AM, Jonathan Colby
> wrote:
>> Hi -
>>
>> If a seed crashes (i.e., suddenly unavailable due to HW problem), what is
>> the best way to replace the seed in the c
Do people have success stories with 0.7.4? It seems like the list only
hears if there's a major problem with a release, which means that if you're
trying to judge the stability of a release you're looking for silence. But
maybe that means not many people have tried it yet. Is there a record of
t
At this moment Java hangs. Only one thread is working, and it runs mostly in
the OS kernel, with the following trace:
[pid 1953] 0.050157 futex(0x7fbe141ea428, FUTEX_WAKE_PRIVATE, 1) = 0 <0.22>
[pid 1953] 0.59 futex(0x7fbc24023794,
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {13002023
Hi Paul,
It's more of a scientific mining app. We crawl websites and extract
information from these websites for our clients. For us, it doesn't really
matter if one cassandra node replies after 1 second or a few ms, as long as
the throughput over time stays high. And so far, this seems to be the
As for the version,
we will wait a few more days, and if nothing really bad shows up, move to
0.7.4.
On Thu, Mar 17, 2011 at 10:40 PM, Thibaut Britz <
thibaut.br...@trendiction.com> wrote:
> Hi Paul,
>
> It's more of a scientific mining app. We crawl websites and extract
> information from thes
Good work.
Aaron
On 17/03/2011, at 4:37 PM, Jonathan Ellis wrote:
> Thanks for tracking that down, Roland. I've created
> https://issues.apache.org/jira/browse/CASSANDRA-2347 to fix this.
>
> On Wed, Mar 16, 2011 at 10:37 AM, Roland Gude
> wrote:
>> I have applied the suggested changes in m
Thanks for the response, sorry if my initial question wasn't clear.
When using thrift, I call
client.get_slice(keyBytes, columnParent, range, level)
I get a list of ColumnOrSuperColumns back. When I iterate over them and
call:
byte[] nameBytes = columnOrSuperColumn.getSuper_column().getNa
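For what it's worth, here is a minimal Java sketch of that call path, assuming
an open Cassandra.Client named client and that the third argument is a
SlicePredicate; the usual pitfall is reading the name ByteBuffer via array()
instead of honouring its position/limit:

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.List;
import org.apache.cassandra.thrift.*;

List<ColumnOrSuperColumn> slice =
        client.get_slice(keyBytes, columnParent, predicate, ConsistencyLevel.ONE);

for (ColumnOrSuperColumn cosc : slice)
{
    SuperColumn sc = cosc.getSuper_column();

    // Copy only the readable window of the buffer; calling array() directly can
    // return unrelated bytes from a shared backing array.
    ByteBuffer nameBuf = sc.name.duplicate();
    byte[] nameBytes = new byte[nameBuf.remaining()];
    nameBuf.get(nameBytes);

    String name = new String(nameBytes, Charset.forName("UTF-8"));
    System.out.println(name + " has " + sc.getColumnsSize() + " subcolumns");
}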
The comment in the example config file next to that setting explains it more
fully, but something like 16 * number of drives is a reasonable setting for
readers. Writers should be a multiple of the number of cores.
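As a concrete sketch (numbers invented for a box with 4 data drives and 8
cores), that advice translates into cassandra.yaml roughly as:

concurrent_reads: 64      # ~16 * number of drives holding sstables
concurrent_writes: 64     # a multiple of the core count, e.g. 8 * cores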
On Thu, Mar 17, 2011 at 1:09 PM, buddhasystem wrote:
> Hello, in the instructions
Depending on your memtable thresholds the heap may be too small for the
deployment. At the same time I don't see any other log statements around
that long pause that you have shown in the log snippet. It looks a little odd
to me. All the ParNew collections reclaimed almost the same amount of heap and did not take
lo
Hello, in the instructions I need to link "concurrent_reads" to the number of
drives. Is this related to the number of physical drives that I have in my
RAID0, or to something else?
From OrderPreservingPartitioner.java:
public StringToken getToken(ByteBuffer key)
{
    String skey;
    try
    {
        skey = ByteBufferUtil.string(key, Charsets.UTF_8);
    }
    catch (CharacterCodingException e)
    {
        // a key that is not valid UTF-8 cannot be turned into an OPP token
        throw new RuntimeException("The provided key was not UTF8 encoded.", e);
    }
    return new StringToken(skey);
}
Cool - let me know if you have any questions if you do. I'm @jeromatron in irc
and on twitter.
On Mar 17, 2011, at 1:10 PM, Ethan Rowe wrote:
> Thanks, Jeremy. I looked over the work that was done and it seemed like it
> was mostly there, though some comments in the ticket indicated possible
Thanks, Jeremy. I looked over the work that was done and it seemed like it
was mostly there, though some comments in the ticket indicated possible
problems.
I may well need to take a crack at this sometime in the next few weeks, but
if somebody beats me to it, I certainly won't complain.
On Thu,
I started it and added the tentative patch at the end of October. It needs to
be rebased with the current 0.7-branch and completed - it's mostly there. I
just tried to abstract some things in the process.
I have changed jobs since then and I just haven't had time with the things I've
been doi
Can anyone please give their comments on this?
On 03/17/2011 07:02 PM, Ali Ahsan wrote:
Dear Aaron,
We are a little confused about the OPP token. How do we calculate an OPP
token? A few of our column families have a UUID as the key and others have an
integer as the key.
I wonder what is the right way to configure replication in a Cassandra cluster.
I need to have 3 copies of my data in a cluster consisting of 6 nodes.
3 of these nodes are in one datacenter - let's call it DC1 - and 3 in
another, DC2. There is a significant latency between these datacenters
and orig
Hello.
What's the current thinking on input support for Hadoop streaming? It seems
like the relevant Jira issue has been quiet for some time:
https://issues.apache.org/jira/browse/CASSANDRA-1497
Thanks.
- Ethan
On 3/17/2011 1:06 PM, Thibaut Britz wrote:
> If it helps you to sleep better,
>
> we use cassandra (0.7.2 with the flush fix) in production on > 100
> servers.
>
> Thibaut
>
Thanks Thibaut, believe it or not, it does. :)
Is your use case a typical web app or something like a scientific/data
mini
Thanks Jonathan, Aaron, Daniel! I have a related question.
I would like to get a copy of the data from this 12-server cluster with
manually assigned balanced server tokens, and set it up on a new cluster. I
would like to minimize the number of servers on the new cluster without
having to build
If it helps you to sleep better,
we use cassandra (0.7.2 with the flush fix) in production on > 100 servers.
Thibaut
On Thu, Mar 17, 2011 at 5:58 PM, Paul Pak wrote:
> I'm at a crossroads right now. We built an application around .7 and
> the features in .7, so going back to .6 wasn't an opt
It is still there, but we took it out of the sample config because
people think it affects normal writes which it does not.
On Thu, Mar 17, 2011 at 11:48 AM, A J wrote:
> I don't see binary_memtable_throughput_in_mb parameter in
> cassandra.yaml anymore.
> What is it replaced by ?
>
> thanks.
>
>
I'm at a crossroads right now. We built an application around .7 and
the features in .7, so going back to .6 wasn't an option for us. Now,
we are in the middle of setting up dual mysql and cassandra support so
that we can "fallback" to mysql if Cassandra can't handle the workload
properly. It's
Yes, thanks, I was able to see that.
Now I am getting the following error:
OutboundTcpConnection.java (line 159) attempting to connect to astrix.com
where astrix.com is the machine on which I have installed Cassandra.
Any suggestions?
Thanks
Anurag
On Thu, Mar 17, 2011 at 9:49 AM, Jonathan Ellis
Internal error means "there is a stacktrace in the server system.log"
and in this case probably also means "you sent some kind of invalid
request that our validation didn't catch."
On Thu, Mar 17, 2011 at 11:29 AM, Anurag Gujral wrote:
> Thanks for the reply. I added mutation.__isset.column_or_su
I don't see binary_memtable_throughput_in_mb parameter in
cassandra.yaml anymore.
What is it replaced by ?
thanks.
On Tue, Mar 15, 2011 at 11:32 PM, Eric Evans wrote:
> On Tue, 2011-03-15 at 22:19 -0500, Eric Evans wrote:
>> On Tue, 2011-03-15 at 14:26 -0700, Mark wrote:
>> > Still not seeing 0.
2011/3/17 Narendra Sharma
> What heap size are you running with? and Which version of Cassandra?
>
4G, with Cassandra 0.7.4.
What heap size are you running with? and Which version of Cassandra?
Thanks,
Naren
On Thu, Mar 17, 2011 at 3:45 AM, ruslan usifov wrote:
> Hello
>
> Some times i have very long GC pauses:
>
>
> Total time for which application threads were stopped: 0.0303150 seconds
> 2011-03-17T13:19:56.476+030
Thanks for the reply. I added mutation.__isset.column_or_supercolumn=true;
Now I am getting TApplicationException: Internal error processing
batch_mutate
Any suggestions?
Thanks
Anurag
On Thu, Mar 17, 2011 at 8:13 AM, Anurag Gujral wrote:
> Hi All,
> I am using function batch_mutate o
You need to set the __isset on the Mutation object as well.
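In the Java-generated Thrift classes the setters flip those isset flags for
you; a minimal sketch of a valid mutation (the column family, row key and
values below are made up), with the C++ code needing the equivalent __isset
bookkeeping done by hand:

import java.nio.ByteBuffer;
import java.util.*;
import org.apache.cassandra.thrift.*;
import org.apache.cassandra.utils.ByteBufferUtil;

Column col = new Column();
col.setName(ByteBufferUtil.bytes("colname"));
col.setValue(ByteBufferUtil.bytes("value"));
col.setTimestamp(System.currentTimeMillis() * 1000);

ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);

// A Mutation must carry exactly one of column_or_supercolumn or deletion.
Mutation m = new Mutation();
m.setColumn_or_supercolumn(cosc);

Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
        new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
mutationMap.put(ByteBufferUtil.bytes("rowkey"),
        Collections.singletonMap("MyColumnFamily", Collections.singletonList(m)));

client.batch_mutate(mutationMap, ConsistencyLevel.QUORUM);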
On Thu, Mar 17, 2011 at 10:13 AM, Anurag Gujral wrote:
> Hi All,
> I am using function batch_mutate of cassandra 0.7 and I am
> getting the error InvalidRequestException: Mutation must have one
> ColumnOrSuperColumn or one Dele
Hi All,
I am using function batch_mutate of cassandra 0.7 and I am getting
the error InvalidRequestException: Mutation must have one
ColumnOrSuperColumn or one Deletion. I have my own C++ cassandra client
using thrift 0.0.5 api.
Any suggestions?
Sample code:
map<string, map<string, vector<Mutation> > > cfmap;
vector
Hi,
I have a single-node Cassandra setup on a Windows machine.
I soon ran out of space on this machine, so I have increased the
hard disk capacity of the machine.
Now I want to know how to configure Cassandra to start storing data on these
larger partitions.
Also, how the existing data
Thanks Jeremy, it's a good pointer to start with.
regards
Sagar
From: Jeremy Hanna [jeremy.hanna1...@gmail.com]
Sent: Thursday, March 17, 2011 7:34 PM
To: user@cassandra.apache.org
Subject: Re: hadoop cassandra
You can start with a word count example that's on
On Thu, Mar 17, 2011 at 9:09 AM, Jonathan Colby
wrote:
> Hi -
>
> If a seed crashes (i.e., suddenly unavailable due to HW problem), what is
> the best way to replace the seed in the cluster?
>
> I've read that you should not bootstrap a seed. Therefore I came up with
> this procedure, but it
You can start with a word count example that's only for hdfs. Then you can
replace the reducer in that with the ReducerToCassandra that's in the cassandra
word_count example. You need to match up your Mapper's output to the Reducer's
input and set a couple of configuration variables to tell it
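For what it's worth, a rough sketch of that wiring against 0.7's Hadoop
support (the mapper, keyspace and column family names are placeholders, and
the ConfigHelper calls are as I remember them from the word_count driver):

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.mapreduce.Job;

Job job = new Job(conf, "wordcount-into-cassandra");   // conf is an existing Hadoop Configuration
job.setMapperClass(MyHdfsMapper.class);                // your existing HDFS word-count mapper
job.setReducerClass(ReducerToCassandra.class);         // borrowed from the word_count example

// ColumnFamilyOutputFormat expects a row key plus a list of mutations per reduce call.
job.setOutputKeyClass(ByteBuffer.class);
job.setOutputValueClass(List.class);
job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

// The "couple of configuration variables": where and how to reach the cluster.
ConfigHelper.setOutputColumnFamily(job.getConfiguration(), "MyKeyspace", "WordCount");
ConfigHelper.setRpcPort(job.getConfiguration(), "9160");
ConfigHelper.setInitialAddress(job.getConfiguration(), "localhost");
ConfigHelper.setPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");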
Dear Aaron,
We are a little confused about the OPP token. How do we calculate an OPP
token? A few of our column families have a UUID as the key and others have an
integer as the key.
On 03/17/2011 04:22 PM, Ali Ahsan wrote:
Below is the output of nodetool ring:
Address        Status   Load       Range
Are you sure you don't have a problem with handling ByteBuffers?
What do you mean by 'deserialized string'?
--
Sylvain
On Thu, Mar 17, 2011 at 4:20 AM, Michael Fortin wrote:
> Hi,
>
> I've been working on a scala based api for cassandra. I've built it directly
> on top of thrift. I'm having
We're aware of the potential for races during schema change but it
looks like we missed this one. Can you create a ticket?
On Wed, Mar 16, 2011 at 11:55 PM, Jeffrey Wang wrote:
> Hey all,
>
>
>
> I’m running 0.7.0 on a cluster of 5 machines. When I create a new column
> family after I run nodeto
Remove the cache file or upgrade to 0.7.4
On Thu, Mar 17, 2011 at 1:15 AM, Anurag Gujral wrote:
> I am getting exception when starting cassandra 0.7.3
>
> ERROR 01:10:48,321 Exception encountered during startup.
> java.lang.NegativeArraySizeException
> at
> org.apache.cassandra.db.ColumnFamil
I see super-column-0 in there. Not sure what the question is.
On Wed, Mar 16, 2011 at 10:20 PM, Michael Fortin wrote:
> Hi,
>
> I've been working on a scala based api for cassandra. I've built it directly
> on top of thrift. I'm having a problem getting a slice of a superColumn.
> When I ge
Hi -
If a seed crashes (i.e., suddenly unavailable due to HW problem), what is the
best way to replace the seed in the cluster?
I've read that you should not bootstrap a seed. Therefore I came up with this
procedure, but it seems pretty complicated. any better ideas?
1. update the seed l
Below is the output of nodetool ring:
Address        Status   Load       Range              Ring
                                   TuL8jLqs7uxLipP6
192.168.100.3  Up       89.91 GB   JDtVOU0YVQ6MtBYA   |<--|
192.168.100.4  Up       48
Hello
Sometimes I have very long GC pauses:
Total time for which application threads were stopped: 0.0303150 seconds
2011-03-17T13:19:56.476+0300: 33295.671: [GC 33295.671: [ParNew:
678855K->20708K(737280K), 0.0271230 secs] 1457643K->806795K(4112384K),
0.0273050 secs] [Times: user=0.33 sys=0.0
With the Order Preserving Partitioner you are responsible for balancing the
rows around the cluster,
http://wiki.apache.org/cassandra/Operations?highlight=%28partitioner%29#Token_selection
Was there a reason for using the ordered partitioner rather than the random
one?
What does the output
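(If the cluster did move to the RandomPartitioner, balanced tokens are easy to
precompute; a throwaway Java sketch, with the node count filled in for the two
nodes in this cluster:)

import java.math.BigInteger;

// Evenly spaced RandomPartitioner tokens: token(i) = i * 2^127 / nodeCount
int nodeCount = 2;
BigInteger range = BigInteger.valueOf(2).pow(127);
for (int i = 0; i < nodeCount; i++)
    System.out.println(range.multiply(BigInteger.valueOf(i))
                            .divide(BigInteger.valueOf(nodeCount)));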
Hi All
We are running Cassandra 0.6.3. We have two nodes with replication
factor one and ordered partitioning. The problem we are facing at the moment
is that all data is being sent to one Cassandra node, which is filling up quite
rapidly, and we are short of disk space. Unfortunately we have hardware
constr
Hi all,
Is there any example of Hadoop and Cassandra integration where the input is
from HDFS and the output goes to Cassandra?
NOTE: I have gone through the word count example provided with the source
code, but it does not cover the above case.
regards
Sagar
Are you exploring a