I use grep / awk / sed from within a bash script ... this works quite well.
-sd
On Mon, Mar 21, 2011 at 12:39 AM, Anurag Gujral wrote:
> Hi All,
> I want to modify the values in the cassandra.yaml which comes with
> the cassandra-0.7 package based on values of hostnames,
> colo etc.
> D
I'll take another crack at it, here's how I think it works.
When using the NetworkTopologyStrategy you can specify how the RF is
distributed between the DCs you have. This is done as part of the schema
definition. When using a CLI script use the strategy_options clause of the
create keyspace s
After some investigation I think that my problem is similar to this:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/reduced-cached-mem-resident-set-size-growth-td5967110.html
Now I disable mmap, and set disk_access_mode to mmap_index_only
Hi,
I'm inserting data from client node with stress.py to cluster of 6 nodes.
They are all on 1Gbps network, max real throughput of network is 930Mbps
(after measurement).
python stress.py -c 1 -S 17 -d{6nodes} -l3 -e QUORUM
--operation=insert -i 1 -n 50 -t100
The problem is stress.py
Jonathan Ellis writes:
>
> drop and truncate both snapshot first, which requires forking to run
> ln if you don't have JNA installed.
>
> best solution: install JNA so it can do in-process link calls.
>
Could you please tell me the exact actions in the client (or not in the client?) that should
b
Hi Nikolay,
JNA has to be installed on the service box(es). On Ubuntu you can do the
following:
wget http://debian.riptano.com/debian/pool/libjna-java_3.2.7-0~nmu.2_amd64.deb
sudo dpkg -i libjna-java_3.2.7-0~nmu.2_amd64.deb
ln -s /usr/share/java/jna.jar [path_to_cassandra]/lib
... and the
Hi,
Is anyone interested in joining an Apache Cassandra hangout/meetup near the
*Mumbai-Pune* area?
- Share/teach your experience with Apache Cassandra and the problems/issues
you faced during deployment.
- Excited by the buzz around it and want to learn more about NoSQL and
Cassandra.
Regards,
GeekTalks
Beautiful, thanks.
On Sun, Mar 20, 2011 at 4:36 PM, Jonathan Ellis wrote:
> 0.7.1+ uses zero-copy reads in mmap'd mode so having 80k references to
> the same column is essentially just the reference overhead.
>
> On Fri, Mar 18, 2011 at 7:11 PM, Dan Retzlaff wrote:
> > Dear experts, :)
> > Our
I mean Linux process heap fragmentation by malloc, so that at some critical
moment all memory is held by the java process in RSS, the OS kernel can't
allocate any system resources, and as a result it hangs. Is that possible?
Thanks for sharing this. What mechanisms secure the data (streams?)
in transit between nodes? This isn't clear to me.
On Mon, Mar 21, 2011 at 10:01 AM, Milind Parikh wrote:
> Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly this is
> work in progress but wanted to share
No. We do zero allocation by malloc (so far). It's all managed by GC in heap.
On Mon, Mar 21, 2011 at 10:25 AM, ruslan usifov wrote:
> I mean Linux process heap fragmentation by malloc, so that at some critical
> moment all memory is held by the java process in RSS, the OS kernel can't allocate
> any system
On Mon, Mar 21, 2011 at 4:02 AM, pob wrote:
> Hi,
> I'm inserting data from client node with stress.py to cluster of 6 nodes.
> They are all on 1Gbps network, max real throughput of network is 930Mbps
> (after measurement).
> python stress.py -c 1 -S 17 -d{6nodes} -l3 -e QUORUM
> --operatio
I think what I am trying to ask is this:
What happens if RF=3 with network topology (RackInferringSnitch), and 2
copies are stored in the Site A data center and 1 copy in the Site B data
center? Now a client is, for some reason, directed to the Site B data center
and does a write/update on an existing column; now would Site
You mean more threads in stress.py? The purpose was to figure out the
highest bandwidth that C* can use.
Peter
2011/3/21 Ryan King
> On Mon, Mar 21, 2011 at 4:02 AM, pob wrote:
> > Hi,
> > I'm inserting data from client node with stress.py to cluster of 6 nodes.
> > They are all on 1Gbps
On Mon, Mar 21, 2011 at 9:34 AM, pob wrote:
> You mean more threads in stress.py? The purpose was to figure out the
> highest bandwidth that C* can use.
You should try more threads, but at some point you'll hit diminishing
returns there. You may need to drive load from more than one host.
Thanks for sharing the document, Milind !
Followed the instructions and it worked for me.
On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh wrote:
> Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly this is
> work in progress but wanted to share what I have. PDF is the working
Not completely related, just FYI.
I prefer to see the start time, end time, and duration of each
execution in each thread, and then do the aggregation (avg, max, min)
myself.
I modified last few lines of the Inserter function as follows:
endtime = time.time()
self.latencies[self.idx
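The approach described above (record start, end, and duration per execution, then aggregate afterwards) could look roughly like the following. This is only a sketch under stated assumptions: `timed_call` and `summarize` are hypothetical helper names, not part of stress.py, and the real Inserter code is not reproduced here.

```python
import time


def timed_call(fn):
    """Run fn() once and return (start, end, duration) in seconds."""
    start = time.time()
    fn()
    end = time.time()
    return start, end, end - start


def summarize(durations):
    """Aggregate per-execution durations into avg/max/min."""
    if not durations:
        raise ValueError("no durations recorded")
    return {
        "avg": sum(durations) / len(durations),
        "max": max(durations),
        "min": min(durations),
    }
```

Keeping the raw (start, end, duration) tuples per thread means the aggregation can be redone later with different groupings, which seems to be the point of the modification.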
Hi Milind,
Great work here. Can you provide the patch against the 2 files?
Perhaps there's some way to incorporate it into the trunk of cassandra so
that this is feasible (in a future release) without patching the source
code.
Dave Viner
On Mon, Mar 21, 2011 at 9:41 AM, A J wrote:
> Thanks
I talked to Matt Dennis in the channel about it and I think everyone would like
to make sure that cassandra works great across multiple regions. He sounded
like he didn't know why it wouldn't work after having looked at the patches. I
would like to try it both ways - with and without the patch
Hi guys,
we are currently benchmarking various configurations of an EC2-based
Cassandra cluster. This is our current setup:
1) 8 nodes where each node is an m1.xlarge EC2 instance
2) Cassandra version 0.6.5
3) Replication Factor = 3
4) this delivers ~7K to 10K ops/sec with 50% GET and 50% INSERT
I suggest upgrading to either 0.6.12 or 0.7.4 and re-testing.
On Mon, Mar 21, 2011 at 12:52 PM, Markus Klems wrote:
> Hi guys,
>
> we are currently benchmarking various configurations of an EC2-based
> Cassandra cluster. This is our current setup:
>
> 1) 8 nodes where each node is an m1.xlarge EC
Is there any benchmark that I can run after installing Cassandra on
Azure to check for performance/scalability issues?
[]'s
FernandoVM
On Sun, Mar 13, 2011 at 10:16 PM, aaron morton wrote:
> If it works like all the other virtual machine hosts then yes it can be
> hosted.
> Performance can always b
I'm running a 3-way CA cluster (0.62) on 64-bit Windows 2008 (JRE 1.6.24).
Things are running fine except when trying to remove old snapshot files.
When running clearsnapshot I get an error message like the one below. I can't
remove any daily snapshot files. When trying to delete the actual snapshot file os
> With the large new-gen, you were actually seeing fallbacks to full GC?
> You weren't just still experiencing problems because at 10 gig, the new-gen
> will be so slow to compact to effectively be similar to a full gc in terms of
> affecting latency?
Yes, we were seeing fallbacks to full GC with
This is a two part question ...
1. If you have cassandra nodes with different sized hard disks, how do you
deal with assigning the token ring such that the nodes with larger disks get
more data? In other words, given equally distributed token ranges, when the
smaller disk nodes run out of s
We use Puppet to manage the cassandra.yaml in a different location from the
installation. Ours is in /etc/cassandra/cassandra.yaml
You can set the CASSANDRA_CONF environment variable (I believe that is the
name; check cassandra.in.sh) and the startup script will pick this up as the configuration
file to u
to elaborate:
our_temp_yaml=/tmp/$$.cassandra.yaml
cp cassandra.yaml "$our_temp_yaml"
for instance in $instances; do
    # do some more work to get the hostname from the instance
    sed -i "s/^seeds:/seeds: \n - $hostname/" "$our_temp_yaml"
done
-- the above inserts a new line for each $hostname into the t
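For comparison, the same seed-list rewrite can be sketched in Python. This is a rough illustration, not a tested deployment tool: `set_seeds` is a hypothetical helper, and it assumes the 0.7-style layout where a top-level `seeds:` line is followed by indented `- host` entries. Unlike the sed loop, it replaces the existing entries instead of adding to them.

```python
def set_seeds(yaml_text, hostnames, indent="    "):
    """Replace the entries under a top-level 'seeds:' key with hostnames.

    Assumes the seed list is written as 'seeds:' followed by '- host' lines.
    """
    out = []
    skipping = False
    for line in yaml_text.splitlines():
        if skipping and line.lstrip().startswith("- "):
            continue  # drop an old seed entry
        skipping = False
        out.append(line)
        if line.rstrip() == "seeds:":
            # write the new seed entries right after the key
            out.extend(f"{indent}- {host}" for host in hostnames)
            skipping = True
    return "\n".join(out) + "\n"
```

Working on the text in memory and writing it to a separate file keeps the packaged cassandra.yaml untouched, which fits the temp-file approach in the shell version.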
No, replicas will always be directed to the same nodes. Otherwise we would not
know where to find them.
The OldNetworkTopologyStrategy alternated replicas between DCs, but it would
still always put them on the same nodes.
Aaron
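To illustrate why placement has to be deterministic, here is a deliberately simplified sketch of ring-based replica selection. It is rack- and DC-unaware (so it is not NetworkTopologyStrategy), and `replicas_for` plus the `(token, node)` ring layout are assumptions made for the example, not Cassandra's actual code.

```python
import bisect


def replicas_for(key_token, ring, rf):
    """Return the rf nodes responsible for key_token.

    ring is a list of (token, node) tuples. The first replica is the node
    with the smallest token >= key_token (wrapping around the ring); the
    remaining replicas are the next nodes clockwise.
    """
    if not ring:
        raise ValueError("ring is empty")
    ordered = sorted(ring)
    tokens = [token for token, _ in ordered]
    start = bisect.bisect_left(tokens, key_token) % len(ordered)
    count = min(rf, len(ordered))
    return [ordered[(start + i) % len(ordered)][1] for i in range(count)]
```

Because the result depends only on the key's token and the ring, any coordinator in any data center computes the same replica set for the same key.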
On 22 Mar 2011, at 05:31, mcasandra wrote:
> I think what I a
contrib/py_stress is the easiest way to shake out any issues with your install
and get a benchmark.
There is also https://github.com/brianfrankcooper/YCSB but I would go with
py_stress until it stops being useful.
Note: These are abstract benchmarks to be used for entertainment purposes only,
I am trying to estimate the time it will take to rebuild a node. After
loading reasonable data, I brought down a node and manually removed
all its datafiles for a given keyspace (Keyspace1)
I then restarted the node and got it back in the ring. At this point, I
wish to run nodetool repair (bin/nodet
Sorry if I was presumptuous earlier. I created a ticket so that the patch
could be submitted and reviewed - that is if it can be generalized so that it
works across regions and doesn't adversely affect the common case.
https://issues.apache.org/jira/browse/CASSANDRA-2362
On Mar 21, 2011, at 10:
There have been some issues with deleting files on Windows, but I cannot find a
reference to it happening for snapshots.
If you restart the node can you delete the snapshot?
Longer term can you upgrade to 0.6.12 and let us know if it happens again? Any
fix will be against that version.
Hope th
1) You should use nodes with the same capacity (CPU, RAM, HDD), cassandra
assumes they are all equal.
2) Not sure what exactly would happen. Am guessing either the node would
shutdown or writes would eventually block, probably the former. If the node was
up read performance may suffer (if ther
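Given the assumption of equal nodes, tokens are normally spaced evenly around the ring. A minimal sketch, assuming RandomPartitioner (token space 0 to 2**127); `balanced_tokens` is a hypothetical helper name:

```python
def balanced_tokens(node_count, ring_size=2 ** 127):
    """Return evenly spaced tokens, one per node, for a ring of ring_size."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the large token values exact.
    return [i * ring_size // node_count for i in range(node_count)]
```

Each value would be used as one node's initial_token. Skewing the spacing to favour larger disks is possible in principle, but it also skews request load, which is part of why equal hardware is recommended.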
Are you monitoring the progress? See http://wiki.apache.org/cassandra/Streaming
or use nodetool netstats.
Aaron
On 22 Mar 2011, at 16:33, A J wrote:
> I am trying to estimate the time it will take to rebuild a node. After
> loading reasonable data, I brought down a node and manually removed
> all
Patch is attached... I don't have access to Jira.
A cautionary note: This is NOT a general solution and is not intended as
such. It could be included as a part of larger patch. I will explain in the
limitation sections about why it is not a general solution; as I find time.
Regards
Milind
On Mon
I set join_ring=false in my java opts:
-Djoin_ring=false
However, when the node started up, it joined the ring. Is there
something I am missing? Using 0.7.4
Thanks,
Jason
-Dcassandra.join_ring=false
-Chris
On Mar 21, 2011, at 10:32 PM, Jason Harvey wrote:
> I set join_ring=false in my java opts:
> -Djoin_ring=false
>
> However, when the node started up, it joined the ring. Is there
> something I am missing? Using 0.7.4
>
> Thanks,
> Jason
Gah! Thx :)
Jason
On Mar 21, 10:34 pm, Chris Goffinet wrote:
> -Dcassandra.join_ring=false
>
> -Chris
>
> On Mar 21, 2011, at 10:32 PM, Jason Harvey wrote:
>
> > I set join_ring=false in my java opts:
> > -Djoin_ring=false
>
> > However, when the node started up, it joined the ring. Is there
> >