For plain old log analysis the Cloudera Hadoop distribution may be a better
match. Flume is designed to help with streaming data into HDFS, the LZO
compression extensions would help with the data size, and Pig would make the
analysis easier (IMHO).
http://www.cloudera.com/hadoop/
>
> The goal is actually getting the rows in the range of "start","end". The order
> is not important at all. But what I can see is, this does not seem to be possible
> at all using RP. Am I wrong?
A simpler solution is to just compare the MD5s of both keys, setting start to
the one with the lesser MD5 and end to the key with the greater MD5.
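A rough sketch of that comparison in Java (the key literals are made up for
illustration; note the caveat in a later reply that this gives you token
order, not key order):

import java.math.BigInteger;
import java.security.MessageDigest;

public class TokenOrder {
    // RandomPartitioner places keys by the MD5 of the key, so a range
    // scan needs start/end ordered by token rather than by raw key.
    static BigInteger md5Token(String key) throws Exception {
        byte[] d = MessageDigest.getInstance("MD5").digest(key.getBytes("UTF-8"));
        return new BigInteger(1, d); // non-negative, like RandomPartitioner tokens
    }

    public static void main(String[] args) throws Exception {
        String start = "apple", end = "zebra"; // hypothetical keys
        if (md5Token(start).compareTo(md5Token(end)) > 0) {
            String tmp = start; start = end; end = tmp; // start gets the lesser token
        }
        System.out.println("range scan from " + start + " to " + end);
    }
}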
Hello, Aaron,
Thank you for all the info (especially the pointers, which seem interesting).
> So you would not have 1,000 tasks sent to each of the 1,000 cassandra
nodes.
Yes, I meant one map task would be sent to each task tracker, resulting in
1,000 concurrent map tasks in the cluster. ColumnFamilyInputFormat cannot
identify the nodes that actually hold some data, so the job tracker will
send the
I may be wrong about which nodes the task is sent to.
Others here know more about hadoop integration.
Aaron
On 22 Oct 2010, at 21:30, Takayuki Tsunakawa
wrote:
> Hello, Aaron,
>
> Thank you for all the info (especially the pointers, which seem interesting).
>
> So you would not have 1,000 tasks sent to each of the 1,000 cassandra nodes.
I've been playing with Cassandra 0.7 (built from trunk) and it looks better
than the 0.6 branch, but when I try to add a new node with auto_bootstrap: true
I get an NPE (192.168.0.37 is the initial node with data on it, 192.168.0.220
is the bootstrapped node):
DEBUG 14:00:58,931 Checking to see if compaction of Schema would be useful
Ever since I started implementing my second level caches I've been wondering
on how to deal with this, and thus far I've not found a good solution.
I have a CF acting as a secondary index, and I want to make range queries
against it. Since my keys are Long I simply went ahead and wrote them as
the
Prepend zeros to every number out to a fixed length determined by the
maximum possible value. As an example, 0055 < 0100 in a lexical ordering
where the maximum value is 9999.
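As a quick illustration in Java (the width of 4 here assumes a maximum
possible value of 9999):

public class PadKeys {
    public static void main(String[] args) {
        String a = String.format("%04d", 55);   // "0055"
        String b = String.format("%04d", 100);  // "0100"
        // Lexical order now matches numeric order:
        System.out.println(a.compareTo(b) < 0); // true
    }
}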
On Fri, Oct 22, 2010 at 5:05 AM, Christian Decker <
decker.christ...@gmail.com> wrote:
> Ever since I started implementing my second level caches I've been wondering
That gets you keys whose MD5s are between the MD5s of start and end,
which is not the same as the keys between start and end.
On Fri, Oct 22, 2010 at 2:07 AM, Oleg Anastasyev wrote:
>>
>> The goal is actually getting the rows in the range of "start","end". The order
>> is not important at all. But what I can see is, this does not seem to be possible
>> at all using RP. Am I wrong?
On Fri, Oct 22, 2010 at 3:30 AM, Takayuki Tsunakawa
wrote:
> Yes, I meant one map task would be sent to each task tracker, resulting in
> 1,000 concurrent map tasks in the cluster. ColumnFamilyInputFormat cannot
> identify the nodes that actually hold some data, so the job tracker will
> send the
> Specifically I'm wondering if I could create a byte representation of the Long
> that would also be lexicographically ordered.
This is probably what you want to do, combined with the ByteOrderedPartitioner
in 0.7.
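One way to build such a byte representation (a sketch, not something from
this thread): write the long big-endian with the sign bit flipped, so the
unsigned byte-wise comparison ByteOrderedPartitioner performs agrees with
numeric order. Drop the XOR if your keys are never negative.

import java.nio.ByteBuffer;

public class OrderedLongKeys {
    static byte[] toOrderedBytes(long value) {
        // Flipping the sign bit maps Long.MIN_VALUE..Long.MAX_VALUE onto a
        // range that sorts correctly under unsigned byte comparison.
        return ByteBuffer.allocate(8).putLong(value ^ Long.MIN_VALUE).array();
    }

    public static void main(String[] args) {
        byte[] neg = toOrderedBytes(-1L), pos = toOrderedBytes(1L);
        int cmp = 0;
        for (int i = 0; i < 8 && cmp == 0; i++)
            cmp = (neg[i] & 0xff) - (pos[i] & 0xff); // unsigned compare
        System.out.println(cmp < 0); // true: -1 sorts before 1
    }
}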
-----Original Message-----
From: "Eric Czech"
Sent: Friday, October 22, 2010 7:05
I'm coming to the portion of the Cassandra installation where the customer is
looking for benchmarking and testing for purposes of "keeping an eye" on the
system to see if we need to add capacity or just to see how the system in
general is doing. Basically, warm fuzzies that the system is still
This was a regression from the Thrift 0.5 upgrade. Should be fixed in r1026415
On Fri, Oct 22, 2010 at 5:11 AM, ruslan usifov wrote:
> I've been playing with Cassandra 0.7 (built from trunk) and it looks better
> than the 0.6 branch, but when I try to add a new node with auto_bootstrap: true
> I get an NPE
Thanks very much, that did the trick :)
On Thu, Oct 21, 2010 at 9:28 PM, Aaron Morton wrote:
> Look for lib/thrift-rX.jar in the source. X is the svn revision to
> use.
>
> http://wiki.apache.org/cassandra/InstallThrift
>
> Not sure if all those steps still apply, but it's what I did last
Not with the nodeprobe or nodetool command because the JVM these two
commands spawn has a very short life span.
I am using a webapp to monitor my cassandra cluster. It pretty much uses
the same code as the NodeCmd class. For each incoming request, it creates a
NodeProbe object and uses it to get
Hello
Does anybody have a recipe for how to efficiently store a bond graph in
Cassandra? For example, relations between users in social
networks (friendship).
The simplest thing that comes to mind is the following keyspace
But this has a minus: if one user has many, many friends, then all relations
for this o
Unless one user has several hundred million friends, this shouldn't be a
problem.
- Tyler
On Fri, Oct 22, 2010 at 3:00 PM, ruslan usifov wrote:
> Hello
>
> Does anybody have a recipe for how to efficiently store a bond graph in
> Cassandra? For example, relations between users in social
> networks (friendship).
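For what it's worth, the layout usually suggested for this is one row per
user with one column per friend. A conceptual model in plain Java (an
in-memory stand-in, not the Cassandra API; names are made up):

import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class FriendGraphModel {
    // Outer key = row key (user id); inner sorted map = that row's columns
    // (column name = friend id, value = timestamp). A Cassandra wide row
    // holds millions of columns, hence the reply above about fan-out only
    // mattering at hundreds of millions of friends.
    static final Map<String, TreeMap<String, Long>> rows =
            new ConcurrentHashMap<String, TreeMap<String, Long>>();

    // Store the edge in both directions so either user's friend list is a
    // single-row read. (Not thread-safe per row; purely illustrative.)
    static void addFriendship(String a, String b, long timestamp) {
        rows.computeIfAbsent(a, k -> new TreeMap<String, Long>()).put(b, timestamp);
        rows.computeIfAbsent(b, k -> new TreeMap<String, Long>()).put(a, timestamp);
    }
}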
Is the fix as simple as calling close() then? Can you submit a patch for that?
On Fri, Oct 22, 2010 at 2:49 PM, Bill Au wrote:
> Not with the nodeprobe or nodetool command because the JVM these two
> commands spawn has a very short life span.
>
> I am using a webapp to monitor my cassandra cluster.
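A sketch of the per-request pattern being discussed, assuming NodeProbe
exposes the close() proposed above (method names may differ by version):

import org.apache.cassandra.tools.NodeProbe;

public class ProbePerRequest {
    static String clusterName(String host, int port) throws Exception {
        NodeProbe probe = new NodeProbe(host, port); // opens a JMX connection
        try {
            return probe.getClusterName(); // any read over JMX
        } finally {
            probe.close(); // the fix discussed in this thread: release the connection
        }
    }
}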
Riptano is bringing some Cassandra love to the East coast the first
week of November.
First, on the evening of Nov 3, we're sponsoring a meetup in Atlanta.
This is held at the ApacheCon venue but you do _not_ have to be going
to ApacheCon to come; it is free to attend! I will be there and
several
I am currently running a 4 node cluster on Cassandra beta 2. Yesterday, I
ran into a number of problems and one of my nodes went down for a few
hours. I tried to run a nodetool repair and at least at a data level,
everything seems to be consistent and alright. The problem is that the node
is st
Hi,
I'm testing Cassandra to ensure it fits my needs. One of the tests I
want to perform is writing while a node is down. Here's the scenario:
Cassandra 0.6.6
2 nodes
replication factor of 2
hinted handoff on
I load node A with 50,000 rows while B is shut down (BTW, I'm using
CL.ONE during the
On 10/22/10 2:55 PM, Craig Ching wrote:
Even better, I'd love a way to not allow B to be available
until replication is complete; can I detect that somehow?
Proposed and rejected a while back :
https://issues.apache.org/jira/browse/CASSANDRA-768
=Rob
The last time this came up on the list Jonathan Ellis said (something
along the lines of) if your application can't tolerate stale data then
you should read with a consistency level of QUORUM.
It would be nice if there was some sort of middle ground for an
application that can tolerate slightly stale data.
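The arithmetic behind the QUORUM advice, as a small worked example (N=3
chosen for illustration):

public class QuorumMath {
    public static void main(String[] args) {
        int n = 3;         // replication factor
        int w = n / 2 + 1; // QUORUM write: acked by 2 of 3 replicas
        int r = n / 2 + 1; // QUORUM read: answered by 2 of 3 replicas
        // A read overlaps the latest write in at least one replica whenever
        // r + w > n, so QUORUM reads paired with QUORUM writes see the
        // latest value.
        System.out.println("overlap guaranteed: " + (r + w > n)); // true
    }
}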
When using nodetool move command, the streaming between nodes got stuck for a
long period like the following:
Streaming from: /10.100.10.66
Profile:
/opt/choicestream/data/cassandra/data/Profile/U_Profiles-tmp-1137-Index.db
0/809960194
Profile:
/opt/choicestream/data/cassandra/data/Profi
This is a known bug in early 0.6, fixed in 0.6.5 iirc. But at this
point you should upgrade to 0.6.6.
On Fri, Oct 22, 2010 at 8:52 PM, Henry Luo wrote:
> When using nodetool move command, the streaming between nodes got stuck for
> a long period like the following:
>
>
>
> Streaming from: /10.100.10.66