Hi,
I have a Cassandra keyspace, but reading the data (especially UUID columns)
from it via Spark SQL using Python does not return the correct value.
Cassandra:
--
My table 'SAM' is described below:
CREATE TABLE ks.sam (id uuid, dept text, workflow text, type double,
PRIMARY KEY (id, dept));
Try converting that int from decimal to hex and inserting dashes in the
appropriate spots - or go the other way.
Also, you are looking at different rows, based upon your selection
criteria...
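A minimal sketch of that conversion in Python's standard library; the integer below is made up for illustration, standing in for whatever value Spark SQL returned for the uuid column:

```python
import uuid

# Hypothetical 128-bit integer, standing in for the value Spark SQL
# returned for the uuid column.
raw_int = 68166291691353922647435311959534385271

# Decimal int -> canonical hex form with dashes in the appropriate spots.
as_uuid = uuid.UUID(int=raw_int)
print(as_uuid)

# ...or go the other way: canonical string -> plain integer.
assert uuid.UUID(str(as_uuid)).int == raw_int
```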
ml
On Tue, May 24, 2016 at 6:23 AM, Rajesh Radhakrishnan <
rajesh.radhakrish...@phe.gov.uk> wrote:
> Hi
Hi Michael,
Thank you for the quick reply.
So you are suggesting to convert this int value (the UUID comes back as an
int via Spark SQL) to hex?
And the selection is just an example to highlight the UUID conversion issue.
So in Cassandra it should be
SELECT id, workflow FROM sam WHERE dept='blah';
And in S
Yes - a UUID is just a 128-bit value. You can view it using any base or
format.
If you are looking at the same row, you should see the same 128-bit value;
otherwise my theory is incorrect :)
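To illustrate with a made-up UUID value, the same 128 bits can be rendered in several formats:

```python
import uuid

u = uuid.UUID('12345678-1234-5678-1234-567812345678')  # made-up example value

print(u)        # canonical dashed form
print(u.hex)    # 32 hex digits, no dashes
print(u.int)    # the same 128 bits as a decimal integer
print(u.bytes)  # as 16 raw bytes

# All of these are views of one 128-bit value, so comparing rows by
# any single representation is equivalent.
```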
Cheers,
ml
On Tue, May 24, 2016 at 6:57 AM, Rajesh Radhakrishnan <
rajesh.radhakrish...@phe.gov.uk> wrote
Hi experts,
We are evaluating Cassandra as messaging infrastructure for a project.
In our workflow Cassandra database will be synchronized across two nodes, a
component will INSERT/UPDATE records on one node and another component (who
has registered for the specific table) on second node will get
Sorry, I should have been more clear. What I meant was doing exactly what you
wrote, but with a "removenode" instead of a "decommission" to make it even
faster. Will that have any side effects (I think it shouldn't)?
From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Monday, May 23, 2016 4:43 PM
T
The fundamental difference between a removenode and a decommission is which
node(s) stream data.
In a decommission, the leaving node streams.
In a removenode, the other owners of the data stream.
If you set the replication factor for that DC to 0, there's nothing to stream,
so it's irrelevant - do whichever you l
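For concreteness, zeroing out a datacenter's replication before removing its nodes could look roughly like this; the keyspace and datacenter names are hypothetical placeholders:

```cql
-- Hypothetical names: adjust 'my_ks', 'dc_keep', and 'dc_leaving'
-- to your actual keyspace and datacenter names.
ALTER KEYSPACE my_ks WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc_keep': 3,
  'dc_leaving': 0
};
```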
It sounds like you're trying to build a queue in Cassandra, which is one of
the classic anti-pattern use cases for Cassandra.
You may be able to do something clever with triggers, but I highly
recommend you look at purpose-built queuing software such as Kafka to solve
this instead.
On Tue, May 24
A Stop-The-World GC might block the connection until it times
out. This is the log that I think is relevant.
INFO 20160524-060930.028882 :: Initializing
sandbox_20160524_t06_09_18.table1
INFO 20160524-060933.908008 :: G1 Young Generation GC in 551ms. G1 Eden
Space: 98112 -> 0; G1
I'm not familiar with Titan's usage patterns for Cassandra, but I wonder if
this is because of the consistency level it's querying Cassandra at - i.e.
if CL isn't LOCAL_[something], then this might just be lots of little
checksums required to satisfy consistency requirements.
On Mon, May 23, 2016
I saw a thread from April 2016 talking about Cassandra and Kubernetes, and
have a few follow up questions. It seems that especially after v1.2 of
Kubernetes, and the upcoming 1.3 features, this would be a very viable
option of running Cassandra on.
My questions pertain to HostIds and Scaling Up/D
Here's my setup:
Datacenter: gce-us-central1
===========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load    Tokens  Owns (effective)  Host ID                       Rack
UN  10.128.0.3  6.4 GB  256     100.0%            3317a3de-9113-48e2-9a85-bbf7
an connect
>> to cassandra). And from cassandra log, we can see it takes roughly 3
>> seconds to do gc when there is an incoming connection. And the gc is the
>> only difference between the timeout connection and the successful
>> connection. So we suspect this Stop-The
+1 to what Eric said, a queue is a classic C* anti-pattern. Something like
Kafka or RabbitMQ might fit your use case better.
Mark
On 24 May 2016 at 18:03, Eric Stevens wrote:
> It sounds like you're trying to build a queue in Cassandra, which is one
> of the classic anti-pattern use cases for
Hi Luke,
You mentioned that the replication factor was increased from 1 to 2. In that
case, was the node bearing IP 10.128.0.20 carrying around 3GB of data earlier?
You can run nodetool repair with the -local option to initiate repair for the
local datacenter, gce-us-central1.
Also you may suspect that if a lot o
Looking forward to hearing from the community about this.
Sent from my iPhone
> On May 24, 2016, at 10:19 AM, Mike Wojcikiewicz wrote:
>
> I saw a thread from April 2016 talking about Cassandra and Kubernetes, and
> have a few follow up questions. It seems that especially after v1.2 of
> Kub
So I guess the problem may have been with the initial addition of the
10.128.0.20 node, because when I added it, it never synced data, I guess?
It was at around 50 MB when it first came up and transitioned to "UN".
After it was in, I did the 1->2 replication change and tried repair, but it
didn't fix
For the other DC, it can be acceptable because each partition resides on one
node, so if you have a large partition, it may skew things a bit.
On May 25, 2016 2:41 AM, "Luke Jolly" wrote:
> So I guess the problem may have been with the initial addition of the
> 10.128.0.20 node because when I adde
Not necessarily; considering RF is 2, both nodes should have all
partitions. Luke, are you sure the repair is succeeding? You don't have
other keyspaces/duplicate data/extra data in your Cassandra data directory?
Also, you could try querying on the node with less data to confirm if it
has the same
Hi Luke,
I've never found nodetool status' load to be useful beyond a general
indicator.
You should expect some small skew, as this will depend on your current
compaction status, tombstones, etc. IIRC repair will not provide
consistency of intermediate states nor will it remove tombstones, it onl
Hi Zhiyan,
Silly question, but are you sure your heap settings are actually being
applied? "697,236,904 (51.91%)" would represent a sub-2GB heap. What's the
real memory usage for Java when this crash happens?
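The arithmetic behind that reading can be checked directly from the quoted figures:

```python
# 697,236,904 bytes reported as 51.91% of the heap implies a total
# heap of roughly 1.25 GiB -- well under 2 GB.
used_bytes = 697_236_904
fraction = 0.5191

total_heap = used_bytes / fraction
print(total_heap / 1024**3)  # ~1.25 GiB
```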
Another thing to look into might be memtable_heap_space_in_mb, as it looks
like you're usi
I am getting this error repeatedly while I am trying to add a new DC
consisting of one node in AWS to my existing cluster. I have tried 5 times
already. Running Cassandra 2.1.13
I have also set:
streaming_socket_timeout_in_ms: 360
in all of my nodes
Does anybody have any idea how this can be
Hi Luke, I've encountered a similar problem before; could you please advise
on the following?
1) When you added 10.128.0.20, what were the seeds defined in cassandra.yaml?
2) When you added 10.128.0.20, were the data and cache directories on
10.128.0.20 empty?
- /var/lib/cassandra/data
- /var/lib/cas
Hi George, are you using NetworkTopologyStrategy as the replication
strategy for your keyspace? If yes, can you check the
cassandra-rackdc.properties of this new node?
https://issues.apache.org/jira/browse/CASSANDRA-8279
Regards,
Mike Yeap
On Wed, May 25, 2016 at 2:31 PM, George Sigletos wrote: