OK, thanks for getting me going in the right direction. I imagine most people
would store password and tokenized authentication information in a single
table, using the username (e.g. email address) as the key?
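To make the single-table idea concrete, here is a minimal Python sketch of what one row keyed by username might hold. It is not from the thread: the field names, the PBKDF2 work factor and the in-memory dict standing in for the Cassandra table are all assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def make_user_row(email, password):
    """Build the row that would be stored under the username (email) key."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return {"email": email, "salt": salt, "password_hash": digest, "auth_token": None}


def check_password(row, password):
    """Re-derive the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), row["salt"], ITERATIONS
    )
    return hmac.compare_digest(candidate, row["password_hash"])


def issue_token(row):
    """Store a random session token in the same row as the password hash."""
    row["auth_token"] = os.urandom(32).hex()
    return row["auth_token"]


# A dict keyed by email stands in for the Cassandra table (hypothetical).
users = {}
users["alice@example.com"] = make_user_row("alice@example.com", "s3cret")
```

Only the salted hash is stored, never the password itself; the token lives in the same row so a login needs a single lookup by key.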
On Dec 11, 2013, at 10:44 PM, Janne Jalkanen wrote:
>
> Hi!
>
> You're right, th
Hi!
You're right, this isn't really Cassandra-specific. Most languages/web
frameworks have their own way of doing user authentication, and you
typically just write a plugin that stores whatever data the system needs in
Cassandra.
For example, if you're using Java (or Scala or Groovy
Not sure if you are asking about authentication & authorisation in
Cassandra itself or how to implement the same using Cassandra.
Info on Cassandra authentication and authorisation is here:
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/security/securityTOC.h
> It is the write latency, read latency is ok. Interestingly the latency is low
> when there is one node. When I join other nodes the latency drops about 1/3.
> To be specific, when I start sending traffic to the other nodes the latency
> for all the nodes increases, if I stop traffic to other n
> What do people recommend I do to store a small binary value in a column? I’d
> rather not simply use a 32-bit int for a single byte value.
blob is a byte array
or you could use the varint, a variable length integer, but you probably want
the blob.
cheers
-
Aaron Morton
New Z
If you don’t need to use Hadoop then try the SSTableSimpleWriter and
sstableloader; this post is a little old but still relevant:
http://www.datastax.com/dev/blog/bulk-loading
Otherwise, AFAIK, BulkOutputFormat is what you want from Hadoop:
http://www.datastax.com/docs/1.1/cluster_architecture/had
You need to specify all the clustering key components in the CLUSTERING ORDER
BY clause:
create table demo(oid int,cid int,ts timeuuid,PRIMARY KEY (oid,cid,ts)) WITH
CLUSTERING ORDER BY (cid ASC, ts DESC);
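As an illustration of that rule (not part of the original reply), here is a small Python sketch that builds the clause so that every clustering column is always listed. The helper name and its defaulting of unlisted columns to ASC are assumptions.

```python
def clustering_order_clause(clustering_columns, directions):
    """Return a CLUSTERING ORDER BY clause that covers every clustering column.

    clustering_columns: column names in primary-key order, partition key excluded.
    directions: dict mapping a column name to "ASC" or "DESC"; columns that are
    not mentioned default to ASC (an assumption of this sketch).
    """
    unknown = set(directions) - set(clustering_columns)
    if unknown:
        raise ValueError("not clustering columns: %s" % sorted(unknown))
    parts = []
    for column in clustering_columns:
        direction = directions.get(column, "ASC").upper()
        if direction not in ("ASC", "DESC"):
            raise ValueError("bad direction for %s: %s" % (column, direction))
        parts.append("%s %s" % (column, direction))
    return "CLUSTERING ORDER BY (%s)" % ", ".join(parts)


# For PRIMARY KEY (oid, cid, ts) the clustering columns are cid and ts.
clause = clustering_order_clause(["cid", "ts"], {"ts": "DESC"})
print(clause)
```

Asking only for ts DESC still yields a clause that names cid first, which is the form used in the statement above.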
cheers
-
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal C
thanks, looks handy.
Cheers
-
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
On 12/12/2013, at 6:16 am, Parth Patil wrote:
> Hi Maciej,
> This looks great! Thanks for building this.
>
>
> On W
> SYSTEM_MANAGER.create_column_family('Narrative','Twitter_search_test',
> comparator_type='CompositeType', default_validation_class='UTF8Type',
> key_validation_class='UTF8Type', column_validation_classes=validators)
>
CompositeType is a type composed of other types, see
http://
If you have two nodes and RF 2, you will only be able to use eventual
consistency. If you want stronger consistency and some redundancy, 3
nodes is the minimum requirement.
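The arithmetic behind this can be sketched in a few lines of Python (an illustration added here, not part of the original reply). QUORUM is floor(RF / 2) + 1 replicas; the function names are made up for the example.

```python
def quorum(replication_factor):
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM reads and writes still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print("RF=%d quorum=%d tolerated_failures=%d"
          % (rf, quorum(rf), tolerated_failures(rf)))
```

With RF 2 a quorum needs both replicas, so losing either node blocks QUORUM operations; with RF 3 a quorum is still 2, so one node can be down.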
In the current setup, with only 2 nodes, I would use RAID 10 as it requires
less operator intervention and there
On Wed, Dec 11, 2013 at 6:27 AM, Mathijs Vogelzang
wrote:
> When I use sstable2json on the sstable on the destination cluster, it has
> "metadata": {"deletionInfo":
> {"markedForDeleteAt":1796952039620607,"localDeletionTime":0}}, whereas
> it doesn't have that in the source sstable.
> (Yes, this i
Hi,
I’m using Cassandra in an environment where many users can login to use an
application I’m developing. I’m curious if anyone has any advice or links to
documentation / blogs where it discusses common implementations or best
practices for user and password authentication. My cursory search o
> Querying the table was fast. What I didn’t do was test the table under load,
> nor did I try this in a multi-node cluster.
As the number of columns in a row increases so does the size of the column
index which is read as part of the read path.
For background and comparisons of latency see
h
> Caused by: java.io.IOException: PIG_INPUT_INITIAL_ADDRESS or
> PIG_INITIAL_ADDRESS environment variable not set
> at
> org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:314)
> at
> org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(Cassandra
> [2013-12-08 11:04:02,047] Repair session ff16c510-5ff7-11e3-97c0-5973cc397f8f
> for range (1246984843639507027,1266616572749926276] failed with error
> org.apache.cassandra.exceptions.RepairException: [repair
> #ff16c510-5ff7-11e3-97c0-5973cc397f8f on keyspace_name/col_family1,
> (12469848436
Thanks Aaron
On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton wrote:
> Changed memtable_total_space_in_mb to 1024 still no luck.
>
> Reducing memtable_total_space_in_mb will increase the frequency of
> flushing to disk, which will create more for compaction to do and result in
> increased IO.
>
> Y
> create table messages(
> body text,
> username text,
> tags set
> PRIMARY keys(username,tags)
> )
This statement is syntactically invalid. Also, you cannot use a collection
type in the primary key.
> 1) I should be able to query by username and get all the messages for
Do you have the backtrace from the heap dump so we can see what the array
was and what was using it?
Cheers
-
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
On 10/12/2013, at 4:41 am, Klaus
> Changed memtable_total_space_in_mb to 1024 still no luck.
Reducing memtable_total_space_in_mb will increase the frequency of flushing to
disk, which will create more for compaction to do and result in increased IO.
You should return it to the default.
> when I send traffic to one node its per
Thanks Rob.
Well, I ran the repair only against the empty keyspace. C* version is
1.2.8. I guess I'll try to recreate it some time next week on two or three
test hosts and if the same behaviour occurs I'll file a bug report.
Cheers,
Sven
On Thu, Dec 12, 2013 at 12:58 PM, Robert Coli wrote:
>
> What is the good practice to put in the code as addContactPoint ie.,how many
> servers ?
I use the same nodes as the seed list nodes for that DC.
The idea of the seed list is that it’s a list of well known nodes, and it’s
easier operationally to say we have one list of well known nodes that i
On Wed, Dec 11, 2013 at 1:35 AM, Sven Stark wrote:
> thanks for replying. Could you please be a bit more specific, though. Eg
> what exactly is being compacted - there is/was no data at all in the
> cluster save for a few hundred kB in the system CF (see the nodetool status
> output). Or - how can
Column metadata is about 20 bytes. So, there is no big difference if you
save 1 or 4 bytes.
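To put rough numbers on that (a back-of-the-envelope sketch, taking the 20-byte metadata figure above as an approximation):

```python
import struct

COLUMN_OVERHEAD_BYTES = 20  # approximate per-column metadata, figure quoted above

one_byte_blob = bytes([7])            # a single-byte blob value
four_byte_int = struct.pack(">i", 7)  # the same value as a 32-bit int

blob_total = COLUMN_OVERHEAD_BYTES + len(one_byte_blob)
int_total = COLUMN_OVERHEAD_BYTES + len(four_byte_int)
saving = 1 - blob_total / int_total

print(blob_total, int_total, round(saving * 100, 1))
```

Under that assumption the blob column costs about 21 bytes against about 24 for the int, a saving of roughly one eighth per column.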
Thank you,
Andrey
On Wed, Dec 11, 2013 at 2:42 PM, onlinespending wrote:
> What do people recommend I do to store a small binary value in a column?
> I’d rather not simply use a 32-bit int for a single
What do people recommend I do to store a small binary value in a column? I’d
rather not simply use a 32-bit int for a single byte value. Can I have a one
byte blob? Or should I store it as a single character ASCII string? I imagine
each is going to have the overhead of storing the length (or nul
Hi All,
I want to bulk insert data into Cassandra. I was thinking of using
BulkOutputFormat in Hadoop. Is that the best way, or is using the driver and
doing batch inserts the better way?
Are there any disadvantages of using BulkOutputFormat?
Thanks for helping
Varun
Hi All,
My use case:
I want query results ordered by timestamp DESC, but I don't want
timestamp to be the second column in the primary key, as that will take away
my querying capability.
For example:
create table demo(oid int,cid int,ts timeuuid,PRIMARY KEY
(oid,cid,ts)) WITH CLUSTERING ORDER BY (ts
This works when I remove the comparator_type:
validators = {
'tid': 'IntegerType',
'approved': 'BooleanType',
'text': 'UTF8Type',
'favorite_count':'IntegerType',
'retweet_count': 'IntegerType',
'expanded_url': 'UTF8Type
I am using ccm cassandra version
*1.2.11*
On Wed, Dec 11, 2013 at 12:19 PM, Kumar Ranjan wrote:
> validators = {
>
> 'approved': 'BooleanType',
>
> 'text': 'UTF8Type',
>
> 'favorite_count':'IntegerType',
>
> 'retweet_count': 'IntegerType',
>
> 'expanded_url':
validators = {
'approved': 'BooleanType',
'text': 'UTF8Type',
'favorite_count':'IntegerType',
'retweet_count': 'IntegerType',
'expanded_url': 'UTF8Type',
'tuid': 'LongType',
'screen_name': 'UTF8Type',
'profile_image': 'UTF8Type',
Hi Maciej,
This looks great! Thanks for building this.
On Wed, Dec 11, 2013 at 12:45 AM, Murali wrote:
> Hi Maciej,
> Thanks for sharing it.
>
>
>
>
> On Wed, Dec 11, 2013 at 2:09 PM, Maciej Miklas wrote:
>
>> Hi all,
>>
>> This is the Cassandra mailing list, but I've developed something that i
Hey Folks,
So I am creating a column family using pycassaShell. See below:
validators = {
'approved': 'BooleanType',
'text': 'UTF8Type',
'favorite_count':'IntegerType',
'retweet_count': 'IntegerType',
'expanded_url': 'UTF8Type',
'tuid': 'LongTy
What options are available depends on what version of Cassandra you're
using.
You can specify the row key type with 'key_validation_class'.
For column types, use 'column_validation_classes', which is a dict mapping
column names to types. For example:
sys.create_column_family('mykeyspace', 'user
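A fuller sketch of the same idea in Python follows. The keyword names key_validation_class, default_validation_class and column_validation_classes are the ones used elsewhere in this thread; the helper names, the UTF8Type defaults and the localhost:9160 server address are assumptions, and the pycassa import is deferred so the kwargs can be built and inspected without a running cluster.

```python
def column_family_kwargs(validators):
    """Build keyword arguments for create_column_family.

    validators: dict mapping column names to type names such as 'UTF8Type'.
    """
    return {
        "key_validation_class": "UTF8Type",      # row key type (assumed)
        "default_validation_class": "UTF8Type",  # type for unlisted columns (assumed)
        "column_validation_classes": dict(validators),
    }


def create_column_family(keyspace, name, validators, server="localhost:9160"):
    """Create the column family; needs pycassa and a reachable Cassandra node."""
    from pycassa.system_manager import SystemManager  # third-party dependency

    sys_mgr = SystemManager(server)
    try:
        sys_mgr.create_column_family(keyspace, name, **column_family_kwargs(validators))
    finally:
        sys_mgr.close()


kwargs = column_family_kwargs({"approved": "BooleanType", "tuid": "LongType"})
print(sorted(kwargs))
```

Which options are accepted still depends on the Cassandra and pycassa versions in use, as noted above.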
What are all the possible values for cf_kwargs?
SYSTEM_MANAGER.create_column_family('Narrative','Twitter_search_test',
comparator_type=UTF8Type, )
- Here I want to specify the column data types and the row key type. How can
I do that?
On Thu, Aug 15, 2013 at 12:30 PM, Tyler Hobbs wrote:
Thanks Artur,
You're right, I must comment out the restore directory too.
Now I'll try to practice the restore.
Regards,
Bonnet Jonathan.
Very good point. I've written code to do a very large number of inserts, but
I've only ever run it on a single-node cluster. I may very well find out
when I run it against a multinode cluster that the performance benefits of
large unlogged batches mostly go away.
From: Sylvain Lebresne
Reply-To:
So, looking at the code:
public void maybeRestoreArchive()
{
if (Strings.isNullOrEmpty(restoreDirectories))
return;
for (String dir : restoreDirectories.split(","))
{
File[] files = new File(dir).listFiles();
if (files == null)
Hi all,
We're running into a weird problem trying to migrate our data from a
1.2.10 cluster to a 2.0.3 one.
I've taken a snapshot on the old cluster, and for each host there, I'm running
sstableloader -d KEYSPACE/COLUMNFAMILY
(the sstableloader process from the 2.0.3 distribution, the one from
1
Artur Kronenberg writes:
>
>
> hi Bonnet,
> that doesn't seem to be a problem with your archiving, rather with
> the restoring. What is your restore command?
> -- artur
> On 11/12/13 13:47, Bonnet Jonathan. wrote:
>
>
>
Thanks for the answer
I didn't do any warming up etc. I am new to Cassandra and was just
poking around with some scripts to try to find the fastest way to do
things. That said all the mini-tests ran under the same conditions.
In our case the batches will have a variable number of different
inserts/updates in them so do
On Wed, Dec 11, 2013 at 1:52 PM, Robert Wille wrote:
> Network latency is the reason why the batched query is fastest. One trip
> to Cassandra versus 1000. If you execute the inserts in parallel, then that
> eliminates the latency issue.
>
While it is true a batch will mean only one client-serv
hi Bonnet,
that doesn't seem to be a problem with your archiving, rather with the
restoring. What is your restore command?
-- artur
On 11/12/13 13:47, Bonnet Jonathan. wrote:
Bonnet Jonathan writes:
>
>Thanks a lot,
>
>It works, I see the commit log being archived.
Bonnet Jonathan writes:
>
> Thanks a lot,
>
> It works, I see the commit log being archived. I'll try the restore
> command tomorrow. Thanks again.
>
> Bonnet Jonathan.
>
>
Hello,
I restarted a node today, and I have an error which seems to be
related to c
Network latency is the reason why the batched query is fastest. One trip to
Cassandra versus 1000. If you execute the inserts in parallel, then that
eliminates the latency issue.
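The effect can be demonstrated without a cluster. In this Python sketch (added for illustration) a sleep stands in for one network round trip; fake_insert, the 2 ms latency and the worker count are all assumptions, not driver calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor

ROUND_TRIP_SECONDS = 0.002  # assumed network latency per request


def fake_insert(i):
    """Hypothetical stand-in for one insert: just waits one round trip."""
    time.sleep(ROUND_TRIP_SECONDS)
    return i


def run_sequential(n):
    """Issue the inserts one after another, paying the latency n times."""
    start = time.perf_counter()
    results = [fake_insert(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_parallel(n, workers=32):
    """Issue the inserts from a thread pool so the round trips overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_insert, range(n)))
    return results, time.perf_counter() - start


seq_results, seq_time = run_sequential(200)
par_results, par_time = run_parallel(200)
print("sequential %.3fs, parallel %.3fs" % (seq_time, par_time))
```

The sequential loop pays every round trip in full, while the parallel version overlaps them, which is the same saving a single batch gets from making only one trip.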
From: Sylvain Lebresne
Reply-To:
Date: Wednesday, December 11, 2013 at 5:40 AM
To: "user@cassandra.apache.org"
S
I use hand-rolled batches a lot. You can get a *lot* of performance
improvement. Just make sure to sanitize your strings.
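A minimal Python sketch of what a hand-rolled batch with sanitized strings could look like (an illustration, not the code referred to above). It relies on CQL string literals escaping a single quote by doubling it; the table and column names are taken from the test.wibble example elsewhere in this thread, and the helper names are made up.

```python
def cql_quote(value):
    """Quote a Python string as a CQL string literal (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def build_unlogged_batch(table, rows):
    """Build one UNLOGGED BATCH statement from (id, info) string pairs.

    The table name is assumed to be trusted, not user input.
    """
    statements = [
        "INSERT INTO %s (id, info) VALUES (%s, %s);"
        % (table, cql_quote(row_id), cql_quote(info))
        for row_id, info in rows
    ]
    return "BEGIN UNLOGGED BATCH\n" + "\n".join(statements) + "\nAPPLY BATCH;"


batch = build_unlogged_batch("test.wibble", [("1", "aa1"), ("2", "O'Brien")])
print(batch)
```

Only string values are handled here; other types would need their own literal formatting.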
I've been wondering, what's the limit, practical or hard, on the length of
a query?
Robert
On 12/11/13, 3:37 AM, "David Tinker" wrote:
>Yes, that's what I found.
>
>This is f
Then I suspect that this is an artifact of your test methodology. Prepared
statements *are* faster than non prepared ones in general. They save some
parsing and some bytes on the wire. The savings will tend to be bigger for
bigger queries, and it's possible that for very small queries (like the one
yo
Yes, that's what I found.
This is faster:
for (int i = 0; i < 1000; i++) session.execute("INSERT INTO
test.wibble (id, info) VALUES ('${"" + i}', '${"aa" + i}')")
Than this:
def ps = session.prepare("INSERT INTO test.wibble (id, info) VALUES (?, ?)")
for (int i = 0; i < 1000; i++) session.execute
Hi Rahul,
thanks for replying. Could you please be a bit more specific, though. Eg
what exactly is being compacted - there is/was no data at all in the
cluster save for a few hundred kB in the system CF (see the nodetool status
output). Or - how can those few hundred kB in data generate Gb of netw
Sven
When you run a repair you are essentially telling your cluster
to run a validation compaction, which generates a Merkle tree on each of the
nodes. These trees are used to identify the inconsistencies. So there is
quite a bit of streaming, which you see as your network traffic.
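For anyone unfamiliar with the idea, here is a much simplified Python sketch of comparing two replicas by hash tree (added for illustration; this is not Cassandra's implementation). The bucket count, the hashing scheme and the function names are assumptions, and the bucket count must be a power of two.

```python
import hashlib


def leaf_hashes(rows, buckets=4):
    """Hash rows into a fixed number of buckets (stand-ins for token ranges)."""
    digests = [hashlib.sha256() for _ in range(buckets)]
    for key in sorted(rows):
        bucket = int(hashlib.sha256(key.encode()).hexdigest(), 16) % buckets
        digests[bucket].update(("%s=%s;" % (key, rows[key])).encode())
    return [d.digest() for d in digests]


def root_hash(leaves):
    """Combine leaf hashes pairwise until a single root hash remains."""
    level = list(leaves)
    while len(level) > 1:
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def mismatched_buckets(leaves_a, leaves_b):
    """Indexes of the buckets whose contents differ between two replicas."""
    return [i for i, (a, b) in enumerate(zip(leaves_a, leaves_b)) if a != b]


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k1": "v1", "k2": "stale", "k3": "v3"}
leaves_a, leaves_b = leaf_hashes(replica_a), leaf_hashes(replica_b)
print(root_hash(leaves_a) == root_hash(leaves_b), mismatched_buckets(leaves_a, leaves_b))
```

Equal roots mean the replicas agree; when they differ, only the mismatched buckets need to be exchanged.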
Rahul
> This loop takes 2500ms or so on my test cluster:
>
> PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble
> (id, info) VALUES (?, ?)")
> for (int i = 0; i < 1000; i++) session.execute(ps.bind("" + i, "aa" + i));
>
> The same loop with the parameters inline is about 1300ms. It gets
Hi Maciej,
Thanks for sharing it.
On Wed, Dec 11, 2013 at 2:09 PM, Maciej Miklas wrote:
> Hi all,
>
> This is the Cassandra mailing list, but I've developed something that is
> strictly related to Cassandra, and some of you might find it useful, so
> I've decided to send email to this group.
Hi all,
This is the Cassandra mailing list, but I've developed something that is
strictly related to Cassandra, and some of you might find it useful, so
I've decided to send email to this group.
This is a web-based CQL3 editor. The idea is to deploy it once and have a
simple and comfortable CQL3 int
Hi,
What about using JBOD and replication factor 2?
Regards.
On 11 Dec 2013 02:03, "cem" wrote:
> Hi all,
>
> I need to set up a 2-node Cassandra cluster. I know that Datastax
> recommends using JBOD as a disk configuration and have replication for the
> redundancy. I was planning to use RAID 10