and/or use the current tip of the cassandra-1.2 branch.
As I am on Debian packages, I'd rather not :-)
Thanks for the quick insights!
Jan
>
> --
> Sylvain
>
>
> On 19.12.2013, at 11:39, Sylvain Lebresne wrote:
>
> > https://issues.apache.org/jira/browse/CA
ear of these features
(e.g. manually disable paging when working with ranges and do it yourself).
Jan
> - Sanjeeth
tem be given its own cluster?
Or did I maybe miss another option?
Jan
Hi,
does anyone know of a C-driver that can be / has been used with nginx?
I am afraid that the C++ driver's[1] threading and connection pooling approach
interferes with nginx's threading model.
Does anyone have any ideas?
Jan
[1] https://github.com/datastax/cpp-driver
On 15 Mar 2014, at 19:23, Joe Stein wrote:
> Here is an example wrapper showing how to use the DataStax java driver in scala
> https://github.com/stealthly/scala-cassandra
Thanks Joe,
yes, better to just use the java-driver than to create a generic scala-driver
it seems
ll tested.
Thank you for the details (and creating the driver in the first place!) -
Sounds convincing.
Jan
>
> yours
> Theo
>
>
> On Wed, Mar 19, 2014 at 5:21 PM, Theo Hultberg wrote:
> I'm the author of cql-rb, the first one on your list. It runs in production
and can suggest a ‘clever’ data model design
and interaction?
Jan
ter that can (potentially) do CAS.
Why would I set up a MySQL cluster to solve that problem?
And yeah, I could use a queue or redis or whatnot, but I want to avoid yet
another moving part :-)
Jan
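P.S. For reference, the kind of CAS I mean is a conditional write via
lightweight transactions (C* 2.0+). A rough, untested sketch with
illustrative names:

CREATE TABLE locks (
    resource text PRIMARY KEY,
    owner text
);

-- of many concurrent workers, only one will see [applied] = true
INSERT INTO locks (resource, owner)
VALUES ('feed-consumer', 'worker-1')
IF NOT EXISTS;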
>
>
> On Thu, Apr 3, 2014 at 11:44 PM, Jan Algermissen
> wrote:
> Hi,
>
> ma
Hi DuyHai,
On 04 Apr 2014, at 13:58, DuyHai Doan wrote:
> @Jan
>
> This subject of distributed workers & queues has been discussed in the
> mailing list many times.
Sorry + thanks.
Unfortunately, I do not want to use C* as a queue, but to coordinate workers
that page
Hi Duy Hai,
On 04 Apr 2014, at 20:48, DuyHai Doan wrote:
> @Jan
>
> Your use-case is different from what I thought. So basically you have only
> one data source (the feed) and many consumers (the workers)
>
> Only one worker is allowed to consume the feed at a time.
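Exactly. What I have in mind is a lease that expires by itself if the active
worker dies - roughly like this (untested sketch, names illustrative):

-- acquire the lease; the TTL makes it expire if the holder disappears
INSERT INTO leases (name, holder)
VALUES ('feed', 'worker-1')
IF NOT EXISTS
USING TTL 30;

-- the active worker renews the lease while it keeps consuming
UPDATE leases USING TTL 30
SET holder = 'worker-1'
WHERE name = 'feed'
IF holder = 'worker-1';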
The parameters used for on-disk storage are commitlog_directory,
data_file_directories and saved_caches_directory. The parameter
data_file_directories is plural: you can easily put more than one
directory here (and you should do this instead of using RAID).
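In cassandra.yaml this looks roughly like the following (paths are made up,
adjust to your disk layout):

commitlog_directory: /disk1/cassandra/commitlog
data_file_directories:
    - /disk2/cassandra/data
    - /disk3/cassandra/data
saved_caches_directory: /disk1/cassandra/saved_caches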
Cheers,
Jan
On 07.04.2014 12:56
valid for SizeTieredCompaction; Leveled
or Hybrid compactions are "cheaper" on disk space.
For the last point - there are many tools to monitor your servers inside
your cluster. Nagios, Hyperic HQ and OpenNMS are some of them - you can
define alerts which keep you up to date.
Cheers,
jan
about 200g per node
- running a "nodetool repair -pr" on one of the nodes seems to run
forever, right now it's running for 2 complete days and does not return.
Any suggestions?
Thanks in advance,
Jan
solves the issue. Thanks for that hint!
Cheers,
Jan
Hi,
can anyone point me to recommendations for hosting and configuration
requirements when running a Production Cassandra Cluster at Rackspace?
Are there reference projects that document the suitability of Rackspace for
running a production Cassandra cluster?
Jan
modification is hitting the database?
Alternatively, what do others do to handle schema migrations during continuous
delivery processes?
Jan
Colin,
On 18 May 2014, at 15:29, Colin wrote:
> Hi Jan,
>
> Try waiting a period of time, say 60 seconds, after modifying the schema so
> the changes propagate throughout the cluster.
>
> Also, you could add a step to your automation where you verify the schema
> ch
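Thanks. For the verification step: the schema version can be read back in
plain CQL, and every node should report the same value before the tests
proceed. A minimal sketch (the comparison logic is up to your tooling):

SELECT schema_version FROM system.local;
SELECT peer, schema_version FROM system.peers;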
On 18 May 2014, at 10:30, Jan Algermissen wrote:
> Hi,
>
> in our project, we apparently have a problem or misunderstanding of the
> relationship between schema changes and data updates.
>
> One team is doing automated tests during build and deployment that executes
>
password: cassandra
Now the question that puzzles me: if I disable encryption and start both
nodes, they join each other and I have a working cluster. If I enable
encryption they do not join any longer and I have two separate nodes.
Any hints?
Thanks,
Jan
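P.S. For reference, this is roughly what both nodes have in cassandra.yaml
(paths and passwords are examples). My understanding is that each node's
certificate must also be present in the other node's truststore, otherwise
they will fail to gossip:

server_encryption_options:
    internode_encryption: all
    keystore: /etc/cassandra/conf/.keystore
    keystore_password: cassandra
    truststore: /etc/cassandra/conf/.truststore
    truststore_password: cassandra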
Hi,
I tried to scrub the keyspace, but with no success either: the process
threw an exception when hitting the corrupt block and then stopped. I
will re-bootstrap the node :-)
Thanks anyways,
Jan
On 03.07.2013 19:10, Glenn Thompson wrote:
For what it's worth, I did this when I had this
n some way - and a
scrub and repair should fix that I suppose.
Since the original cluster has a replication factor of 3 - shouldn't the
import from 5 of 6 snapshots contain all data? Or is the sstableloader
tool clever enough to avoid importing duplicate data?
Thanks for hints,
Jan
Can't
I import two of the three replicas, so that the data is complete?
Sent from my iPhone
On 18.07.2013 at 19:06, sankalp kohli wrote:
> sstable might be corrupted due to bad disk. In that case, replication does
> not matter.
>
>
> On Thu, Jul 18, 2013 at 8:52 A
with an exception but no hint as to which file was affected. So I
replayed the sstables one by one and finally found the corrupt one.
Thanks to all,
Jan
--
Jan Kesten, mailto:j.kes...@enercast.de
Tel.: +49 561/4739664-0 FAX: -9
enercast GmbH Friedrich-Ebert-Str. 104 D-34119 Kassel HRB15471
Hi,
I am Jan Algermissen (REST-head, freelance programmer/consultant) and
Cassandra-newbie.
I am looking at Cassandra for an application I am working on. There will be a
max. of 10 Million items (Texts and attributes of a retailer's products) in the
database. There will be occasional writes
Hi,
second question:
is it recommended to set up Cassandra using 'RAID-ed' disks for per-node
reliability, or do people usually just rely on having multiple nodes anyway
- why bother with replicated disks?
Jan
nable?
How should I plan the disk sizes and number of CPU cores?
Are there any other configuration mistakes to avoid?
Is there online documentation that discusses such VM sizing questions in more
detail?
Jan
for all my keys (plus some overhead of
course)?
Jan
>
> Any memory not allocated to a process will generally be put to good use
> serving as page cache. See here: http://en.wikipedia.org/wiki/Page_cache
>
> Jon
>
>
> On Tue, Jul 30, 2013 at 10:51 PM, Jan Alge
Hi Shahab,
On 31.07.2013, at 15:59, Shahab Yunus wrote:
> Hi Jan,
>
> One question...you say
> "- I must make sure the disks are directly attached, to prevent
> problems when multiple nodes flush the commit log at the
> same time"
I read that using Cassandra
in a JPA integration)
Jan
SVN like the Wiki[1] says, but got compile
errors. Do you know what the quality of cassandra-jdbc is?
Jan
>
> Just my .2 cents worth
>
> Andy
>
>
> On 3 Aug 2013, at 08:28, Jan Algermissen wrote:
>
>> Hi,
>>
>> I plan on using Cassandra in a
Tony,
On 03.08.2013, at 16:36, Tony Anecito wrote:
> I use the DataStax driver and am happy with it so far.
Thanks. Also the question: are you talking about
https://github.com/datastax/java-driver or cassandra-jdbc?
Jan
> Also, think about if driver is being worked on as Cassandr
ing to
understand which driver will integrate with JavaEE in the most natural (aka
'standard') way.
Jan
> Its binary protocol allows multiplexing, is vnode token aware, and does not
> require serialize/deserialize to thrift. We used astyanax before and it did
> work
Hi,
I think it does not fit the model of how C* does writes, but just to verify:
Is there an update-in-place possibility on maps? That is, could I do an atomic
increment on a value in a map?
Jan
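P.S. To make the question concrete (table and column names are made up):
setting a single map entry works without reading the row first, but I see
no way to increment one:

-- works: overwrite one map entry in place
UPDATE items SET attrs['views'] = '42' WHERE id = 'x';

-- what I would like, but which does not exist for maps:
-- UPDATE items SET attrs['views'] = attrs['views'] + 1 WHERE id = 'x';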
or to the row as a whole?
Jan
ted to make sure I do not read
anything into the docs that isn't there.
As for the atomic increment, I take it the answer is 'no, there is no atomic
increment; I have to pull the value to the client and send an update with the
new value'.
Jan
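P.S. For the archives: the only atomic increment is on counter columns,
which live in a dedicated table - something like this (names illustrative):

CREATE TABLE hit_counts (
    id text PRIMARY KEY,
    hits counter
);

UPDATE hit_counts SET hits = hits + 1 WHERE id = 'x';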
>
> Alain
>
>
> 20
On 06.08.2013, at 11:36, Andy Twigg wrote:
> Store pointers to counters as map values?
Sorry, but this fits into nothing I know about C* so far - can you explain?
Jan
question: I read in a C* 1.1 related slide deck that Hadoop output to
CFS is only possible with DSE and not with DSC - that with DSC the Hadoop
output would be HDFS. Is that correct? For homogeneity, I would certainly want
to store the output files in CFS, too.
Sorry, that this was a bit of a longer question/explanation.
Jan
acters (e.g. - , HAAA
to ...).
Makes sense?
Jan
>
> Cheers
>
>
>
> -
> Aaron Morton
> Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 10/08/2013, at 8:15 PM, Jan Algermissen wrote:
I'd be doing
SELECT * FROM users WHERE token(name) >
token(one-before-the-last-name-of-prev-result) LIMIT 100;
Question: Is that what I have to do or is there a way to make token() and limit
work together to return complete wide rows?
Jan
[1] token() and how it relates to paging is actually quite hard to grasp from
the docs.
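To spell out the loop I mean (untested sketch; note that a page boundary can
fall in the middle of a wide row, and the token() predicate then skips the
remainder of that row):

-- first page
SELECT * FROM users LIMIT 100;

-- subsequent pages: resume after the last partition key seen
SELECT * FROM users
WHERE token(name) > token('last-name-of-prev-result')
LIMIT 100;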
do I have to handle this in
the application (e.g. doing delete-then-write, or coming up with a more
clever data model)?
Jan
one with a simple LIMIT 1 query. (I'll
explain more when this is an option for you).
Jan
P.S. Me being a REST/HTTP head, an alarm rings when I see 'version' next to
'mimetype' :-) What exactly are you versioning here? Maybe we can even change
the situation from a functional POV?
>
> Regards,
>
> Dawood
>
>
>
>
CREATE TABLE file (
    file_id text,
    version timestamp,
    fname text,
    PRIMARY KEY (file_id, version)
) WITH CLUSTERING ORDER BY (version DESC);
Get the latest file version with
select * from file where file_id = 'xxx' limit 1;
If it has to be an integer, use counter columns.
Jan
>
-driver-user/APfnKNTXuvE/gBeCk37jgRAJ>
Jan
>
> > From: jan.algermis...@nordsc.com
> > Subject: Update-Replace
> > Date: Fri, 30 Aug 2013 17:35:48 +0200
> > To: user@cassandra.apache.org
> >
> > Hi,
> >
> > I have a use case, where I periodi
tificially' to prevent the
loss of hosts - does that make sense?
Can anyone explain whether this is intended behavior, meaning I'll just have to
accept the self-shutdown of the hosts? Or alternatively, what data should I
collect to investigate the cause further?
Jan
The subject line isn't appropriate - the servers do not crash but shut down.
Since the log messages appear several lines before the end of the log file, I
only saw them afterwards. Excuse the confusion.
Jan
On 04.09.2013, at 10:44, Jan Algermissen wrote:
> Hi,
>
> I have set u
I'd expect C* to sort of just suck in my rather small amount of data - must be
me, not using the right configuration. Oh well, I'll get there :-) Thanks
anyhow.
Jan
>
> Romain
>
INFO [ScheduledTasks:1] 2013-09-04 07:17:09,057 StatusLogger.java (line 96)
KeyCache
at java.lang.Thread.run(Thread.java:724)
ERROR [CompactionExecutor:5] 2013-09-06 11:02:28,685 CassandraDaemon.java (line
192) Exception in thread Thread[CompactionExecutor:
On the other hosts the log looks similar, but these keep running, despite the
OutOfMemory errors.
Jan
>
>
monitor to see the
development of the causes of the out of memory error and what other switches I
should try out?
Jan
[1]
# emergency pressure valve: each time heap usage after a full (CMS)
# garbage collection is above this fraction of the max, Cassandra will
# flush the largest memtables
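# the parameter this comment refers to (0.75 was the shipped default at the
# time; check your own cassandra.yaml):
flush_largest_memtables_at: 0.75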
-20 consecutive rows
pertaining to the same wide row. Maybe that is causing too frequent compactions
with high volume?
Jan
On 06.09.2013, at 17:07, Jan Algermissen wrote:
>
> On 06.09.2013, at 13:12, Alex Major wrote:
>
>> Have you changed the appropriate config settings so that Cassandra will run
>> with only 2GB RAM? You shouldn't find the nodes go down.
>>
>
when
memtables build up and then get flushed. But what about that third node?
Has anyone seen something similar?
Jan
C* dsc 2.0, 3x 4GB, 2CPU nodes with heavy writes of 70 col-rows (approx 10 of
those rows per wide row)
I have turned off caches, reduced overall memtable and set flush
imal
memory I need for that cannot be tuned down but depends on the size of the
stuff written to C* (due to C* doing its memtable magic in order to use
sequential IO).
It is an interesting trade-off (if I get it right by now :-)
Jan
>
> On Friday, September 6, 2013, Jan Algerm
product of the SEDA
architecture, though.
I switched back from hsha to sync and increased memtable max size and heap.
That did the trick. Now it flies.
Jan
>
> Like you, doing testing with heavy writes.
>
> I was using a python client to drive the writes using the cql module whi
organization works
(caveat GC).
Cassandra takes this eagerness for consuming writes and organizing them
in memory to such an extreme that any given node will rather die than stop
consuming writes.
I am especially looking for confirmation of that last point.
Jan
is always this segment of an arc which
shows the increasing unflushed memtables.
Jan
follow-up requests that
still time out.
Any idea how to approach this problem?
Jan
o really investigate C* behavior (e.g.
nodes talking to each other, replication, CAS)
- Check whether your VMs have the storage directly attached (unlikely) or
whether they share it with other VMs (which isn't optimal)
HTH,
Jan
>
> Mostly this instance runs smoothly but runs low on memory. D
Seems I am impacted by a paging bug in C* 2.0.0 and need to go to 2.0.1.
Is there any estimate for when the (Centos 6/64bit) RPM will be updated?
I'd rather wait than change the installation procedure - but it depends on the
timeline for me.
Jan
Have you considered the new built-in paging support:
http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
Jan
>
> select * from log where token(mykey) > token(maxTimeuuid(x)) limit 100;
>
> (where xxx is 0 for the first query, and next one to be the time of the
Maybe you are hitting the problem that your 'pages' can get truncated in the
middle of a wide row.
See
https://groups.google.com/a/lists.datastax.com/d/msg/java-driver-user/lHQ3wKAZgM4/DnlXT4IzqsQJ
Jan
On 01.10.2013, at 18:12, Jimmy Lin wrote:
> Unfortunately, I have to stick
can enable the row cache
for the seldom changing, larger sized portion of the data?
Or would the effect likely be rather marginal?
Jan
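P.S. If it helps, per-table row caching can be switched on with something
like this (2.1-style syntax; table name made up):

ALTER TABLE products
WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};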
r restriction, C* would have to do a full
scan over all rows to find those rows that have a 'myvalue' C-part.
Jan
> My aim is to query for more than one value in the c column. Is this supported?
>
> Thanks,
> Petter
>
>
>
>
>
you can't use them in this way?
What you have in your example are parts of a primary key.
Secondary indexes are defined in a different way (I never use them because they
sort of go against the whole point of using C*, IMHO, so I don't know right now
how to do it - check the docs).
Jan
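P.S. For the archives: a secondary index is declared on a regular column,
separately from the primary key - roughly like this (whether the indexed
column may be part of the primary key depends on the version):

CREATE INDEX ON mytable (c);

SELECT * FROM mytable WHERE c = 'myvalue';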
>
Then I am curious, too, what the intended behaviour is.
Jan
> I might end up not using secondary indexes but since the feature is there and
> I need the functionality I would like to know its limitations and if the
> behaviour I am experiencing is a problem with my design, a problem wit
On Sep 7, 2010, at 11:05 AM, Ying Tang wrote:
Sorry, I didn't put it clearly.
The app throws a TimeoutException,
but Cassandra throws an ArrayIndexOutOfBoundsException.
And if I shorten this key's length, even by one letter, the indexed
column insert is successful.
But if
Hi
I need to use the low level java API in order to test bulk ingestion
to cassandra. I have already looked at the code in
contrib/bmt_example and contrib/client_only.
When I try and run the following code, I get the following exception;
using cassandra-cli I am able to see the "Keyspace1
Thanks
Once I am using the cli script, it is able to connect to the local server
and see all keyspaces etc. Do I still need to load schemas when using
the same local server?
Asif
On Sep 7, 2010, at 8:58 PM, Peter Harrison wrote:
On Wed, Sep 8, 2010 at 3:20 AM, Asif Jan wrote:
Hi
I need to use the low level java API in order to test bulk ingestion to
cassandra. I have already looked at the code in contrib/bmt_example and
contrib/client_only.
When I try and run
Hi
Did you register the cassandra jars with Pig? This seems like a class
loading problem.
aj
On Sep 10, 2010, at 3:38 AM, Mark wrote:
Does anyone know of any good tutorials for using Pig with Cassandra?
I am trying to do a basic load:
rows = LOAD 'cassandra://Foo/Bar' USING CassandraStorage();
Hi
I am using 0.7.0-beta1, and trying to get the contrib/client_only
example to work.
I am running cassandra on host1, and trying to access it from host2.
When using thrift (via cassandra-cli) and in my application, I am able
to connect and do all operations as expected.
But I am not a
earlier this week when
I tried using it. It needs some fixes to keep up with all the 0.7
changes.
Gary.
On Thu, Sep 16, 2010 at 05:48, Asif Jan wrote:
Hi
I am using 0.7.0-beta1, and trying to get the contrib/client_only example
to work.
I am running cassandra on host1, and trying to access it
wn] use twitter;
Keyspace 'twitter' not found.
So we can't create the keyspace. The cassandra version is 1.1.0.
What is the proper way to deal with this?
Thanks for your help.
Arend-Jan
--
Arend-Jan Wijtzes -- Wiseguys -- www.wise-guys.nl
are there references in the
tables to the keyspace name?
On Wed, Aug 08, 2012 at 03:03:52PM +0200, Arend-Jan Wijtzes wrote:
> Hi,
>
> Today we rebooted a node in our cluster for maintenance and after that
> one of the keyspaces went missing. This is what we did leading up to
> thi
On Wed, Aug 08, 2012 at 05:08:56PM +0200, Mateusz Korniak wrote:
> On Wednesday 08 of August 2012, Arend-Jan Wijtzes wrote:
> > Forgot to mention that the keyspace 'twitter' was created, then dropped
> > and re-created a couple of days ago.
> >
> > How abou
Hi,
We are running Cassandra 1.1.4 and would like to experiment with
Datastax Enterprise which uses 1.0.8. Can we safely downgrade
a production cluster or is it incompatible? Any special steps
involved?
Arend-Jan
--
Arend-Jan Wijtzes -- Wiseguys -- www.wise-guys.nl
On Thu, Sep 20, 2012 at 10:13:49AM +1200, aaron morton wrote:
> No.
> They use different minor file versions which are not backwards compatible.
Thanks Aaron.
Is upgradesstables capable of downgrading the files to 1.0.8?
Looking for a way to make this work.
Regards,
Arend-Jan
>
rite implements full usage of
https://github.com/datastax/spark-cassandra-connector and brings much of
its goodness to PySpark!
Hope that some of you are able to put this to good use. And feedback, pull
requests, etc. are more than welcome!
Best regards,
Frens Jan
ing a repartition is way too expensive, as I just want more
partitions for parallelism, not a reshuffle ...
Thanks in advance!
Frens Jan
2.1.2 on 64bit fedora 21 with Oracle
Java 1.8.0_31.
Thanks in advance,
Frens Jan
March 2015 at 18:10, DuyHai Doan wrote:
> First idea to eliminate any issue with regards to stale data: issue the
> same count query with CL=QUORUM and check whether there are still
> inconsistencies
>
> On Tue, Mar 10, 2015 at 9:13 AM, Rumph, Frens Jan
> wrote:
>
>
77 to -594461978511041000 were
included. In a case which yielded many more partition keys, the entire
token range did seem to be queried.
To reiterate my initial questions: is this behavior to be expected? Am I
doing something wrong? Is there a workaround?
Best regards,
Frens Jan
On 4 March
moving average, I have to deal with data all
over the place. I can't currently think of anything but performing
aggregateByKey causing a shuffle every time.
Anyone have experience with combining time series chunking and computation
on all / many time series at once? Any advice?
Cheers,
Frens Jan
llent FC SAN for storage
Cassandra: 3-6 node 2vCpu Centos guest boxes (RF=2)
Jan-Taeke Schuilenga
Which variables (for instance throughput, CPU, I/O, connections) are
leading in deciding to add a node to a Cassandra setup which is put
under strain? We are trying to prove scalability, but when is the right
time to add a node to get the optimum scalability result?
25.00%
127605887595351923798765477786913079296
This has been going on for days. Note that it's just two keys in the log
that keep repeating. No recent messages about HintedHandOff in the logs
on the other nodes.
Let me know if you need more info.
Arend-Jan