I am using Pig with Cassandra (Cassandra 2.1.2, Pig 0.14, Hadoop 2.6.0
combo).
When I use CqlStorage() I get
org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
org.apache.cassandra.exceptions.ConfigurationException: Unable to find
inputformat class 'org.apache.cassandra.hadoop.cql3
Hi,
I am using incremental repair in Cassandra 2.1.2 right now, and I am wondering:
is there any API through which I can get the progress of the current repair job?
That would be a great help. Thanks.
Regards,
-Jieming-
Yep, you may register and log into the Apache JIRA and click "Vote for this
issue" in the upper right side of the ticket.
On Wed, Jan 21, 2015 at 11:30 PM, Ian Rose wrote:
> Ah, thanks for the pointer Philip. Is there any kind of formal way to
> "vote up" issues? I'm assuming that adding a co
@jack thanks for taking the time to respond. I agree I could totally redesign
and rewrite it to fit in the newer CQL3 model, but are you really
recommending I throw 4 years of work out and completely rewrite code that
works and has been tested?
Ignoring the practical aspects for now and exploring the
At last year's summit there was a presentation from Instaclustr -
https://www.instaclustr.com/meetups/presentation-by-ben-bromhead-at-cassandra-summit-2014-san-francisco/.
It could be the solution you are looking for. However, I don't see the code
being checked in or a JIRA being created. So for now y
Ian,
Leaving a comment explaining your situation and how, as an operator of a
Cassandra cluster, this would be valuable would probably help most.
On Thu, Jan 22, 2015 at 6:06 AM, Paulo Ricardo Motta Gomes <
paulo.mo...@chaordicsystems.com> wrote:
> Yep, you may register and log into the Apache
I’m not sure where to send “faults” for the DataStax DevCenter, so I’ll send
them here. If I define a UDT such as:
CREATE TYPE IF NOT EXISTS sensorsync.SensorReading (
    fValue float,
    sValue text,
    iValue int
);
and a table:
CREATE TABLE IF NOT EXISTS sensorsync.Sensors (
    name uuid,
    insertion_tim
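For reference, a minimal sketch of how the full table might combine with the UDT above; the insertion_time column name and the key layout are assumptions (in Cassandra 2.1 a UDT column must be frozen):
CREATE TABLE IF NOT EXISTS sensorsync.Sensors (
    name uuid,
    insertion_time timestamp,          -- assumed completion of the truncated column
    reading frozen<SensorReading>,     -- 2.1 requires UDT columns to be frozen
    PRIMARY KEY (name, insertion_time)
);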
Thanks for the feedback, Andy. I'll forward this to the DevCenter team.
Currently we have an email for sending feedback our way:
devcenter-feedb...@datastax.com. And the good news is that in the next
release there will be an integrated feedback form directly in DevCenter.
On Thu, Jan 22, 2015 at 8
I have been searching all over the documentation but could not find a straight
answer.
For a project I'm using a single-node Cassandra database (so far)... It has
always worked well, but I'm reading everywhere that I should do a nodetool
repair at least every week, especially if I delete rows, which
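Background on that advice: deletes write tombstones, and every replica has to see them (via repair, for instance) before the table's gc_grace_seconds window expires, otherwise deleted data can resurface. A sketch of adjusting that window; the keyspace/table names are hypothetical, and 864000 seconds (10 days) is the default:
ALTER TABLE my_keyspace.my_table          -- hypothetical names
WITH gc_grace_seconds = 864000;           -- 10 days, the default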
On Thu, Jan 22, 2015 at 9:36 AM, SEGALIS Morgan wrote:
> So I wondered, does a nodetool repair make the server stop serving
> requests, or does it just use a lot of resources but still serve requests?
>
In pathological cases, repair can cause a node to seriously degrade. If you
are operating c
I don't think you can do a nodetool repair on a single-node cluster.
Still, one day or another you'll have to reboot your server, at which point
your cluster will be down. If you want high availability, you should use a
3-node cluster with RF = 3.
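A sketch of the keyspace definition such a 3-node, single-DC setup implies; the keyspace name is hypothetical:
CREATE KEYSPACE IF NOT EXISTS my_keyspace  -- hypothetical name
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};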
On 22 January 2015 at 18:10, Robert Coli wrote:
Running a 'nodetool repair' will 'not' bring the node down.
Your question: does a nodetool repair make the server stop serving requests, or
does it just use a lot of resources but still serve requests?
Answer: NO, the server will not stop serving requests. It will use
some resource
what do you mean by "operating correctly"?
I only use dynamic columns if that helps...
2015-01-22 19:10 GMT+01:00 Robert Coli :
> On Thu, Jan 22, 2015 at 9:36 AM, SEGALIS Morgan
> wrote:
>
>> So I wondered, does a nodetool repair make the server stop serving
>> requests, or does it just use a l
If I change the network topology, do I have to run repair right before adding
a new cluster?
I know that I should add 2 more nodes; so far, I'm preparing to
create a new node in a new DC, but on the same network (really low ping), so at
least I would have a backup server if anything happens, just
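For reference, running a second DC generally means switching the keyspace to NetworkTopologyStrategy; a sketch, with the keyspace name, DC names, and replica counts all assumptions:
ALTER KEYSPACE my_keyspace WITH replication =
    {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 1};  -- assumed DC names/counts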
On Thu, Jan 22, 2015 at 10:22 AM, Jan wrote:
> Running a 'nodetool repair' will 'not' bring the node down.
It's not something that happens during normal operation. If something
goes sideways, and the resource usage climbs, a repair can definitely
cripple a node.
> Your question:
> does a node
Thanks, this is a straightforward answer, exactly what I needed!
2015-01-22 19:22 GMT+01:00 Jan :
> Running a 'nodetool repair' will 'not' bring the node down.
>
> Your question:
> does a nodetool repair make the server stop serving requests, or does it
> just use a lot of resources but sti
On Thu, Jan 22, 2015 at 10:53 AM, SEGALIS Morgan wrote:
> what do you mean by "operating correctly"?
>
I mean that if you are operating near failure, repair might trip a node
into failure. But if you are operating correctly, repair should not.
=Rob
I don't think it is near failure; it uses only 3% of the CPU and 40% of the
RAM, if that is what you meant.
2015-01-22 19:58 GMT+01:00 Robert Coli :
> On Thu, Jan 22, 2015 at 10:53 AM, SEGALIS Morgan
> wrote:
>
>> what do you mean by "operating correctly"?
>>
>
> I mean that if you are operating n
I have a column family that stores articles. I'll need to get those articles
from the most recent to the oldest, fetch them by country, and of
course be able to limit the number of fetched articles.
I thought about another ColumnFamily, "ArticlesByDateAndCountry", with dynamic
columns.
The Key
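A CQL3 sketch of the day-bucketed design being described; the table and column names are all assumptions:
CREATE TABLE articles_by_date_and_country (
    country     text,
    day         text,       -- day bucket, e.g. '2015-01-22'
    inserted_at timeuuid,   -- newest-first ordering within a bucket
    article_id  uuid,
    title       text,
    PRIMARY KEY ((country, day), inserted_at)
) WITH CLUSTERING ORDER BY (inserted_at DESC);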
Sorry, I copied/pasted the question from another platform where you don't
generally say hello.
So: Hello everyone,
2015-01-22 20:19 GMT+01:00 SEGALIS Morgan :
> I have a column family that stores articles. I'll need to get those
> articles from the most recent to the oldest, getting them from C
Hello Morgan
The data model looks reasonable. Bucketing by day will help you to scale.
The only thing I can see is how to go back in time to fetch articles from
previous buckets (previous days). Is it possible to have 0 articles for a
country for a day?
On Thu, Jan 22, 2015 at 8:23 PM, SEGALIS
Hi DuyHai,
if there are 0 articles, the row will obviously not exist, I guess... (only
an article insertion will create the row)
What is bugging you exactly?
2015-01-22 20:33 GMT+01:00 DuyHai Doan :
> Hello Morgan
>
> The data model looks reasonable. Bucketing by day will help you to scale.
> The only
Well, if the current day's bucket does not contain enough articles, you may
need to search back in the previous day. If the previous day does not have
any articles, you may need to go back in time a day before... and so on...
Of course it's a corner case, but I've seen some code that misses this
scenar
Oh yeah, I thought about it, and even raised the point in the first mail:
"Let's say I want to show only 100 of the newer articles, I'll get the
today's articles, and if it does not fill the request (too few articles),
I'll check the day before that, etc..."
but your answer raised another issue I
You get it :D
This is the real issue. However, it's quite an extreme case. If you can
guarantee that there will be a minimum of X articles per day and per country,
the maximum number of requests to fetch 100 articles will be bounded.
Furthermore, do not forget that a SELECT statement using a partition
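A sketch of the per-bucket query being discussed, against the hypothetical table sketched earlier; if a bucket returns fewer than 100 rows, the client walks back to the previous day's bucket:
SELECT article_id, title
FROM articles_by_date_and_country
WHERE country = 'FR' AND day = '2015-01-22'  -- one (country, day) bucket
LIMIT 100;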
Usually this is about tuning, and this isn't an uncommon situation for new
users.
Potential steps to take:
1) Reduce stream throughput to a point that your cluster can handle.
This is probably your most important tool. The default throughput, depending
on version, is 200 or 400 megabits/s; go ahead and
Hi,
I increased the range timeout and read timeout, first to 50 seconds and then to
500 seconds, and the Astyanax client timeouts to 60 and 550 seconds respectively.
I still get a timeout exception.
I see the logic with the .withCheckpointManager() code; is that the only way it
could work?
From: Eric Stevens [mailto:migh...@gmail.com]
Se
Hi,
I want to trigger just a tombstone compaction after gc_grace_seconds has
elapsed, not a full 'nodetool compact keyspace columnfamily'.
Is there any way I can do that?
Thanks
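One possible approach (an assumption, not something confirmed in this thread): tune the table's compaction subproperties so SSTables become eligible for single-SSTable tombstone compactions on their own; the keyspace/table names and threshold values here are hypothetical:
ALTER TABLE my_keyspace.my_table WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': '0.2',              -- droppable-tombstone ratio that triggers it
    'unchecked_tombstone_compaction': 'true'   -- run tombstone compactions more aggressively
};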
There are some values for the read timeout in the cassandra.yaml file; the
default value is 3 ms. We changed it to a bigger value and that resolved our
issue.
Hope this helps.
Regards
Asit
On Jan 22, 2015 8:36 AM, "Neha Trivedi" wrote:
>
> Hello All,
> I am trying to process a 200MB file. I am getting follo
What is the average and max # of CQL rows in each partition? Is 800,000 the
number of CQL rows or Cassandra partitions (storage engine rows)?
Another option you could try is a CQL statement to fetch all partition keys.
You could first try this in cqlsh:
SELECT DISTINCT pk1, pk2…pkn FROM CF
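Spelled out as a complete statement, with a hypothetical table and two partition-key columns (a LIMIT is handy when testing in cqlsh):
SELECT DISTINCT pk1, pk2 FROM my_cf LIMIT 1000;  -- my_cf is hypothetical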
On Thu, Jan 22, 2015 at 4:19 PM, Asit KAUSHIK
wrote:
> There are some values for the read timeout in the cassandra.yaml file; the
> default value is 3 ms. We changed it to a bigger value and that resolved our
> issue.
>
Having to increase this value is often a strong signal you are Doing It
Wrong. FWIW!
The method
com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
should be available in guava from 15.0 on. So guava-16.0 should be fine.
Is it possible guava is being picked up from somewhere else? Do you have a
global classpath variable?
You might want to do:
URL u = YourClass.getRe
I agree with Rob. You shouldn't need to change the read timeout.
We had similar issues with intermittent ReadTimeoutExceptions for a while
when we ran Cassandra on underpowered nodes on AWS. We've also seen them
when executing unconstrained queries with very large ResultSets (because it
takes long
Hello Everyone,
Thanks very much for the input.
Here is my system info:
1. I have a single-node cluster (for testing).
2. I have 4GB memory on the server and am trying to process a 200MB file. (1GB is
allocated to Tomcat7, 1GB to Cassandra, and 1GB to ActiveMQ. The nltk
server is also running.)
3. We are using 2.
In each partition, the average number of CQL rows is 200K; the max is 3M.
800K is the number of Cassandra partitions.
From: Mohammed Guller [mailto:moham...@glassbeam.com]
Sent: Thursday, January 22, 2015 7:43 PM
To: user@cassandra.apache.org
Subject: RE: Retrieving all row keys of a CF
What is the average and max