To add onto this message:
Queries are all on the partition key (select
origvalue,ingestdate,mediatype from doc.origdoc where uuid=?). Queries
were very fast when the table was <10 million rows.
Table description:
describe doc.origdoc;
CREATE TABLE doc.origdoc (
uuid text,
ingestdat
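One way to see where the time goes for that query as the table grows is to trace it from cqlsh (a sketch; the uuid literal is a placeholder, quoted because the column is text):
  TRACING ON;
  SELECT origvalue, ingestdate, mediatype FROM doc.origdoc WHERE uuid = 'example-uuid';
  TRACING OFF;
The trace lists each step with its elapsed time, including how many memtables and SSTables were merged for the read.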
1. Is it consistently taking that long?
2. Have you traced the requests?
3. Are you watching your GC history?
4. What's the load on the machine? Does dstat show high CPU or disk
utilization?
I did a webinar about a year ago on how to dig into these issues, you may
find it useful: https://www.yout
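For points 3 and 4 above, something along these lines gives a quick look (a sketch; assumes the JDK tools and dstat are installed on the node):
  # GC history: heap occupancy and collection times for the Cassandra JVM, sampled every second
  jstat -gcutil $(pgrep -f CassandraDaemon) 1000
  # CPU, disk, network and memory utilization at one-second resolution
  dstat -cdnm 1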
Reads require on average 6 SSTables and my read latency is 42 ms, so on average we
can say Cassandra is taking 7 ms to process data from one SSTable *which is
entirely in memory*. I think there is something wrong here. If we go with
this math then we can say Cassandra latency would always be > 7 ms for most
o
Maybe compaction is not keeping up, since you are hitting so many SSTables?
Read heavy... are you using LCS?
Plenty of resources... tune to increase the memtable size?
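A quick way to check the first point (a sketch; run it on a node serving the reads):
  # a persistent backlog of pending tasks means compaction is not keeping up
  nodetool compactionstats
If pending tasks keep growing while the table keeps accumulating SSTables, that points the same way as the histograms.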
On Sat, Sep 26, 2015 at 9:19 AM, Eric Stevens wrote:
Since you have most of your reads hitting 5-8 SSTables, it's probably
related to that increasing your latency. That makes this look like your
write workload is either overwrite-heavy or append-heavy. Data for a
single partition key is being written to repeatedly over long time periods,
and this w
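One way to confirm that picture is to check how many SSTables actually hold a given partition (a sketch; keyspace, table and key are placeholders):
  nodetool getsstables <keyspace> <table> <partition-key>
If a hot key comes back with 5-8 data files listed, each read really is merging that many SSTables.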
Please find histogram attached.
On Fri, Sep 25, 2015 at 12:20 PM, Ryan Svihla wrote:
If everything is in RAM there could be a number of issues unrelated to
Cassandra, and there could be hardware limitations or contention problems.
Otherwise, cell count can really deeply impact reads, all RAM or not, and some
of this is because of the nature of GC and some of it is the age of the s
I understand that, but everything is in RAM (my data dir is tmpfs) and my
row is not that wide, approximately less than 5 MB in size. So my question is: if
everything is in RAM, then why is the read latency 43 ms?
On Fri, Sep 25, 2015 at 7:54 AM, Ryan Svihla wrote:
If you run nodetool cfhistograms on the given table, that will tell you how
wide your rows are getting. At some point you can get rows wide enough that
just the physics of retrieving them all takes some time.
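For reference, the command takes the keyspace and table as arguments (names here are placeholders):
  nodetool cfhistograms <keyspace> <table>
The partition size percentiles show how wide the rows are, and the SSTables column shows how many SSTables each read touched.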
> On Sep 25, 2015, at 9:21 AM, sai krishnam raju potturi
> wrote:
Jaydeep: since your primary key involves a clustering column, you may be
having pretty wide rows. The read would be sequential. The latency could be
acceptable if the read were to involve really wide rows.
If your primary key were like ((a,b)) without the clustering column, it would be
like reading a key
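To make the contrast concrete, a sketch (column names borrowed from the schema posted later in the thread; table names are made up):
  -- clustering column: all rows for one value of a live in one, potentially wide, partition
  CREATE TABLE test_wide (a timeuuid, b bigint, c int, PRIMARY KEY ((a), b));
  -- composite partition key: each (a, b) pair is its own small partition
  CREATE TABLE test_narrow (a timeuuid, b bigint, c int, PRIMARY KEY ((a, b)));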
Sent: Tuesday, September 22, 2015 7:50 PM
To: user@cassandra.apache.org
Subject: Re: High read latency
select * from test where a = ? and b = ?
On Tue, Sep 22, 2015 at 10:27 AM, sai krishnam raju potturi
<pskraj...@gmail.com> wrote:
thanks for the information. Posting the query too would be of help.
select * from test where a = ? and b = ?
On Tue, Sep 22, 2015 at 10:27 AM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:
thanks for the information. Posting the query too would be of help.
On Tue, Sep 22, 2015 at 11:56 AM, Jaydeep Chovatia <
chovatia.jayd...@gmail.com> wrote:
Please find required details here:
- Number of req/s
2k reads/s
- Schema details
create table test {
a timeuuid,
b bigint,
c int,
d int static,
e int static,
f int static,
g int static,
h int,
i text,
j text,
k text,
l text,
m set
n bigint
o bigint
p bigint
q
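Given the static columns and the collection, it may also be worth checking how big these partitions get on disk (a sketch; the keyspace name is a placeholder):
  nodetool cfstats <keyspace>.test
Look at "Compacted partition maximum bytes" and "Compacted partition mean bytes" for the table.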
Hi,
Before speaking about tuning, can you provide some additional information?
- Number of req/s
- Schema details
- JVM settings about the heap
- Execution time of the GC
43 ms for a read latency may be acceptable depending on the number of requests
per second.
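For the last two items, something like this gathers them without digging through config files (a sketch; exact output fields vary by version):
  # heap size and current usage as the node reports them
  nodetool info | grep -i heap
  # cumulative GC pause statistics since the last call (available on 2.1+)
  nodetool gcstats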
There are likely two things occurring:
1) the cfhistograms error is due to
https://issues.apache.org/jira/browse/CASSANDRA-8028
which is resolved in 2.1.3. Looks like voting is under way for 2.1.3. As
rcoli mentioned, you are running the latest open-source release of C*, which should
be treated as beta until a
Hi there,
Compaction keeps running under our workload.
We are using SATA HDD RAIDs.
When trying to run cfhistograms on our user_data table, we are getting
this message:
nodetool: Unable to compute when histogram overflowed
Please see what happens when running some queries on this cf:
http:
Hello
You may not be experiencing versioning issues. Do you know if compaction is
keeping up with your workload? The behavior described in the subject is
typically associated with compaction falling behind or having a suboptimal
compaction strategy configured. What does the output of nod
From: Brian Tarbox [mailto:briantar...@gmail.com]
Sent: Friday, January 9, 2015 8:56 AM
To: user@cassandra.apache.org
Subject: Re: High read latency after data volume increased
C* seems to have more than its share of "version x doesn't work, use
version y " type issues
On Thu, Jan 8, 2015 at 2:23 PM, Robert Coli wrote:
> On Thu, Jan 8, 2015 at 11:14 AM, Roni Balthazar
> wrote:
>
>> We are using C* 2.1.2 with 2 DCs. 30 nodes DC1 and 10 nodes DC2.
>>
>
> https://eng
On Thu, Jan 8, 2015 at 6:38 PM, Roni Balthazar
wrote:
> We downgraded to 2.1.1, but got the very same result. The read latency is
> still high, but we figured out that it happens only using a specific
> keyspace.
>
Note that downgrading is officially unsupported, but is probably safe
between tho
Hi Robert,
We downgraded to 2.1.1 but got the very same result. The read latency is
still high, but we figured out that it happens only when using a specific
keyspace.
Please see the graphs below...
Trying another keyspace with 600+ reads/sec, we are getting an acceptable
~30 ms read latency.
Let
On Thu, Jan 8, 2015 at 11:14 AM, Roni Balthazar
wrote:
> We are using C* 2.1.2 with 2 DCs. 30 nodes DC1 and 10 nodes DC2.
>
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
2.1.2 in particular is known to have significant issues. You'd be better
off running 2.1.1 ...
> We are using Leveled Compaction with Cassandra 1.2.5. Our sstable size is
> 100M. On each node,
> we have anywhere from 700+ to 800+ sstables (for all levels). The
> bloom_filter_fp_chance is set at 0.000744.
The current default bloom_filter_fp_chance is 0.1 for levelled compaction.
Reducing t
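For reference, changing the value is a one-liner (a sketch; keyspace and table names are placeholders, the value shown is just the LCS default, and existing SSTables keep their old, larger filters until compaction rewrites them):
  ALTER TABLE <keyspace>.<table> WITH bloom_filter_fp_chance = 0.1;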
Try doing request tracing.
http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
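Besides per-query TRACING ON in cqlsh, a small fraction of live traffic can be traced continuously (a sketch; the probability shown is only an example and should stay small in production):
  nodetool settraceprobability 0.001
  # completed traces land in the system_traces.sessions and system_traces.events tables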
On Thu, Jun 27, 2013 at 2:40 PM, Bao Le wrote:
> Hi,
>
> We are using Leveled Compaction with Cassandra 1.2.5. Our sstable size
> is 100M. On each node,
> we have anywhere from 700+ to 800+ sstables (for al
> FlushWriter      0      0      8252      0      299
If you are not suffering from gc pressure/pauses (possibly not, because you
don't seem to have a lot of read failures in tpstats or outlier latency on the
histograms), then the flush writer errors are s
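For anyone following along, the number that matters in that line is the last column (a sketch of how to watch it; the grep pattern is an assumption):
  nodetool tpstats | grep -i flush
  # a growing "All time blocked" count on the flush writer pool usually means
  # memtable flushes are waiting on the data disks or on the flush writer settings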
300 GB is a lot of data for cloud machines (especially with their
weaker performance in general). If you are unhappy with performance,
why not scale the cluster out to more servers? With that much data you
are usually contending with the physics of spinning disks. Three nodes
+ replication factor 3
It's really a pain to modify the data model. The problem is how to
handle a "one-to-many" relation in Cassandra: the limitation on row
size makes it impossible to store them as columns.
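For what it's worth, the usual shape for a one-to-many relation is one row per parent holding one entry per child, which in later CQL terms looks like this (a sketch; all names are made up):
  CREATE TABLE parent_children (
      parent_id uuid,
      child_id  timeuuid,
      payload   text,
      PRIMARY KEY ((parent_id), child_id)
  );
  -- all children of one parent: a single sequential read of one partition
  SELECT child_id, payload FROM parent_children WHERE parent_id = ?;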
On Fri, Jun 4, 2010 at 4:13 PM, Sylvain Lebresne wrote:
As written in the third point of
http://wiki.apache.org/cassandra/CassandraLimitations,
right now super columns are not indexed and are deserialized fully when you
access them. Another way to put it is, you'll want to use super columns with
only a relatively small number of columns in them.
Because i